Implementing memory management with Golang's garbage collector

Did you ever think of how bulk messages are pushed in real-time that fast? How is it possible? Low latency garbage collector (GC) plays an important role in this. In this article, we present ways to look at certain parameters to implement memory management with the Golang GC.

Garbage collection is the process of freeing up memory space that is not being used. In other words, the GC sees which objects are out of scope and cannot be referenced anymore and frees the memory space they consume. This process happens in a concurrent way while a Go program is running and not before or after the execution of the program.

This article is an excerpt from the book Mastering Go - Third Edition by Mihalis Tsoukalos. Mihalis runs through the nuances of Go, with deep guides to types and structures, packages, concurrency, network programming, compiler design, optimization, and more.

Implementing the Golang GC

The Go standard library offers functions that allow you to study the operation of the GC and learn more about what the GC does secretly. These functions are illustrated in the gColl.go utility. The source code of gColl.go is presented here in chunks.

Package main

import (
   "fmt"
   "runtime"
   "time"
)

You need the runtime package because it allows you to obtain information about the
Go runtime system, which, among other things, includes the operation of the GC.

func printStats(mem runtime.MemStats) {
    runtime.ReadMemStats(&mem)
    fmt.Println("mem.Alloc:", mem.Alloc)
    fmt.Println("mem.TotalAlloc:", mem.TotalAlloc)
    fmt.Println("mem.HeapAlloc:", mem.HeapAlloc)
    fmt.Println("mem.NumGC:", mem.NumGC, "\n")
}

The purpose of the printStats() function is to avoid writing the same Go code all the time. The runtime.ReadMemStats() call gets the latest garbage collection statistics
for you.

func main() {
   var mem runtime.MemStats
   printStats(mem)

   for i := 0; i < 10; i++ {
       // Allocating 50,000,000 bytes
       s := make([]byte, 50000000)
       if s == nil {
           fmt.Println("Operation failed!")
         }
   }
   printStats(mem)

In this part, we have a for loop that creates 10-byte slices with 50,000,000 bytes each.
The reason for this is that by allocating large amounts of memory, we can trigger the
GC.

  for i := 0; i < 10; i++ {
      // Allocating 100,000,000 bytes
      s := make([]byte, 100000000)
      if s == nil {
          fmt.Println("Operation failed!")
      }
      time.Sleep(5 * time.Second)
   }
   printStats(mem)
}

The last part of the program makes even bigger memory allocations – this time, each byte slice has 100,000,000 bytes.

Running gColl.go on a macOS Big Sur machine with 24 GB of RAM produces the following kind of output:

$ go run gColl.go
mem.Alloc: 124616
mem.TotalAlloc: 124616
mem.HeapAlloc: 124616
mem.NumGC: 0

mem.Alloc: 50124368
mem.TotalAlloc: 500175120
mem.HeapAlloc: 50124368
mem.NumGC: 9

mem.Alloc: 122536
mem.TotalAlloc: 1500257968
mem.HeapAlloc: 122536
mem.NumGC: 19

The value of mem.Alloc is the bytes of allocated heap objects — allocated are all the objects that the GC has not yet freed. mem.TotalAlloc shows the cumulative bytes allocated for heap objects—this number does not decrease when objects are freed, which means that it keeps increasing. Therefore, it shows the total number of bytes allocated for heap objects during program execution. mem.HeapAlloc is the same as mem.Alloc. Last, mem.NumGC shows the total number of completed garbage collection cycles. The bigger that value is, the more you have to consider how you allocate memory in your code and if there is a way to optimize that.

If you want even more verbose output regarding the operation of the GC, you can
combine go run gColl.go with GODEBUG=gctrace=1. Apart from the regular program
output, you get some extra metrics—this is illustrated in the following output:

$ GODEBUG=gctrace=1 go run gColl.go
gc 1 @0.021s 0%: 0.020+0.32+0.015 ms clock, 0.16+0.17/0.33/0.22+0.12 ms
cpu, 4->4->0 MB, 5 MB goal, 8 P
gc 2 @0.041s 0%: 0.074+0.32+0.003 ms clock, 0.59+0.087/0.37/0.45+0.030
ms cpu, 4->4->0 MB, 5 MB goal, 8 P
.
.
.
gc 18 @40.152s 0%: 0.065+0.14+0.013 ms clock, 0.52+0/0.12/0.042+0.10 ms
cpu, 95->95->0 MB, 96 MB goal, 8 P
gc 19 @45.160s 0%: 0.028+0.12+0.003 ms clock, 0.22+0/0.13/0.081+0.028
ms cpu, 95->95->0 MB, 96 MB goal, 8 P
mem.Alloc: 120672
mem.TotalAlloc: 1500256376
mem.HeapAlloc: 120672
mem.NumGC: 19

Now, let us explain the 95->95->0 MB triplet in the previous line of output. The
first value (95) is the heap size when the GC is about to run. The second value (95)
is the heap size when the GC ends its operation. The last value is the size of the
live heap (0).

Go garbage collection is based on the tricolor algorithm

The operation of the Go GC is based on the tricolor algorithm, which is the subject of this subsection.

Note that the tricolor algorithm is not unique to Go and can be used in other programming languages as well.

Strictly speaking, the official name for the algorithm used in Go is the tricolor mark-and-sweep algorithm. It can work concurrently with the program and uses a write barrier. This means that when a Go program runs, the Go scheduler is responsible for the scheduling of the application and the GC. This is as if the Go scheduler has to deal with a regular application with multiple goroutines!

The core idea behind this algorithm came from Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens and was first illustrated in a paper named On-the-Fly Garbage Collection: An Exercise in Cooperation.

The primary principle behind the tricolor mark-and-sweep algorithm is that it divides the objects of the heap into three different sets according to their color, which is assigned by the algorithm. It is now time to talk about the meaning of each color set. The objects of the black set are guaranteed to have no pointers to any object of the white set. However, an object of the white set can have a pointer to an object of the black set because this has no effect on the operation of the GC. The objects of the gray set might have pointers to some objects of the white set. Finally, the objects of the white set are the candidates for garbage collection.

So, when the garbage collection begins, all objects are white, and the GC visits all the root objects and colors them gray. The roots are the objects that can be directly accessed by the application, which includes global variables and other things on the stack. These objects mostly depend on the Go code of a program.

After that, the GC picks a gray object, makes it black, and starts looking at whether
that object has pointers to other objects of the white set or not. Therefore, when an
object of the gray set is scanned for pointers to other objects, it is colored black. If
that scan discovers that this particular object has one or more pointers to a white
object, it puts that white object in the gray set. This process keeps going for as long
as objects exist in the gray set. After that, the objects in the white set are unreachable
and their memory space can be reused. Therefore, at this point, the elements of the
white set are said to be garbage collected. Please note that no object can go directly
from the black set to the white set, which allows the algorithm to operate and be able
to clear the objects on the white set. As mentioned before, no object of the black set
can directly point to an object of the white set. Additionally, if an object of the gray
set becomes unreachable at some point in a garbage collection cycle, it will not be
collected at this garbage collection cycle but in the next one! Although this is not an
optimal situation, it is not that bad.

During this process, the running application is called the mutator. The mutator runs a small function named write barrier that is executed each time a pointer in the heap is modified. If the pointer of an object in the heap is modified, which means that this object is now reachable, the write barrier colors it gray and puts it in the gray set. The mutator is responsible for the invariant that no element of the black set has a pointer to an element of the white set. This is accomplished with the help of the write barrier function. Failing to accomplish this invariant will ruin the garbage collection process and will most likely crash your program in a pretty bad and undesirable way!

So, there are three different colors: black, white, and gray. When the algorithm
begins, all objects are colored white. As the algorithm keeps going, white objects are
moved into one of the other two sets. The objects that are left in the white set are the
ones that are going to be cleared at some point.

The next figure displays the three color sets with objects in them.

implementing-memory-management-with-golang-garbage-collector-img-0

Figure 1: The Go GC represents the heap of a program as a graph

In the presented graph, you can see that while object E, which is in the white set, can access object F, it cannot be accessed by any other object because no other object points to object E, which makes it a perfect candidate for garbage collection! Additionally, objects A, B, and C are root objects and are always reachable; therefore, they cannot be garbage collected.

Graph comprehended

Can you guess what will happen next in that graph? Well, it is not that difficult to realize that the algorithm will have to process the remaining elements of the gray set, which means that both objects A and F will go to the black set. Object A will go to the black set because it is a root element and F will go to the black set because it does not point to any other object while it is in the gray set.

After object A is garbage collected, object F will become unreachable and will be garbage collected in the next cycle of the GC because an unreachable object cannot magically become reachable in the next iteration of the garbage collection cycle.

Note: The Go garbage collection can also be applied to variables such as channels. When the GC finds out that a channel is unreachable, that is when the channel variable cannot be accessed anymore, it will free its resources even if the channel has not been closed.

Go allows you to manually initiate a garbage collection by putting a runtime.GC() statement in your Go code. However, have in mind that runtime.GC() will block the caller and it might block the entire program, especially if you are running a very busy Go program with many objects. This mainly happens because you cannot perform garbage collections while everything else is rapidly changing, as this will not give the GC the opportunity to clearly identify the members of the white, black, and gray sets. This garbage collection status is also called garbage collection safe-point.

You can find the long and relatively advanced Go code of the GC at https://github.com/golang/go/blob/master/src/runtime/mgc.go, which you can study if you want to learn even more information about the garbage collection operation. You can even
make changes to that code if you are brave enough!

Understanding Go Internals: defer, panic() and recover() functions [Tutorial]

Implementing hashing algorithms in Golang [Tutorial]

Is Golang truly community driven and does it really matter?