Golang. Shaping web-service memory consumption.

Yuri Fenyuk
Apr 30, 2024


The Go runtime is quite lean in its use of compute resources. That is fine on its own, as it lets most developers stay focused on coding tasks. Only later, when the containerized web service lands in an execution environment crowded with many other containers fighting for fractions of the available compute resources, does that 'real life' become painful. This article looks at what can be done quickly to limit the memory appetite of an application written in Go.

Go 1.19 added a soft memory limit to the runtime. It can be set with the GOMEMLIMIT environment variable or, from code, with the runtime/debug.SetMemoryLimit function. GOMEMLIMIT specifies a soft memory cap. It does not replace GOGC but works in conjunction with it. It is still fine to rely on GOGC alone in a scenario where memory is always available. At the same time, setting GOMEMLIMIT automatically makes the GC more aggressive when memory is scarce.

To experiment with different values of this environment variable, a test Go web service (a.k.a. “squirrel in the wheel”) is needed:

The web server runs on port 8080 and has a single endpoint, /hash, with a simple handler function:

To emulate a workload, the function calculates a hash sum and returns it to the client. The hashing itself looks like this:

#2: load a binary file in full (in my case, a 100+ MiB one) and return it as a slice of bytes

#6..7: feed the bytes to the hash512 instance for hashing

#8..10: add a random number of up to one million to make each hashing result unique

That is the only compute work being emulated.

In addition, the expvar package is imported. It adds one more endpoint, /debug/vars, which returns up-to-date memory statistics of the web service. Importing the package is enough, as its default behavior is sufficient:

The complete web-service source code can be found in the GitHub repo.

Grafana's k6 tool is used for load testing. Please go through its 'getting started' example to get familiar with it.

The plan is to put the /hash endpoint under pressure by sending multiple parallel requests to it. The other endpoint, /debug/vars, is called once per second to collect the up-to-date memory footprint.

The test cycle consists of 100 virtual users sending parallel requests for 30 seconds:

The entry function of the test:

#2: invoke the /hash endpoint with 100 parallel requests

#5: fetch the web service's current memory statistics

#7..#8: push the results into custom k6 metrics

#12: pause the workload for 1 second

Getting the memory details of the Go web service:

#2: call the /debug/vars endpoint, which returns Go's runtime.MemStats. Refer to the package docs for details

#5..7: extract a few parameters (explained later) for analysis, converting to MiB where appropriate

The full k6 load-testing file can be found in the GitHub repo. Running the test is as simple as the command “k6 run script.js”.
My laptop shows something like the output below (please do not forget to start the web service beforehand):

The final part is to conduct a set of experiments, hammering the Go web server with the k6 load test plan while changing the Go memory parameters.

My laptop is a MacBook Air 2 with 24 GiB of physical memory, macOS Sonoma 14.4.1, and Go 1.22 installed. For every test, the web service is started in a new terminal session.

Besides requests per second, the standard performance-test metric, the following memory-related metrics, which mainly come from the runtime package's MemStats, are taken into account:

#MemStats.Alloc: bytes of allocated heap objects (see the official docs for details). This parameter is interesting to observe over the whole session, but a glance at the average value (converted to MiB) is a good simplification

#MemStats.TotalAlloc: cumulative bytes allocated for heap objects. Only the final value (converted to MiB) is of interest

#MemStats.Sys: the virtual address space reserved by the Go runtime for the heap, stacks, and other internal data structures. This parameter is interesting to observe over the whole session, but a glance at the average value (converted to MiB) is a good simplification

#MemStats.NumGC: number of completed GC cycles. Only the final value is of interest

Experiment #1. Without GC at all:

#1: add a parameter to see the details of GC runs in the web server's terminal session

#2: disable GC completely. The default value of this parameter is 100 (a percentage), meaning that the next GC cycle is triggered once the heap has grown by 100% over the live heap left after the previous cycle.

#3: run the web server; this is done before each test cycle.

The three test cycles, with the web service restarted in a new terminal session in between, plus the average of the mentioned metrics, are below:

GC is off

#24: 878 requests were processed during the 30-second cycle, at about 25 requests per second
#21: GC was never invoked, since it is off
#20, #23: heap-allocated memory takes up almost all Sys-allocated memory, which makes sense: the heap is never released by the GC, so it only grows and constantly pushes Sys-allocated memory up

Experiment #2. GOMEMLIMIT set to 2 GiB:

#2: although GOGC is still off, the combination with GOMEMLIMIT makes the GC work anyway (see the official docs for details)
#3: setting GOMEMLIMIT to 2GiB as a soft limit makes the Go runtime run memory cleanup at the appropriate moment, when consumed memory gets close to the limit

GOMEMLIMIT is set to 2GiB

#20: heap allocation is 1.5GiB on average (although the peak was much higher), which correlates with the 2GiB soft limit

#21: GC ran 125 times, apparently whenever allocated memory was about to cross 2GiB

#22: Sys-allocated memory was close to 8GiB on average
#24: the web server was able to handle 68 requests per second (about 3 times more than before), thanks to reusing memory that is no longer needed instead of constantly spending compute resources on allocating new memory pages

Experiment #3. GOMEMLIMIT set to 5GiB:

The output statistics:

GOMEMLIMIT is set to 5GiB

#20: heap allocation is 3.1GiB on average, which is noticeably higher than in the previous experiment but still within the soft limit

#21: GC ran only 42 times, since the memory soft limit is higher

#22: Sys-allocated memory is also higher, reacting to the growth of the potential heap-allocated memory limit

#24: web-service throughput is close to that of the previous experiment. Apparently, the bottleneck is no longer memory but other OS resources (disk reads, CPU computation, etc.)

All the necessary code is available in the GitHub repo.

The experiments have shown the importance of setting the GOMEMLIMIT variable to limit (well… soft-limit) the appetite of Go applications. It is a precious option for containerized applications in multi-tenant / multi-environment clusters, which is almost always the case nowadays. The good old GOGC variable still works, even in combination with GOMEMLIMIT, but the latter now seems to be the more important of the two.