Last night, I optimized a module I wrote in Go and got better results than I expected, so I’m documenting them here.

  1. When creating a slice with make, we can specify the initial capacity of its backing array via the cap argument. When the final size of the slice is known in advance, this avoids repeated reallocation of the array as the slice grows. I went through the entire code and added the cap argument wherever I could. However, profiling didn’t show any noticeable performance change.

  2. Out of habit from OOP, I had written some logic as structs with methods when plain functions would have done. I changed them to functions to avoid unnecessary heap allocations. Again, there was no noticeable change.

  3. I reduced the number of memory allocations by using unsafe.Pointer to convert byte slices to strings without copying.

  4. I had code that cached frequently used entities on disk, and I moved that cache into memory. I realised that the cached entities were small enough to justify keeping them in memory. This eliminated the need to serialise them on every access, and I saw a significant reduction in CPU usage.

  5. I applied the automaxprocs library created by Uber, which sets GOMAXPROCS to match the CPU quota of the Linux container the process runs in.

  6. I replaced the library I was using for rate limiting. I had been using one originally created by Uber, but when I took a closer look at its algorithm, I realised that it didn’t match our requirements. After the switch, we saw a significant increase in throughput.

  7. I tried sync.Pool on some frequently created structs, but it didn’t improve performance as much as I thought it would, and I could see how it might cause problems if handled incorrectly in the future, so I reverted the change.
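
To make item 1 concrete, here is a minimal sketch of preallocating a slice. The function name squares is made up for illustration and is not from my module; the point is only the third argument to make.

```go
package main

import "fmt"

// squares returns the squares of 0..n-1. The final length is known up
// front, so the backing array is allocated once with capacity n.
// (squares is a hypothetical example function.)
func squares(n int) []int {
	out := make([]int, 0, n) // length 0, capacity n
	for i := 0; i < n; i++ {
		out = append(out, i*i) // stays within capacity, so no reallocation
	}
	return out
}

func main() {
	s := squares(1000)
	fmt.Println(len(s), cap(s))
}
```

Without the capacity argument, append has to grow the backing array several times on the way to 1000 elements, copying the existing elements each time.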
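
For item 3, the conversion can be sketched roughly like this. The helper name bytesToString is an assumption for illustration. This version reinterprets the slice header as a string header, which is the long-standing unsafe.Pointer idiom; on Go 1.20 and later, unsafe.String(unsafe.SliceData(b), len(b)) expresses the same thing more directly.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying the bytes.
// The caller must not modify b afterwards: the returned string shares
// b's memory, and Go code assumes strings never change.
// (bytesToString is a hypothetical helper name.)
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	b := []byte("hello")
	fmt.Println(bytesToString(b))
}
```

A plain string(b) conversion is safe but generally copies the bytes into a new allocation, which is the cost this trick avoids.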
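
For item 4, an in-memory cache can be as simple as a map guarded by a lock. This is only a sketch under assumptions: the Entity type, its fields, and the Cache API are invented for illustration, and it has no eviction or size limit.

```go
package main

import (
	"fmt"
	"sync"
)

// Entity is a hypothetical stand-in for a cached entity.
type Entity struct {
	ID   int
	Name string
}

// Cache is a concurrency-safe in-memory store keyed by entity ID.
// Entities are stored as pointers, so nothing is serialised on access.
type Cache struct {
	mu    sync.RWMutex
	items map[int]*Entity
}

// NewCache returns an empty cache.
func NewCache() *Cache {
	return &Cache{items: make(map[int]*Entity)}
}

// Get returns the cached entity for id, if present.
func (c *Cache) Get(id int) (*Entity, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.items[id]
	return e, ok
}

// Put stores or replaces the entity under its ID.
func (c *Cache) Put(e *Entity) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[e.ID] = e
}

func main() {
	c := NewCache()
	c.Put(&Entity{ID: 1, Name: "first"})
	if e, ok := c.Get(1); ok {
		fmt.Println(e.Name)
	}
}
```

Because Get hands out the stored pointer, callers should treat cached entities as read-only or copy them before modifying.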
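
For item 7, this is roughly the shape of the sync.Pool experiment, and it also shows the kind of mistake I was worried about. The Message type and the helper names are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical frequently created struct.
type Message struct {
	ID   int
	Tags []string
}

// messagePool hands out reusable *Message values.
var messagePool = sync.Pool{
	New: func() interface{} { return new(Message) },
}

// getMessage takes a Message from the pool and clears any state left
// over from a previous user. Forgetting this reset leaks stale data.
func getMessage(id int) *Message {
	m := messagePool.Get().(*Message)
	m.ID = id
	m.Tags = m.Tags[:0] // keep the backing array, drop old contents
	return m
}

// putMessage returns m to the pool. The caller must not touch m after
// this call, because another goroutine may receive it from the pool.
func putMessage(m *Message) {
	messagePool.Put(m)
}

func main() {
	m := getMessage(1)
	m.Tags = append(m.Tags, "urgent")
	fmt.Println(m.ID, m.Tags)
	putMessage(m)
}
```

Both rules (reset on Get, no use after Put) are enforced only by convention, which is why a small performance gain did not seem worth the risk.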