Last night, I optimized a module I wrote in Go and got more significant results than I expected, so I’m documenting them here.
- When declaring a slice with `make`, you can specify the initial capacity of its backing array via the `cap` argument. When the eventual size of the slice is known in advance, this can reduce array reallocations. I went through the code and added `cap` wherever I could, but profiling didn’t show any noticeable performance change.
- Out of OOP habit, I had some logic that could have been written as functions but had instead been written as structs. I changed them to functions to avoid unnecessary heap usage. Again, there was no noticeable difference.
- Reduced the number of memory allocations by using `unsafe.Pointer` when converting byte slices to strings.
- I had code that cached frequently used entities on disk, but I changed it to an in-memory cache. The cached objects were small enough to justify keeping them in memory. That removed the need to serialize them every time and led to a significant reduction in CPU usage.
- I applied the automaxprocs library created by Uber.
- I replaced the rate-limiting library I had been using. I had originally been using one created by Uber, but when I took a closer look at the algorithm, I realized it didn’t match our requirements. After replacing it, we saw a significant increase in throughput.
- I tried `sync.Pool` on some frequently created structs, but it didn’t improve performance as much as I expected, and I could see how it might cause problems if handled incorrectly in the future, so I reverted it.