Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize memory allocation in appendRune function #1988

Merged

Conversation

mateusveloso
Copy link
Contributor

@mateusveloso mateusveloso commented Nov 6, 2023

This pull request addresses a potential memory leak in the appendRune function of the go-pg library. The current implementation of the function creates a new byte slice every time the capacity of the current slice is insufficient to store the new rune. This can lead to a large number of memory allocations, especially when the function is called with large strings.

The proposed change modifies the appendRune function to reuse the already allocated memory as much as possible, instead of always allocating new memory. This is achieved by doubling the capacity of the existing slice when the current capacity is insufficient.

This change should help mitigate the memory leak issue and improve the performance of applications using the go-pg library.

Here is the modified version of the appendRune function:

func appendRune(b []byte, r rune) []byte {
    if r < utf8.RuneSelf {
        return append(b, byte(r))
    }
    l := len(b)
    if cap(b)-l < utf8.UTFMax {
        nb := make([]byte, l, 2*l+utf8.UTFMax)
        copy(nb, b)
        b = nb
    }
    n := utf8.EncodeRune(b[l:l+utf8.UTFMax], r)
    return b[:l+n]
}

This PR resolves #1987

@elliotcourant
Copy link
Collaborator

From my understanding of the append rune function, this would only improve memory allocation when you are frequently appending runes that are multiple bytes by pre-allocating some space for those runes. But if you are infrequently doing this then you would be over-allocating.

The current code allocates exactly what is needed but would do it far more frequently (and potentially inefficiently if you are frequently using multi-byte runes), the new code solves this but I don't interpret it as a fix but more as a tradeoff for a specific use case?

Can you provide more details or tests that demonstrate the memory behavior that you were observing with the old code and how those are resolved with the new approach?

@elliotcourant
Copy link
Collaborator

I would be particularly interested in a unit test for this function that could be run on the primary branch and this one with a memprofile so that we could compare.

@mateusveloso
Copy link
Contributor Author

Absolutely, I understand the importance of providing empirical evidence to support this change. I have set up a unit test for the appendRune function that can be run with a memory profiler on both the primary branch and this one. This allows us to compare the memory usage of both implementations.

The benchmark test was designed to stress the appendRune function by repeatedly appending a multi-byte rune to a byte slice. The test was run with a large number of iterations to simulate a high-load scenario and make any memory leak more evident.

Here are the results of the benchmark test:

  • Old implementation: BenchmarkAppendRuneOld-16 136 8829282 ns/op 16790812 B/op 30 allocs/op
  • New implementation: BenchmarkAppendRuneNew-16 168 7076402 ns/op 8456451 B/op 12 allocs/op

As you can see, the new implementation of the appendRune function allocates significantly less memory per operation (8.45 MB vs 16.79 MB) and makes fewer distinct allocations (12 vs 30). This suggests that the new implementation is more efficient in terms of memory allocation, which can help mitigate the memory leak issue.

I believe these results demonstrate the effectiveness of the proposed change in improving the performance of the appendRune function. I hope this addresses your concerns, and I'm happy to provide any further information or make any additional changes as needed.

@elliotcourant

@elliotcourant
Copy link
Collaborator

@mateusveloso outstanding! Thank you very much for putting that together, I'm gonna play around with it locally a bit tonight with non-multibyte runes and look at a difference in allocation there as well. But so far this looks good to me!

Out of curiosity, was this discovered due to a use case in your app with primarily multi-byte runes from user input? Names, descriptions, metadata etc?

@mateusveloso
Copy link
Contributor Author

@elliotcourant I discovered this issue while working on my application, which handles a significant amount of user input, including multi-byte runes in names, descriptions, metadata, etc.

I noticed an unusual memory usage pattern that led me to suspect a potential memory leak. To investigate further, I utilized the go tool pprof.

Upon analysis, I found that the appendRune function was a significant contributor to the memory usage. Given the frequency and volume of data my application processes, optimizing this function seemed like a promising avenue for improving overall performance and efficiency.

I'm glad to hear that the changes look good to you so far. I'm confident that they will prove beneficial not only for my application but also for other developers using the go-pg library. I look forward to hearing about your experiences testing it with non-multibyte runes.

@elliotcourant
Copy link
Collaborator

@mateusveloso I'm fine to merge this, would you mind removing the benchmark for the "old" version and the old function? But leave the benchmark around the new one in?

@mateusveloso
Copy link
Contributor Author

mateusveloso commented Nov 8, 2023

@elliotcourant
Absolutely, I appreciate the feedback! I'll go ahead and remove the benchmark for the old version and the old function as suggested.

@mateusveloso
Copy link
Contributor Author

@elliotcourant I've removed the old benchmark

@elliotcourant elliotcourant merged commit b93c0c8 into go-pg:v10 Nov 8, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential Memory Leak in appendRune Function
2 participants