Reduce retained memory use #280
The trade-off here is many more small allocations (one for each cell) in exchange for being able to reuse the received buffers. I think this needs more than a benchmark of the isolated function: such a benchmark will show a positive result, but actual usage of gohbase, which processes the results as a whole, may be negatively impacted by the increased allocation rate.
Yeah, we are doing some testing now on a real cluster and I would like to write some more tests/benchmarks. Some thoughts:
Ensure that no buffer memory is retained when deserializing cell blocks. This adds one allocation to cellFromCellBlock, which makes it slower, but we are already allocating 3 times in this function due to the protobuf structs. More importantly, this enables an optimization in the region client: it can reuse buffers.

goos: darwin
goarch: arm64
pkg: github.com/tsuna/gohbase/hrpc
cpu: Apple M4 Pro
                             │ before.txt  │             after.txt              │
                             │   sec/op    │   sec/op     vs base               │
DeserializeCellBlocks/1-14     51.97n ± 1%   74.60n ± 1%  +43.53% (p=0.000 n=10)
DeserializeCellBlocks/100-14   4.160µ ± 1%   6.249µ ± 1%  +50.22% (p=0.000 n=10)
geomean                        465.0n        682.8n       +46.84%

                             │  before.txt  │              after.txt              │
                             │     B/op     │     B/op      vs base               │
DeserializeCellBlocks/1-14      200.0 ± 0%     232.0 ± 0%  +16.00% (p=0.000 n=10)
DeserializeCellBlocks/100-14  19.62Ki ± 0%   22.75Ki ± 0%  +15.92% (p=0.000 n=10)
geomean                       1.958Ki        2.270Ki       +15.96%

                             │ before.txt │             after.txt             │
                             │ allocs/op  │ allocs/op   vs base               │
DeserializeCellBlocks/1-14     4.000 ± 0%   5.000 ± 0%  +25.00% (p=0.000 n=10)
DeserializeCellBlocks/100-14   301.0 ± 0%   401.0 ± 0%  +33.22% (p=0.000 n=10)
geomean                        34.70        44.78       +29.05%
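To illustrate why one extra allocation per cell buys the ability to reuse buffers, here is a minimal sketch. It is not gohbase's code: the `cell` type and both helper functions are hypothetical, and the layout (row bytes followed by value bytes) is an assumption made only for the example. `aliasCell` keeps sub-slices of the read buffer, so the whole buffer stays reachable and cannot be safely overwritten; `copyCell` pays for one allocation and then owns its bytes.

```go
package main

// cell is a hypothetical stand-in for a deserialized cell; it is NOT the
// gohbase type.
type cell struct {
	Row   []byte
	Value []byte
}

// aliasCell returns a cell whose fields point into buf. While the cell is
// alive the garbage collector must keep all of buf alive, and overwriting
// buf silently changes the cell.
func aliasCell(buf []byte, rowLen int) cell {
	return cell{Row: buf[:rowLen], Value: buf[rowLen:]}
}

// copyCell makes a single allocation holding the row and value bytes and
// slices both fields out of it. The result shares no memory with buf, so
// the caller is free to reuse buf for the next read.
func copyCell(buf []byte, rowLen int) cell {
	own := make([]byte, len(buf))
	copy(own, buf)
	// The three-index slice caps Row so that an append to Row cannot
	// overwrite Value.
	return cell{Row: own[:rowLen:rowLen], Value: own[rowLen:]}
}
```

If the buffer is overwritten after both calls, the cell from `copyCell` keeps its original contents while the one from `aliasCell` reflects the new bytes. That is the property a buffer-reusing reader depends on.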
Reuse buffers for reading responses from HBase, both for the compressed response and decompressed response. This should reduce memory pressure when reading lots of data.
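As a rough sketch of what reusing a read buffer can look like, the function below reads length-prefixed frames into a caller-supplied buffer and allocates only when the buffer is too small. The 4-byte big-endian length prefix is an assumption for the example, and `readFrame` and `frames` are hypothetical names, not functions from gohbase.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"io"
)

// readFrame reads one frame (assumed format: 4-byte big-endian length, then
// payload) from r. It reuses buf's backing array when its capacity is
// sufficient and allocates a new one otherwise. The returned slice is only
// valid until it is passed to the next readFrame call.
func readFrame(r io.Reader, buf []byte) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := int(binary.BigEndian.Uint32(hdr[:]))
	if cap(buf) < n {
		buf = make([]byte, n)
	}
	buf = buf[:n]
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// frames is a demo helper that encodes payloads in the assumed frame format
// and returns them as a reader.
func frames(payloads ...string) io.Reader {
	var b bytes.Buffer
	for _, p := range payloads {
		var hdr [4]byte
		binary.BigEndian.PutUint32(hdr[:], uint32(len(p)))
		b.Write(hdr[:])
		b.WriteString(p)
	}
	return &b
}
```

A caller would pass the slice returned by one call back in as `buf` for the next. This is only safe if nothing deserialized from the previous frame still points into that memory.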
I added a benchmark of the changes and the results don't look that promising. It's possible that my inputs for the test are not fully representative.
Given the small improvement in bytes allocated, we probably don't want to continue pursuing this change. It does still have the advantage, though, that each response has its own heap allocation and doesn't retain pointers to other responses.
Dropping this PR because of the lackluster benchmarking results. I'll start a new PR to add the receive benchmark.
Change deserialization of HBase responses to not retain any memory from the read buffer. This allows reusing read buffers.