Changing default allocation size in channel/mod.rs
#346
Comments
There is not, but we could certainly look into something smarter. Do you have a repro that is easy to share, or another way on hand to assess what effect a change would have? The "natural" thing to do would be to start small and scale up by 2x to some limit, which I've coded up before but never had the examples to test the trade-offs.
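A minimal sketch of the "start small and scale up by 2x to some limit" policy described above. The names `INITIAL_CAPACITY`, `MAX_CAPACITY`, and `next_capacity` are hypothetical and not part of timely's API; the starting size of 16 is an arbitrary assumption, and the cap of 1024 is taken from the default mentioned in this issue.

```rust
/// Assumed starting capacity for a freshly created buffer (illustrative value).
pub const INITIAL_CAPACITY: usize = 16;
/// Upper limit on growth; 1024 matches the current default discussed in this issue.
pub const MAX_CAPACITY: usize = 1024;

/// Returns the capacity a buffer should grow to once it has filled up.
/// An unallocated buffer (capacity 0) starts at INITIAL_CAPACITY;
/// otherwise the capacity doubles until it reaches MAX_CAPACITY.
pub fn next_capacity(current: usize) -> usize {
    if current == 0 {
        INITIAL_CAPACITY
    } else {
        current.saturating_mul(2).min(MAX_CAPACITY)
    }
}
```

With these values a channel that keeps filling would step through 16, 32, 64, and so on up to 1024, while a channel that never sends more than a few records stays small.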
I don't have a simple repro. I'm just looking at the heap profile generated by massif. I am going to create a fork of timely and differential and play with different values of this constant to evaluate the memory/performance tradeoff. I will share the findings here.
Some more thoughts (trying out a new default is a good idea, btw):
Thanks, @frankmcsherry!
I've so far only confirmed that reducing the parameter to 128 does reduce our memory footprint significantly (at least according to Also, regarding the plan to grow the buffer on-demand, I wonder if it should also shrink when unused to avoid gradually running out of memory over time.
The Naiad take on things was that when an operator was not in use it would de-allocate its send buffers. This is worse from a steady state operation point of view, but cuts the standing memory costs of inert dataflows. So, plausibly the buffers grow geometrically each time they fill, and then as part of flushing the channel are wholly de-allocated (starting the geometric growth from scratch the next time the operator is scheduled). But, I need to look into things and see which of these signals are clearest to send.
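The grow-then-release idea above could look roughly like the following sketch. `SendBuffer` is a hypothetical type, not timely's actual channel code. It leans on `Vec`'s own amortized growth (geometric in practice, though the exact strategy is unspecified) and on `std::mem::take`, which hands the allocation to the caller and leaves an empty, unallocated `Vec` behind.

```rust
/// Hypothetical send buffer: allocates lazily, grows as records arrive,
/// and gives up its allocation entirely whenever it is flushed.
pub struct SendBuffer<T> {
    data: Vec<T>,
    limit: usize,
}

impl<T> SendBuffer<T> {
    /// Creates an empty buffer that holds no heap memory until the first push.
    pub fn new(limit: usize) -> Self {
        SendBuffer { data: Vec::new(), limit: limit.max(1) }
    }

    /// Appends a record. Once `limit` records are buffered, the full batch
    /// is returned and the buffer goes back to being unallocated.
    pub fn push(&mut self, item: T) -> Option<Vec<T>> {
        self.data.push(item); // Vec reallocates with amortized growth as needed
        if self.data.len() >= self.limit {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Hands the buffered records to the caller and leaves `Vec::new()`
    /// in place, so an idle operator keeps no standing buffer memory.
    pub fn flush(&mut self) -> Vec<T> {
        std::mem::take(&mut self.data)
    }

    /// Capacity currently reserved by the buffer.
    pub fn capacity(&self) -> usize {
        self.data.capacity()
    }
}
```

The trade-off is the one named in the comment: every scheduling of the operator pays for reallocation from scratch, in exchange for inert dataflows holding no buffers at all.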
I did some more measurements. I see no performance degradation with default buffer size 128. This is probably very specific to my use case, which tends to deal with small batches of large objects. While an adaptive buffer size would be awesome, it looks like the ability to set a fixed but user-defined buffer size through the API would do the trick for me. For the time being, I am using a private fork of the repo where I changed the constant to 128.
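To make the "fixed but user-defined buffer size" request concrete, here is one possible shape for such a setting. `ChannelConfig` and its methods are hypothetical and do not exist in timely; the sketch only shows a default of 1024 that a caller could override with a value such as 128.

```rust
/// Hypothetical configuration carrying the per-channel buffer capacity.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChannelConfig {
    buffer_capacity: usize,
}

impl Default for ChannelConfig {
    /// Keeps the current default of 1024 unless the user overrides it.
    fn default() -> Self {
        ChannelConfig { buffer_capacity: 1024 }
    }
}

impl ChannelConfig {
    /// Builder-style override of the buffer capacity.
    pub fn with_buffer_capacity(mut self, capacity: usize) -> Self {
        self.buffer_capacity = capacity;
        self
    }

    /// The configured capacity, in records.
    pub fn buffer_capacity(&self) -> usize {
        self.buffer_capacity
    }

    /// Allocates a send buffer sized according to this configuration.
    pub fn new_buffer<T>(&self) -> Vec<T> {
        Vec::with_capacity(self.buffer_capacity)
    }
}
```

A caller would then write something like `ChannelConfig::default().with_buffer_capacity(128)` once and pass the result to wherever channels are built, instead of editing a constant in a fork.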
I am creating a large dataflow graph with ~10000 nodes. It works great, except that I noticed that the static memory footprint of the program (before I feed any data to it) keeps growing as the graph gets bigger. I think it is currently on the order of 100MB. This does not sound too bad, but my application creates thousands of instances of the dataflow, at which point this overhead becomes significant, and in fact dominates the memory footprint of the program. It appears that the main contributors are the buffers of size Message::default() (currently equal to 1024), pre-allocated for all channels. Is there a way to change this default without forking the repo?
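A back-of-the-envelope estimate shows why pre-allocated buffers can dominate an otherwise idle dataflow. The helper below is hypothetical, and the inputs in the usage note (one buffer per node, 8-byte records) are illustrative assumptions rather than measurements from this issue.

```rust
/// Estimated standing memory, in bytes, for `channels` pre-allocated buffers
/// that each hold `capacity` records of type `T`.
pub fn standing_bytes<T>(channels: usize, capacity: usize) -> usize {
    channels * capacity * std::mem::size_of::<T>()
}
```

Under those assumptions, `standing_bytes::<u64>(10_000, 1024)` comes to 81,920,000 bytes, while a capacity of 128 gives 10,240,000 bytes. Larger record types or several channels per node scale the figure up proportionally.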