Limit size of HeronTupleSet. #2253
@@ -431,7 +433,15 @@ void StMgrServer::HandleTupleSetMessage(Connection* _conn,
->incr_by(_message->control().fails_size());
}
stmgr_->HandleInstanceData(iter->second, instance_info_[iter->second]->local_spout_, _message);
__global_protobuf_pool_release__(_message);
auto message_size = _message->ByteSize();
ByteSizeLong?
I could not find the ByteSizeLong API in the generated code. Let me try again...
It seems this was introduced in 3.1.0 and then deprecated in 3.4.0.
Also, ByteSize calculates the size of the serialized message, which it seems will be slightly larger than the actual size in memory.
So if the message is 60MB but only a small part of it is used to hold the incoming message (because the incoming message is small), we will not delete it, because I believe the serialized size is much smaller than 60MB. I'm not sure about
Did some experiment:
gives us
That's an interesting observation. So ByteSize is probably more related to the wire format size, and SpaceUsed is related to the actual in-memory representation. In that case, shouldn't you use SpaceUsed? But isn't that very slow?
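To illustrate the distinction being discussed, here is a minimal sketch that does not use protobuf at all. It relies on `std::string` as an analogy: `size()` plays the role of a wire-format size and `capacity()` plays the role of retained memory. `ReusableBuffer`, `LogicalSize`, `RetainedSize`, and `ReuseForSmallMessage` are hypothetical names for illustration only, and the sketch assumes (as is the case on common standard library implementations) that clearing a string keeps its allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a pooled message; this is NOT the protobuf API.
struct ReusableBuffer {
  std::string payload;

  // Analogous to a wire-format size: only the bytes currently in use.
  std::size_t LogicalSize() const { return payload.size(); }

  // Analogous to an in-memory footprint: bytes the buffer still holds on to.
  std::size_t RetainedSize() const { return sizeof(*this) + payload.capacity(); }

  // Reuse the buffer for a new message without handing memory back.
  void Reset(const std::string& data) {
    payload.clear();       // assumption: the allocation is kept
    payload.assign(data);  // fits in the existing allocation when smaller
  }
};

// Fill the buffer with a large message, then reuse it for a small one.
inline void ReuseForSmallMessage(ReusableBuffer& buf, std::size_t large,
                                 std::size_t small) {
  assert(small <= large);
  buf.Reset(std::string(large, 'x'));
  buf.Reset(std::string(small, 'y'));
}
```

After `ReuseForSmallMessage(buf, 60 * 1024 * 1024, 16)`, the logical size is 16 bytes while the retained size is still on the order of 60MB, which is the situation a check based only on serialized size would miss.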
@srkukarni It seems that we have no choice but to use
If performance is a concern, I would suggest the old method: put it in the mempool and run a garbage collection against the mempool every 1 min to remove large tuples.
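A rough sketch of that alternative, under stated assumptions: release stays cheap and never inspects the object, while a separate sweep (which a timer could call, for example once a minute) evicts pooled objects that retain too much memory. `PooledMessage`, `SweepingPool`, and `Sweep` are hypothetical names, not Heron's mempool API, and `std::string::capacity()` stands in for whatever in-memory size measure is chosen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pooled object standing in for a tuple set message.
struct PooledMessage {
  std::string payload;
  std::size_t RetainedSize() const { return payload.capacity(); }
};

// Hypothetical pool: Release() is cheap, Sweep() evicts oversized entries.
class SweepingPool {
 public:
  explicit SweepingPool(std::size_t max_retained)
      : max_retained_(max_retained) {}

  // Hand out a pooled object if one is available, else allocate a new one.
  std::unique_ptr<PooledMessage> Acquire() {
    if (free_.empty()) return std::make_unique<PooledMessage>();
    std::unique_ptr<PooledMessage> msg = std::move(free_.back());
    free_.pop_back();
    return msg;
  }

  // Hot path: no size check, the object always goes back to the pool.
  void Release(std::unique_ptr<PooledMessage> msg) {
    assert(msg != nullptr);
    msg->payload.clear();  // assumption: keeps the allocation for reuse
    free_.push_back(std::move(msg));
  }

  // Cold path: drop every pooled object retaining more than the limit.
  // Returns the number of evicted objects.
  std::size_t Sweep() {
    const std::size_t before = free_.size();
    free_.erase(
        std::remove_if(free_.begin(), free_.end(),
                       [this](const std::unique_ptr<PooledMessage>& m) {
                         return m->RetainedSize() > max_retained_;
                       }),
        free_.end());
    return before - free_.size();
  }

  std::size_t FreeCount() const { return free_.size(); }

 private:
  std::size_t max_retained_;
  std::vector<std::unique_ptr<PooledMessage>> free_;
};
```

The trade-off in this design is that a large object can sit in the pool until the next sweep, in exchange for keeping the size computation off the per-message path.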
Some benchmarking for
Could you also share the throughput figures? Particularly for exclamation or other topologies that are used for Heron benchmarking?
Word count, parallelism=20: Stmgr CPU user time doubled, and Data Tuples from Instances dropped ~25%. We need to fix #1908 first to see more metrics.
Fix #2234. We limit the size of HeronTupleSet. If it is larger than the maximum size, we release it back to the allocator instead of the memory pool. Tested on local machine.
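The idea in the description can be sketched roughly as follows. This is not the code in this PR: `TupleSetLike` and `BoundedPool` are hypothetical names, the limit is an arbitrary constructor parameter, and `std::string::capacity()` is used as a stand-in for the message size check.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for HeronTupleSet; not the generated protobuf class.
struct TupleSetLike {
  std::string payload;
  std::size_t RetainedSize() const { return payload.capacity(); }
};

// Hypothetical pool that refuses to keep oversized objects.
class BoundedPool {
 public:
  explicit BoundedPool(std::size_t max_retained)
      : max_retained_(max_retained) {}

  // Hand out a pooled object if one is available, else allocate a new one.
  std::unique_ptr<TupleSetLike> Acquire() {
    if (free_.empty()) return std::make_unique<TupleSetLike>();
    std::unique_ptr<TupleSetLike> msg = std::move(free_.back());
    free_.pop_back();
    return msg;
  }

  // Returns true if the object went back to the pool, false if it was
  // oversized and was freed instead.
  bool Release(std::unique_ptr<TupleSetLike> msg) {
    assert(msg != nullptr);
    if (msg->RetainedSize() > max_retained_) {
      return false;  // msg is destroyed here; memory returns to the allocator
    }
    msg->payload.clear();
    free_.push_back(std::move(msg));
    return true;
  }

  std::size_t FreeCount() const { return free_.size(); }

 private:
  std::size_t max_retained_;
  std::vector<std::unique_ptr<TupleSetLike>> free_;
};
```

Which size measure goes into the comparison is the open question in the review thread above; the sketch only shows where the check sits relative to the pool.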