Update backpressure implementation with runtime configurable threshold
Prior to this commit, the backpressure implementation would unschedule muted actors immediately after they sent a single message to a muted/overloaded/under pressure actor. While this works reasonably well, it does have some rough edges, as noted in #3382 and #2980.

This commit updates the backpressure implementation so that the threshold for when to unschedule an actor sending to muted/overloaded/under pressure actors can be dynamically controlled via a runtime CLI argument. It does this by separating an actor being muted from an actor being unscheduled: an actor is still muted immediately upon sending a message to a muted/overloaded/under pressure actor, but the threshold controls when the actor is unscheduled after it has been muted, resulting in a more gradual backpressure system.

This threshold-based backpressure should result in fewer stalls and more overall forward progress, because it allows muted actors to make some slow progress prior to being unscheduled.

The updated backpressure logic is as follows (at a high level):

* If sending a message to a muted/overloaded/under pressure actor, note that the sending actor should mute.
* At the end of the behavior, mark the actor as `muted` if it is not already `muted` and save the receiver actor info.
* At the end of the behavior, check to see if the actor is over the threshold.
  * If yes, unschedule the actor, mark it as `muted and unscheduled`, and finally add the sender and all muters to the scheduler mutemap (as used to happen previously after 1 message).
  * If no, stop processing messages but don't unschedule the actor.
* When an actor is no longer overloaded or under pressure, it will tell the schedulers to unmute all senders that were muted and unscheduled as a result of sending messages to it (same as previously).
* When an actor runs, it checks whether it still needs to stay `muted`, and if not, it tells the schedulers to unmute all senders that were muted and unscheduled as a result of sending messages to it.
* The actor also optimistically unmutes itself if it successfully processes a full batch or all messages in its queue, and tells the schedulers to unmute all senders that were muted and unscheduled as a result of sending messages to it.

The updated logic is functionally equivalent to the old backpressure implementation when the threshold is `1` message.

There is now a new command line argument to control the threshold for throttling/muting called `--ponymsgstilmute`. This option is specifically added for power users who want to have more control over how the runtime handles backpressure. Using a value of `1` for `--ponymsgstilmute` will result in actors getting muted and unscheduled after sending a single message to a muted/overloaded/under pressure actor, which is functionally equivalent to the previous backpressure implementation that did not employ gradual throttling. The threshold defaults to `10`.
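The end-of-behavior decision described above can be sketched as a small standalone C model. This is an illustrative sketch only, not the runtime's actual code: `actor_t`, `on_send`, `end_of_behavior`, `unmute`, and all field names are hypothetical. It also assumes the threshold is compared against a count of messages sent to muted/overloaded/under pressure actors, and that reaching the threshold (`>=`) triggers unscheduling, which is one reading consistent with a threshold of `1` matching the old behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified actor state; NOT the ponyc runtime's real layout. */
typedef struct actor_t
{
  bool under_pressure;    /* receiver side: overloaded or under pressure */
  bool should_mute;       /* noted while sending to a pressured/muted actor */
  bool muted;             /* actor stops processing its current batch */
  bool unscheduled;       /* actor is "muted and unscheduled" */
  size_t pressured_sends; /* assumed counter compared with the threshold */
} actor_t;

typedef enum { RUNNING, MUTED, MUTED_AND_UNSCHEDULED } outcome_t;

/* Called for every send: only note that the sender should mute. */
static void on_send(actor_t* sender, const actor_t* receiver)
{
  if(receiver->under_pressure || receiver->muted)
  {
    sender->should_mute = true;
    sender->pressured_sends++;
  }
}

/* Called at the end of a behavior: mute first, then apply the threshold. */
static outcome_t end_of_behavior(actor_t* a, size_t threshold)
{
  assert(threshold > 0);

  if(!a->should_mute)
    return RUNNING;

  a->muted = true; /* muting is immediate, regardless of the threshold */

  if(a->pressured_sends >= threshold)
  {
    /* The real runtime would also add the sender to the scheduler mutemap. */
    a->unscheduled = true;
    return MUTED_AND_UNSCHEDULED;
  }

  return MUTED; /* stop this batch, but stay schedulable */
}

/* Called when the receiver recovers or the actor optimistically unmutes. */
static void unmute(actor_t* a)
{
  a->should_mute = false;
  a->muted = false;
  a->unscheduled = false;
  a->pressured_sends = 0;
}
```

In this model, a sender that keeps messaging a pressured receiver with a threshold of `10` is reported as `MUTED` for its first nine such sends and as `MUTED_AND_UNSCHEDULED` on the tenth, while a threshold of `1` unschedules it on the first send.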