Support pausing and resuming consumers #4966
Comments
Should delay just be a parseable string? "1s", "2h"? If we can't parse we return an error. Do we want to have maximums and minimums, or start simple and add in limits as needed?
We don't have other cases of such strings in the API, and it's also a bit Go-centric, so Duration seems best. Let UIs handle it as they wish, be it strings like that in the CLI or some kind of picker on the web. Let's start simple.
OK, but if we use time.Duration then it's nanos, not millis. But I hear you on consistency.
Indeed - nanos. Will fix.
@neilalexander and @Jarema could you work with @ripienaar and this writeup and schedule this work?
@derekcollison this has been scheduled to start on the 5th of February, with a plan to finish before the 16th of February. @neilalexander will be working on it.
@ripienaar @neilalexander Can I ask for an update of the final design after recent discussions?
From my perspective, I think the pause/resume APIs are still the right direction. Details of how we actually implement that in a way that's not massive plumbing in the server are for @neilalexander to comment on.
I vote it should just be part of the consumer config, with no new API endpoints.
At this point I'd say let's just not add this feature. We can go back and find requirements. As it stands, the few requirements we do have will not be met without these extra APIs, so let's just close the issue and move on.
I thought it would be easier but not impossible. You are saying they would require securing just that functionality vs. general update, yes? And without general callouts we only have new APIs to secure individually, is that correct?
Yes, I think there is a need to cater for 2 distinct users: operational needs and configuration needs. Often configuration may not be changed without approvals by change advisory boards etc. Doing maintenance should not require a configuration change. Those doing maintenance should not need to be authorized to do a configuration change.
Capturing a discussion that keeps coming up around this one.
Question: Should the paused-until configuration be updatable as configuration?
Given this pattern, the question is who owns this property. If an administrator sets the pause state to x and the app starting up sets it to start-paused or unpaused, how is the system to distinguish between a normal app making the API call to create a paused/unpaused consumer and an admin asking for the consumer to be paused? I don't think the API has the context of who is calling it for what reason, and it would be undesirable to allow an unexpected config update by a starting worker to unpause a consumer.
It's essential that the responsibilities of creation and administration be separate here. It could be created paused, but an administrator must be able to unpause it and know that if that creation is run again it will not again be paused. Or, if an administrator overrides the pause from 1 hour to 10 minutes, that a service startup does not set it back to 1 hour. I can't think of a way to capture this distinction (except maybe (ab)using the
Related server PR #5066
Server PR has been merged 🎉
Proposed change
Introduce an API on
$JS.API.CONSUMER.PAUSE.*.*
that takes as request:
The consumer will set itself in a paused state but continue to handle acks for in-flight messages. No further message deliveries will be done after this point; other than deliveries being inhibited, the consumer functions as usual.
If a delay is given, a timer will auto-resume the consumer. If no time or a time in the past is given, a paused consumer will resume.
Consumer info includes 2 new fields:
The paused state and time would need to be persisted to the raft layer so that server restarts do not unpause paused consumers. This is done using the consumer configuration, which has a new value:
When given at create time, this creates a paused consumer. It is not updatable at runtime using a configuration update, but the PAUSE API will update this setting. Essentially, the only way to change this post-create is with the PAUSE API.
Advisories for pause and unpause to be added on
io.nats.jetstream.advisory.v1.consumer_pause
with pertinent info
Use case
It is difficult to schedule maintenance on central resources in a large distributed system where 100s or 1000s of clients are accessing data in a stream.
We would like to be able to pause a Consumer such that it appears healthy but just doesn't deliver any messages.
During the pause, maintenance can happen and resources accessed by clients will not be under constant pressure. Later the consumer can be unpaused and work will continue.
This would happen without impacting running clients - other than they would see pending messages in stream info but not get any deliveries.
This would apply to push and pull consumers.
Contribution
No response