kgo-repeater: rebuild client on consumer errors #6

Draft: wants to merge 2 commits into base: main

@jcsp jcsp commented Aug 31, 2022

This is parked here in case we need it. It was part of debugging #5959, written to rule out the possibility that the workers would have been fine if they had simply restarted on errors. In that issue it turned out that the workers were getting stuck without any consume errors at all.

Some errors might put the client in a stuck state where
it needs to be torn down and recreated. These are
likely to be bugs (hence exposing them in a counter for
tests to check), but we do not want them to manifest
as members mysteriously hanging: we should drop out
and rebuild the client so that we continue to generate load.
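
The drop-out-and-rebuild behaviour described above could look roughly like the sketch below. It is not the code in this PR: `Consumer`, `runWorker`, `errStop`, `scripted` and `simulate` are hypothetical names, the client is replaced by a stdlib-only fake, and the counter is simply returned rather than exported to tests.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Consumer is a hypothetical stand-in for a wrapped Kafka client.
// It is not a franz-go type.
type Consumer interface {
	Poll(ctx context.Context) error // nil means a fetch succeeded
	Close()
}

// errStop is a hypothetical sentinel meaning "shut down cleanly".
var errStop = errors.New("stop requested")

// runWorker polls until told to stop. On any other error it closes the
// client, counts the rebuild, and creates a fresh client so the worker
// keeps generating load instead of hanging.
func runWorker(ctx context.Context, newConsumer func() Consumer) int64 {
	var rebuilds int64
	for ctx.Err() == nil {
		c := newConsumer()
		for {
			err := c.Poll(ctx)
			if err == nil {
				continue
			}
			c.Close()
			if errors.Is(err, errStop) || ctx.Err() != nil {
				return rebuilds
			}
			rebuilds++ // assumed to be the counter tests would check
			break      // leave the poll loop and rebuild the client
		}
	}
	return rebuilds
}

// scripted is a fake consumer that fails a fixed number of times
// (shared across rebuilds) and then asks the worker to stop.
type scripted struct{ failuresLeft *int }

func (s scripted) Poll(ctx context.Context) error {
	if *s.failuresLeft == 0 {
		return errStop
	}
	*s.failuresLeft--
	return errors.New("unknown_server_error (simulated)")
}

func (s scripted) Close() {}

// simulate runs a worker against the fake and returns the rebuild count.
func simulate(failures int) int64 {
	left := failures
	return runWorker(context.Background(), func() Consumer {
		return scripted{&left}
	})
}

func main() {
	fmt.Println("rebuilds:", simulate(2)) // prints "rebuilds: 2"
}
```

A test can then assert that the rebuild count stays at zero in a healthy run, which keeps the underlying bugs visible even though the worker recovers.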

jcsp added 2 commits August 31, 2022 13:54
Give clients helpful names that simplify debugging
when a particular worker has an issue. The name shows
up in:
- kgo-repeater -debug log
- kgo-repeater -trace log (franz-go logs)
- redpanda server as the client ID

Related: redpanda-data/redpanda#5959
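
As an illustration of per-worker naming, here is a minimal sketch. The `host_pid_worker` format and the `clientName` helper are assumptions made for the example, not the scheme this commit uses. In a franz-go client such a string would typically be passed through the `kgo.ClientID` option.

```go
package main

import "fmt"

// clientName builds a hypothetical per-worker client ID. The format is
// an assumption for illustration only.
func clientName(host string, pid, worker int) string {
	return fmt.Sprintf("%s_%d_%d", host, pid, worker)
}

func main() {
	// Each worker gets a distinct, greppable name.
	for worker := 0; worker < 3; worker++ {
		fmt.Println(clientName("node1", 4242, worker))
	}
}
```

Including the host and process ID alongside the worker index means that a name seen in server logs can be traced back to a single worker in a single process.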
If a consumer gets an unrecoverable error
(e.g. an unknown_server_error) that leaves the
client library in a confused state, we may
need to totally tear it down and make a new one.
@jcsp jcsp marked this pull request as draft August 31, 2022 12:56