-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: More detailed tracing for distributors #14504
base: main
Are you sure you want to change the base?
Conversation
@@ -594,36 +601,40 @@ func (d *Distributor) Push(ctx context.Context, req *logproto.PushRequest) (*log | |||
const maxExpectedReplicationSet = 5 // typical replication factor 3 plus one for inactive plus one for luck | |||
var descs [maxExpectedReplicationSet]ring.InstanceDesc | |||
|
|||
tracker := pushTracker{ | |||
kafkaTracker := pushTracker{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that is unclear to me is what was the behavior before when sharing a common tracker, and what is the behavior now having separate trackers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API behaviour is exactly the same. The only benefit of the separate trackers is that we can monitor when all the ingesters complete vs when all the kafka writes complete. Before, the tracker only fired "done" once both sets of writes had completed and there was no way to tell which one was faster.
Initially I did this so we could create a histogram from the timings, but I've since replaced it with Spans. We do get subspans from the ingesters once we hit their Push endpoint, but there are no sub-spans from writing to Kafka which is what I wanted to amend here.
In general LGTM, but a couple questions before I ✅ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but it's the first time I reviewed tracing instrumentation so take it with pinch of salt
d619097
to
f72afaf
Compare
What this PR does / why we need it:
Improves the tracing in the distributor. Currently we just get a single span for the whole request.