Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify distribution summary docs #182

Merged
merged 4 commits into from
Aug 21, 2024
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 7 additions & 2 deletions docs/spectator/core/meters/dist-summary.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ via a filter, or use one of the operators below to generate a useful response.

### Average Measurement (:dist-avg)

To compute the average latency across an arbitrary group, use the [:dist-avg] function:
To compute the average size across an arbitrary group, use the [:dist-avg] function:

@@@ atlas-stacklang
/api/v1/graph?q=nf.cluster,foo,:eq,name,http.req.payload.size,:eq,:and,:dist-avg,(,nf.asg,),:by
Expand All @@ -38,12 +38,17 @@ To compute the average latency across an arbitrary group, use the [:dist-avg] fu

### Maximum Measurement (:dist-max)

To compute the maximum latency across a group, use [:dist-max]:
To compute the maximum size across a group, use [:dist-max]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks.


@@@ atlas-stacklang
/api/v1/graph?q=nf.cluster,foo,:eq,name,http.req.payload.size,:eq,:and,:dist-max,(,nf.asg,),:by
@@@

!!! Note
Distribution summaries do not aggregate well over dimensions. Each measurement is recorded with a mapping of tags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really have anything to do with :dist-max specifically, it should probably be moved to the initial section or possibly just as a warning in the IPC spec for the inflight stuff. Distribution summaries aggregate just fine is used as intended, i.e. summarizing recorded samples such as payload sizes for requests. The :dist-max will be the maximum recorded sample that matches the query criteria. The :dist-avg will be the average of the recorded samples that match the query criteria. Possible rewording:

Distribution summaries are intended for use-cases tracking the size of recorded samples. For example, request payload sizes to understand the average or maximum recorded payload size for a given scope. It should not be used for use-cases trying to perform hierarchical accumulation such as number of inflight requests across different levels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this to the IPC spec for inflight metrics, and modified the wording to include some of your rewording. I don't want to make it sound like we shouldn't use distribution summary at all for inflight metrics, since we aren't deprecating it (for now). I still think it can be useful for customers to see the dist-max on certain filters, but we've added new dashboards for queueing approximation.

Therefore, when using [:dist-max] to query over a set of filters, the response will represent the maximum size for a
given set of tag values within the group, **not** the accumulate value of measurements across the entire group.

[:dist-max]: ../../../asl/ref/dist-max.md

### Standard Deviation of Measurement (:dist-stddev)
Expand Down