Doc: Add topic and expand info for in-memory queue #13246

Merged
merged 6 commits into from
Oct 5, 2021

Conversation


@karenzone karenzone commented Sep 22, 2021

Release notes

[rn:skip]

What does this PR do?

Adds a topic with considerations and config info related to the in-memory queue. This approach facilitates easier comparison and brings the memory queue info on par with the persistent queue (PQ) info.

Why is it important/What is the impact to the user?

Users need more info on sizing the internal memory queue and need to know that settings are per pipeline.

Related: #13232

PREVIEW: https://logstash_13246.docs-preview.app.elstc.co/guide/en/logstash/master/memory-queue.html

karenzone (Contributor, Author) commented:

@robbavey Let's be sure we're on the same page with this approach, please.

@karenzone karenzone marked this pull request as ready for review September 29, 2021 16:44
@karenzone karenzone requested a review from robbavey September 29, 2021 16:44
docs/static/resiliency.asciidoc (outdated, resolved)
docs/static/mem-queue.asciidoc (resolved)
This mechanism helps Logstash control the rate of data flow at the input stage
without overwhelming outputs like Elasticsearch.

ToDo: Is the next paragraph accurate for MQ?
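The back-pressure behavior described in the excerpt above — inputs blocking when the fixed-capacity queue is full, and resuming once workers drain it — can be illustrated with a bounded queue. This is a minimal analogy in Python, not Logstash's actual (Java) implementation:

```python
import queue
import threading
import time

# A bounded queue models Logstash's fixed-capacity in-memory queue.
# When it is full, put() blocks: the producer (input stage) experiences
# back pressure until a worker frees a slot.
events = queue.Queue(maxsize=4)

def worker():
    while True:
        event = events.get()
        if event is None:          # sentinel: shut down
            break
        time.sleep(0.01)           # simulate filter/output processing
        events.task_done()

t = threading.Thread(target=worker)
t.start()

for i in range(20):
    events.put(f"event-{i}")       # blocks whenever 4 events are in flight

events.join()                      # wait until every event is processed (ACKed)
events.put(None)
t.join()
print("all events processed")
```

The producer never outruns the consumer by more than the queue capacity; that bounded gap is what keeps downstream outputs from being overwhelmed.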
robbavey (Member) replied:

It's not quite right... It's a little more complicated for this plugin: the beats input does accept new connections (unfortunately; see this PR for an attempted mitigation), but it will not accept new events from already-connected Beats agents until the filter and output stages finish processing existing events.

karenzone (Contributor, Author) replied:

As that PR is WIP, does it make sense to delete this part:

For example, when the
<<plugins-inputs-beats,beats input>> encounters back pressure, it no longer
accepts new connections.
It waits until the queue has space to accept more events.
After the filter and output stages finish processing existing
events in the queue and ACKs them, Logstash automatically starts accepting new
events.

If not, what's your recommendation on making this right?

karenzone (Contributor, Author) commented Sep 29, 2021:

@karenzone to open an issue to detail how different inputs handle backpressure: #13257

@karenzone karenzone requested a review from robbavey September 29, 2021 21:06
docs/static/mem-queue.asciidoc (outdated, resolved)
==== Memory queue size

Memory queue size is not configured directly.
Multiply the `pipeline.batch.size` and `pipeline.workers` values to get the size of the memory queue.
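Worked arithmetic for the sizing rule above, using what I believe are the documented defaults (`pipeline.batch.size` defaults to 125; `pipeline.workers` defaults to the number of host CPU cores — verify against the docs for your Logstash version). The worker count of 8 is an illustrative assumption:

```python
# Illustrative arithmetic for the memory queue sizing rule:
# max in-flight events = pipeline.batch.size * pipeline.workers
pipeline_batch_size = 125   # default pipeline.batch.size
pipeline_workers = 8        # assumed: defaults to the number of CPU cores

max_inflight_events = pipeline_batch_size * pipeline_workers
print(max_inflight_events)  # 1000
```

Note the "size" here is a count of events, not bytes, so actual memory use varies with event payload size.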
robbavey (Member):
Size is kind of interesting. How we define size for memory queues is different from the PQ: for the PQ we talk about the number of bytes on disk, whereas memory queue size is defined by the number of events, which can vary greatly depending on the event payload.

Maybe something like "The maximum number of events that can be held in each memory queue is equal to the value of pipeline.batch.size multiplied by pipeline.workers" (maybe add defaults here?).

I don't know if we need any content referring to the fact that each pipeline has its own queue here as well.

robbavey (Member):
Pipeline->pipeline has its own complexity, but I think that might be something we could talk about in a separate PR. Maybe in an advanced section, or pipeline->pipeline considerations?

karenzone (Contributor, Author) replied:
> Pipeline->pipeline has its own complexity, but I think that might be something we could talk about in a separate PR. Maybe in an advanced section, or pipeline->pipeline considerations?

New issue to track: #13275

@karenzone karenzone requested a review from robbavey October 4, 2021 23:24

@robbavey robbavey left a comment


One small suggestion, but I think we're really close

docs/static/mem-queue.asciidoc (outdated, resolved)

@robbavey robbavey left a comment


LGTM

karenzone added a commit to karenzone/logstash that referenced this pull request Oct 5, 2021
karenzone added a commit that referenced this pull request Oct 5, 2021
karenzone added a commit that referenced this pull request Oct 5, 2021
3 participants