Skip to content

Commit

Permalink
Doc: Add topic and expand info for in-memory queue (elastic#13246)
Browse files Browse the repository at this point in the history
Co-authored-by: Rob Bavey <[email protected]>
  • Loading branch information
karenzone and robbavey committed Oct 5, 2021
1 parent 68dc01d commit 1a2cfad
Show file tree
Hide file tree
Showing 3 changed files with 67 additions and 1 deletion.
3 changes: 3 additions & 0 deletions docs/index.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,9 @@ include::static/filebeat-modules.asciidoc[]
:edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/resiliency.asciidoc
include::static/resiliency.asciidoc[]

:edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/mem-queue.asciidoc
include::static/mem-queue.asciidoc[]

:edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/persistent-queues.asciidoc
include::static/persistent-queues.asciidoc[]

Expand Down
61 changes: 61 additions & 0 deletions docs/static/mem-queue.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
[[memory-queue]]
=== Memory queue

By default, Logstash uses in-memory bounded queues between pipeline stages (inputs → pipeline workers) to buffer events.
If Logstash experiences a temporary machine failure, the contents of the memory queue will be lost.
Temporary machine failures are scenarios where Logstash or its host machine are terminated abnormally, but are capable of being restarted.

[[mem-queue-benefits]]
==== Benefits of memory queues

The memory queue might be a good choice if you value throughput over data resiliency.

* Easier configuration
* Easier management and administration
* Faster throughput

[[mem-queue-limitations]]
==== Limitations of memory queues

* Can lose data in abnormal termination
* Don't do well handling sudden bursts of data, where extra capacity in needed for {ls} to catch up

TIP: Consider using <<persistent-queues,persistent queues>> to avoid these limitations.

[[sizing-mem-queue]]
==== Memory queue size

Memory queue size is not configured directly.
It is defined by the number of events, the size of which can vary greatly depending on the event payload.

The maximum number of events that can be held in each memory queue is equal to
the value of `pipeline.batch.size` multiplied by the value of
`pipeline.workers`.
This value is called the "inflight count."

NOTE: Each pipeline has its own queue.

See <<tuning-logstash>> for more info on the effects of adjusting `pipeline.batch.size` and `pipeline.workers`.

[[mq-settings]]
===== Settings that affect queue size

These values can be configured in `logstash.yml` and `pipelines.yml`.

pipeline.batch.size::
Number events to retrieve from inputs before sending to filters+workers
The default is 125.

pipelines.workers::
Number of workers that will, in parallel, execute the filters+outputs stage of the pipeline.
This value defaults to the number of the host's CPU cores.

[[backpressure-mem-queue]]
==== Back pressure

When the queue is full, Logstash puts back pressure on the inputs to stall data
flowing into Logstash.
This mechanism helps Logstash control the rate of data flow at the input stage
without overwhelming outputs like Elasticsearch.

Each input handles back pressure independently.
4 changes: 3 additions & 1 deletion docs/static/resiliency.asciidoc
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
[[resiliency]]
== Data resiliency
== Queues and data resiliency

By default, Logstash uses <<memory-queue,in-memory bounded queues>> between pipeline stages (inputs → pipeline workers) to buffer events.

As data flows through the event processing pipeline, Logstash may encounter
situations that prevent it from delivering events to the configured
Expand Down

0 comments on commit 1a2cfad

Please sign in to comment.