From 1a2cfad03c6742c539f91720d8ef280313972f7c Mon Sep 17 00:00:00 2001 From: Karen Metts <35154725+karenzone@users.noreply.github.com> Date: Tue, 5 Oct 2021 18:12:02 -0400 Subject: [PATCH] Doc: Add topic and expand info for in-memory queue (#13246) Co-authored-by: Rob Bavey --- docs/index.asciidoc | 3 ++ docs/static/mem-queue.asciidoc | 61 +++++++++++++++++++++++++++++++++ docs/static/resiliency.asciidoc | 4 ++- 3 files changed, 67 insertions(+), 1 deletion(-) create mode 100644 docs/static/mem-queue.asciidoc diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 09ac00ed579..96ceb1dc307 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -180,6 +180,9 @@ include::static/filebeat-modules.asciidoc[] :edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/resiliency.asciidoc include::static/resiliency.asciidoc[] +:edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/mem-queue.asciidoc +include::static/mem-queue.asciidoc[] + :edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/persistent-queues.asciidoc include::static/persistent-queues.asciidoc[] diff --git a/docs/static/mem-queue.asciidoc b/docs/static/mem-queue.asciidoc new file mode 100644 index 00000000000..110716bacd9 --- /dev/null +++ b/docs/static/mem-queue.asciidoc @@ -0,0 +1,61 @@ +[[memory-queue]] +=== Memory queue + +By default, Logstash uses in-memory bounded queues between pipeline stages (inputs → pipeline workers) to buffer events. +If Logstash experiences a temporary machine failure, the contents of the memory queue will be lost. +Temporary machine failures are scenarios where Logstash or its host machine are terminated abnormally, but are capable of being restarted. + +[[mem-queue-benefits]] +==== Benefits of memory queues + +The memory queue might be a good choice if you value throughput over data resiliency. + +* Easier configuration +* Easier management and administration +* Faster throughput + +[[mem-queue-limitations]] +==== Limitations of memory queues + +* Can lose data in abnormal termination +* Don't do well handling sudden bursts of data, where extra capacity in needed for {ls} to catch up + +TIP: Consider using <> to avoid these limitations. + +[[sizing-mem-queue]] +==== Memory queue size + +Memory queue size is not configured directly. +It is defined by the number of events, the size of which can vary greatly depending on the event payload. + +The maximum number of events that can be held in each memory queue is equal to +the value of `pipeline.batch.size` multiplied by the value of +`pipeline.workers`. +This value is called the "inflight count." + +NOTE: Each pipeline has its own queue. + +See <> for more info on the effects of adjusting `pipeline.batch.size` and `pipeline.workers`. + +[[mq-settings]] +===== Settings that affect queue size + +These values can be configured in `logstash.yml` and `pipelines.yml`. + +pipeline.batch.size:: +Number events to retrieve from inputs before sending to filters+workers +The default is 125. + +pipelines.workers:: +Number of workers that will, in parallel, execute the filters+outputs stage of the pipeline. +This value defaults to the number of the host's CPU cores. + +[[backpressure-mem-queue]] +==== Back pressure + +When the queue is full, Logstash puts back pressure on the inputs to stall data +flowing into Logstash. +This mechanism helps Logstash control the rate of data flow at the input stage +without overwhelming outputs like Elasticsearch. + +Each input handles back pressure independently. diff --git a/docs/static/resiliency.asciidoc b/docs/static/resiliency.asciidoc index 9f21ba427a5..ff347097bda 100644 --- a/docs/static/resiliency.asciidoc +++ b/docs/static/resiliency.asciidoc @@ -1,5 +1,7 @@ [[resiliency]] -== Data resiliency +== Queues and data resiliency + +By default, Logstash uses <> between pipeline stages (inputs → pipeline workers) to buffer events. As data flows through the event processing pipeline, Logstash may encounter situations that prevent it from delivering events to the configured