diff --git a/docs/static/mem-queue.asciidoc b/docs/static/mem-queue.asciidoc
index b29be1e260c..b30f4f513e9 100644
--- a/docs/static/mem-queue.asciidoc
+++ b/docs/static/mem-queue.asciidoc
@@ -1,5 +1,5 @@
 [[memory-queue]]
-=== Memory queue
+=== Memory queue (in-memory queue?)
 
 By default, Logstash uses in-memory bounded queues between pipeline stages
 (inputs → pipeline workers) to buffer events. The size of these in-memory
@@ -8,3 +8,103 @@ machine failure, the contents of the in-memory queue will be lost. Temporary mac
 failures are scenarios where Logstash or its host machine are terminated
 abnormally but are capable of being restarted.
 
+
+[[mem-queue-benefits]]
+==== Benefits of memory queue
+
+The memory queue might be a good choice if you value throughput over data resiliency.
+
+* Easier configuration
+* Easier management and administration
+* Faster throughput
+
+
+[[mem-queue-limitations]]
+==== Limitations of memory queue
+
+* Can lose data in abnormal termination
+* Not a good choice for data you can't afford to lose
+
+
+[[configuring-mem-queue]]
+==== Configuring in-memory queue
+
+// Notes: mem queue is default.
+//ToDo: Check into single sourcing settings for use with PQ and MQ
+
+
+/////
+Adjust text and placement to avoid redundancy between PQ and MQ.
+Maybe document under "Resiliency" and link back to it from here?
+Use same approach for PQ
+
+[[backpressure-mem-queue]]
+==== Handling Back Pressure
+
+When the queue is full, Logstash puts back pressure on the inputs to stall data
+flowing into Logstash. This mechanism helps Logstash control the rate of data
+flow at the input stage without overwhelming outputs like Elasticsearch.
+
+Use the `queue.max_bytes` setting to configure the total capacity of the queue on
+disk. The following example sets the total capacity of the queue to 8gb:
+
+[source, yaml]
+queue.type: persisted
+queue.max_bytes: 8gb
+
+With these settings specified, Logstash will buffer events on disk until the
+size of the queue reaches 8gb. When the queue is full of unACKed events, and
+the size limit has been reached, Logstash will no longer accept new events.
+
+Each input handles back pressure independently. For example, when the
+<<plugins-inputs-beats,beats>> input encounters back pressure, it no longer
+accepts new connections and waits until the persistent queue has space to accept
+more events. After the filter and output stages finish processing existing
+events in the queue and ACK them, Logstash automatically starts accepting new
+events.
+
+/////
+
+
+/////
+
+Is this concept applicable for MQ?
+[[durability-mq]]
+==== Controlling Durability for memory queue
+
+Durability is a property of storage writes that ensures data will be available after it's written.
+
+When the persistent queue feature is enabled, Logstash will store events on
+disk. Logstash commits to disk in a mechanism called checkpointing.
+
+To discuss durability, we need to introduce a few details about how the persistent queue is implemented.
+
+First, the queue itself is a set of pages. There are two kinds of pages: head pages and tail pages. The head page is where new events are written. There is only one head page. When the head page is of a certain size (see `queue.page_capacity`), it becomes a tail page, and a new head page is created. Tail pages are immutable, and the head page is append-only.
+Second, the queue records details about itself (pages, acknowledgements, etc.) in a separate file called a checkpoint file.
+
+When recording a checkpoint, Logstash will:
+
+* Call fsync on the head page.
+* Atomically write to disk the current state of the queue.
+
+The process of checkpointing is atomic, which means any update to the file is saved if successful.
+
+If Logstash is terminated, or if there is a hardware-level failure, any data
+that is buffered in the persistent queue, but not yet checkpointed, is lost.
+
+You can force Logstash to checkpoint more frequently by setting
+`queue.checkpoint.writes`. This setting specifies the maximum number of events
+that may be written to disk before forcing a checkpoint. The default is 1024. To
+ensure maximum durability and avoid losing data in the persistent queue, you can
+set `queue.checkpoint.writes: 1` to force a checkpoint after each event is
+written. Keep in mind that disk writes have a resource cost. Setting this value
+to `1` can severely impact performance.
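+
+As an illustrative sketch only (not a recommendation), forcing a checkpoint
+after every event written to the persistent queue would look like this in
+`logstash.yml`:
+
+[source, yaml]
+queue.type: persisted
+queue.checkpoint.writes: 1
+
+Leaving the default of 1024 keeps throughput higher, at the cost of a larger
+window of events that have been written but not yet checkpointed.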
+/////
+
+/////
+Applicable for MQ?
+[[garbage-collection-mq]]
+==== Disk Garbage Collection
+
+On disk, the queue is stored as a set of pages where each page is one file. Each page can be at most `queue.page_capacity` in size. Pages are deleted (garbage collected) after all events in that page have been ACKed. If an older page has at least one event that is not yet ACKed, that entire page will remain on disk until all events in that page are successfully processed. Each page containing unprocessed events will count against the `queue.max_bytes` byte size.
+/////
diff --git a/docs/static/resiliency.asciidoc b/docs/static/resiliency.asciidoc
index 9f21ba427a5..13b5fdd7b9e 100644
--- a/docs/static/resiliency.asciidoc
+++ b/docs/static/resiliency.asciidoc
@@ -1,6 +1,28 @@
 [[resiliency]]
 == Data resiliency
 
+
+/////
+What happens when the queue is full?
+Input plugins push data into the queue, and filters pull out. If the queue (persistent or memory) is full, then the input plugin thread blocks.
+
+See handling backpressure topic. Relocate this info for better visibility?
+/////
+
+
+/////
+Settings in logstash.yml and pipelines.yml can interact in unintuitive ways.
+
+A setting on a pipeline in pipelines.yml takes precedence, falling back to the value in logstash.yml if there is no setting present for the specific pipeline, and falling back to the default if there is no value present in logstash.yml.
+
+^^ This is true for any setting in both logstash.yml and pipelines.yml, but seems to trip people up in PQs. Other queues, too?
+/////
+
+
+//ToDo: Add MQ to discussion (for compare/contrast), even though it's not really considered a "resiliency feature". Messaging will need to be updated.
+
+
+
 As data flows through the event processing pipeline, Logstash may encounter
 situations that prevent it from delivering events to the configured output. For
 example, the data might contain unexpected data types, or
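To make the pipelines.yml precedence note above concrete, here is a minimal sketch. The pipeline ID and config path are hypothetical, and `queue.type` stands in for any setting that can appear in both files:

[source, yaml]
# logstash.yml -- instance-wide value, used by any pipeline that sets nothing itself
queue.type: persisted

[source, yaml]
# pipelines.yml -- a per-pipeline value takes precedence for that pipeline
- pipeline.id: example-pipeline
  path.config: "/usr/share/logstash/pipeline/example.conf"
  queue.type: memory

With these files, `example-pipeline` uses the memory queue, pipelines without their own `queue.type` fall back to the persisted queue from `logstash.yml`, and if neither file set the value, the built-in default would apply.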