Upgrading Logstash to 2.2
Logstash 2.2 re-architected the pipeline stages to improve performance and to enable future resiliency enhancements.
The new pipeline introduces micro-batching, processing groups of events at a time; the default batch size is
125 events per worker. The filter and output stages now execute in the same worker thread, though they remain distinct stages.
The CLI flag --pipeline-workers (or -w) controls the number of execution threads, which defaults to the number of CPU cores.
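For example, Logstash could be started with eight worker threads and an explicit batch size like this (the config file path is illustrative):

    bin/logstash -f /etc/logstash/conf.d/pipeline.conf --pipeline-workers 8 --pipeline-batch-size 125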
Considerations for Elasticsearch Output
The default batch size of the pipeline is 125 events per worker, and by default this is also the bulk size
used for the Elasticsearch output. The Elasticsearch output's flush_size now acts only as a maximum bulk
size (still defaulting to 500). For example, if your pipeline batch size is 3000 events, the Elasticsearch
output will send 500 events at a time, in 6 separate bulk requests. In other words, for the Elasticsearch output,
the bulk request size is chunked based on flush_size and --pipeline-batch-size. If flush_size is set greater
than --pipeline-batch-size, it is ignored and --pipeline-batch-size will be used.
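To illustrate that interaction, here is a minimal output sketch, assuming a single local Elasticsearch host; combined with --pipeline-batch-size 3000 on the command line, the flush_size of 500 below caps each bulk request at 500 events, while a flush_size larger than 3000 would simply be ignored:

    output {
      elasticsearch {
        hosts      => ["localhost:9200"]   # assumed host; adjust for your cluster
        flush_size => 500                  # maximum number of events per bulk request
      }
    }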
The default number of output workers in Logstash 2.2 is now equal to the number of pipeline workers (-w),
unless overridden in the Logstash config file. This can be problematic for some users, because the
extra workers may consume additional resources such as file handles, especially in the case of the Elasticsearch
output. Users with more than one Elasticsearch host may want to override the workers setting
for the Elasticsearch output in their Logstash config to constrain that number to a low value, between 1 and 4, as in the sketch below.
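A minimal sketch of such an override, assuming two hypothetical Elasticsearch hosts:

    output {
      elasticsearch {
        hosts   => ["es-node-1:9200", "es-node-2:9200"]  # hypothetical hosts
        workers => 2                                      # constrain output workers to a low value
      }
    }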
Performance Tuning in 2.2
Because the filter and output workers now run on the same thread, threads can sit idle in an I/O wait state.
In 2.2 you can therefore safely set -w to a multiple of the number of cores on your machine.
A common way to tune performance is to keep increasing -w beyond the number of cores until throughput no longer
improves. A note of caution: keep your heap size in mind, because the number of in-flight events is
#workers * batch_size, and the memory they consume is roughly #workers * batch_size * average_event_size.
More in-flight events add memory pressure and can eventually lead to Out of Memory errors.
You can change the heap size in Logstash by setting the LS_HEAP_SIZE environment variable.