Stalled Shutdown Detection

edit

Shutting down a running Logstash instance involves the following steps:

  • Stop all input, filter and output plugins
  • Process all in-flight events
  • Terminate the Logstash process

The following conditions affect the shutdown process:

  • An input plugin receiving data at a slow pace.
  • A slow filter, like a Ruby filter executing sleep(10000) or an Elasticsearch filter that is executing a very heavy query.
  • A disconnected output plugin that is waiting to reconnect to flush in-flight events.

These situations make the duration and success of the shutdown process unpredictable.

Logstash has a stall detection mechanism that analyzes the behavior of the pipeline and plugins during shutdown. This mechanism produces periodic information about the count of inflight events in internal queues and a list of busy worker threads.

To enable Logstash to forcibly terminate in the case of a stalled shutdown, use the --allow-unsafe-shutdown flag when you start Logstash.

Stall Detection Example

edit

In this example, slow filter execution prevents the pipeline from clean shutdown. By starting Logstash with the --allow-unsafe-shutdown flag, quitting with Ctrl+C results in an eventual shutdown that loses events.

bin/logstash -e 'input { generator { } } filter { ruby { code => "sleep 10000" } }
  output { stdout { codec => dots } }' -w 1 --allow-unsafe-shutdown
Settings: User set pipeline workers: 1, Default pipeline workers: 8
Pipeline main started
^CSIGINT received. Shutting down the agent. {:level=>:warn}
stopping pipeline {:id=>"main"}
Received shutdown signal, but pipeline is still waiting for in-flight events
to be processed. Sending another ^C will force quit Logstash, but this may cause
data loss. {:level=>:warn}
{"inflight_count"=>125, "stalling_thread_info"=>{["LogStash::Filters::Ruby",
{"code"=>"sleep 10000"}]=>[{"thread_id"=>17, "name"=>"[main]>worker0",
"current_call"=>"(ruby filter code):1:in `sleep'"}]}} {:level=>:warn}
The shutdown process appears to be stalled due to busy or blocked plugins.
Check the logs for more information. {:level=>:error}
{"inflight_count"=>125, "stalling_thread_info"=>{["LogStash::Filters::Ruby",
{"code"=>"sleep 10000"}]=>[{"thread_id"=>17, "name"=>"[main]>worker0",
"current_call"=>"(ruby filter code):1:in `sleep'"}]}} {:level=>:warn}
{"inflight_count"=>125, "stalling_thread_info"=>{["LogStash::Filters::Ruby",
{"code"=>"sleep 10000"}]=>[{"thread_id"=>17, "name"=>"[main]>worker0",
"current_call"=>"(ruby filter code):1:in `sleep'"}]}} {:level=>:warn}
Forcefully quitting logstash.. {:level=>:fatal}

When --allow-unsafe-shutdown isn’t enabled, Logstash continues to run and produce these reports periodically.