Logstash 5.0.0 Release Notes

  • A new monitoring feature provides runtime visibility into the Logstash pipeline and its plugins. This component collects various kinds of operational metrics while Logstash processes your data, and all of this information can be queried using simple APIs (an example query appears after this list). Please refer to Monitoring APIs for details.
  • Improved throughput performance across the board (up by 2x in some configs) by implementing Event representation in Java. Event is the main object that encapsulates data as it flows through Logstash and provides APIs for the plugins to perform processing. This change also enables faster serialization for future persistence work (Issue 4191).
  • Breaking Change: Renamed filenames for Debian and RPM artifacts to match Elasticsearch’s naming scheme. The metadata is still the same, so upgrades will not be affected. Please refer to the new directory structure. If you have automated downloads for Logstash, please make sure you use the updated URLs (Issue 5100).
  • Introduced a new way to configure application settings for Logstash through a logstash.yml file. This file is typically located in LS_HOME/config, or in /etc/logstash when installed via packages. Logstash will not start without this file, so if you start Logstash manually after installing it via a package (RPM or DEB), make sure to pass path.settings (see the startup example after this list) (Issue 4401).
  • Breaking Change: Most of the long form CLI options have been renamed to match the settings defined in logstash.yml.
  • Breaking Change: For plugin developers, the Event class has a new API to access its data. You will no longer be able to directly use the Event class through the Ruby hash paradigm. All the plugins packaged with Logstash have been updated to use the new API, and their versions have been bumped to the next major. Please refer to Event API for details (Issue 5141).
  • Breaking Change: Environment variables inside the Logstash config are now evaluated by default, so there is no longer any need to specify the --allow-env feature flag (see the configuration sketch after this list).
  • Breaking Change: Renamed bin/plugin to bin/logstash-plugin. This is to prevent PATH being polluted when other components of the Elastic Stack are installed on the same instance (Issue 4891).
  • Added support for DEBUG=1 when running any plugin-related commands. This option prints additional information that is useful when debugging unexpected behavior in bin/logstash-plugin.
  • Logging Changes: Migrated Logstash’s internal logging framework to Log4j2. This enhancement provides the following features:

    • Support for changing the logging level dynamically at runtime through REST endpoints. New APIs have been exposed under _node/logging to update log levels. You can also list all existing loggers by sending a GET request to this API (see the curl examples after this list).
    • Configurable file rotation policy for logs. The default policy rotates logs daily.
    • Support for component-level or plugin-level log settings.
    • Unified logging across Logstash’s Java and Ruby code.
    • Logs are now placed in the LS_HOME/logs directory, which is configurable via the path.logs setting. For deb/rpm packages, logs are placed in /var/log/logstash/ by default.
    • Changed the default log severity level to INFO instead of WARN to match Elasticsearch.
    • Logstash can now emit its log in structured, JSON format. Specify log.format=json in the settings file or via the command line (Issue 1569).
  • Added support for systemd and upstart so you can now manage Logstash as a service on most Linux distributions (Issue 5012).
  • Fixed a bug where Logstash would not shut down when CTRL-C was used while the stdin input was part of the configuration (Issue 1769).
  • Created a new LS_HOME/data directory to store plugin states, Logstash instance UUID, and more. This directory location is configurable via the path.data setting in the logstash.yml settings file (Issue 5404).
  • Made bin/logstash -V/--version run faster on Unix platforms.
  • Introduced a performance optimization called bi-values to store both JRuby and Java object types. This optimization benefits plugins written in Ruby.
  • Show meaningful error messages for unknown CLI commands (Issue 5748).
  • Added the ability to configure a custom garbage collection log file using $LS_LOG_DIR.
  • Plugin Developers: Improved nomenclature and methods for threadsafe outputs. Removed the workers_not_supported method (Issue 5662).
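
For example, assuming Logstash's API endpoint is running on its default localhost:9600 binding, the pipeline metrics mentioned above can be queried like this:

    curl -XGET 'http://localhost:9600/_node/stats/pipeline?pretty'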
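
For a package installation, a manual start that points Logstash at the packaged settings directory might look like the following sketch (the install path shown is the package default):

    /usr/share/logstash/bin/logstash --path.settings /etc/logstash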
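
A minimal sketch of environment-variable evaluation in a pipeline config, assuming a TCP_PORT variable is set in the environment (the variable name and port are purely illustrative):

    input {
      tcp {
        # ${TCP_PORT} is resolved from the environment at startup;
        # a fallback can be written as ${TCP_PORT:5000}
        port => "${TCP_PORT}"
      }
    }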
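
To change a log level at runtime, send a PUT request to the logging API with a logger-to-level mapping; the specific logger name below is only an illustration:

    # raise one component's verbosity to DEBUG
    curl -XPUT 'http://localhost:9600/_node/logging?pretty' \
      -H 'Content-Type: application/json' \
      -d '{ "logger.logstash.outputs.elasticsearch" : "DEBUG" }'

    # list all existing loggers and their levels
    curl -XGET 'http://localhost:9600/_node/logging?pretty'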

Input Plugins

Beats:

  • Improved throughput performance by reimplementing the beats input plugin in Java and using Netty, an asynchronous I/O library. These changes resulted in up to 50% gains in throughput performance while preserving the original plugin functionality (Issue 92).

JDBC:

  • Added the charset config option to support setting the character encoding for strings that are not in UTF-8 format. You can use the columns_charset option to override this encoding setting for individual columns (Issue 143).
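
A sketch of the two options together, assuming a database whose text columns are mostly ISO-8859-1 but whose notes column uses CP1252 (connection settings and names are illustrative, and the required jdbc_* connection options are elided):

    input {
      jdbc {
        # jdbc_connection_string, jdbc_driver_class, etc. omitted ...
        statement => "SELECT * FROM products"
        charset => "ISO-8859-1"                      # default for all string columns
        columns_charset => { "notes" => "CP1252" }   # per-column override
      }
    }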

Kafka:

  • Added support for Kafka broker 0.10. This plugin now supports SSL-based encryption. This release changes many configuration options, so it is not backward compatible; it also does not work with older Kafka brokers.
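
As a sketch of the new-style configuration (broker address and topic are illustrative), the plugin now talks to brokers directly via bootstrap_servers rather than through ZooKeeper:

    input {
      kafka {
        bootstrap_servers => "localhost:9092"   # replaces the old zookeeper_connect setting
        topics => ["logstash"]                  # replaces topic_id, and now takes an array
      }
    }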

HTTP:

  • Fixed a bug where the HTTP input plugin blocked the node stats API (Issue 51).

HTTP Poller:

  • Added meaningful error messages for missing trust store/keystore passwords. Also documented the creation of a custom keystore.

RabbitMQ:

  • Removed the verify_ssl option, which was never actually used. To validate SSL certificates, use the ssl_certificate_path and ssl_certificate_password config options (Issue 82).
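
A migration sketch under the assumption that you previously relied on verify_ssl; the host, certificate path, and password are illustrative:

    input {
      rabbitmq {
        host => "rabbit.example.com"
        ssl  => true
        ssl_certificate_path => "/etc/logstash/client.p12"
        ssl_certificate_password => "changeme"
      }
    }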

Stdin:

  • This plugin is now non-blocking, so you can use CTRL-C to stop Logstash.

Elasticsearch:

  • This plugin is now compatible with Elasticsearch 5.0.0. The scan search type has been replaced by the scroll API.

UDP:

  • Fixed performance regression due to IO.select being called for every packet (Issue 21).

Filter Plugins

Grok:

  • Added support for canceling long-running execution. Users often write runaway regular expressions that stall Logstash; you can now configure timeout_millis to cancel the current execution and continue processing the event downstream (Issue 82).
  • Added a stats counter on grok matches and failures. This is exposed in the _node/stats/pipeline endpoint.
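
A sketch of bounding a grok match (the pattern is illustrative; the timeout value is in milliseconds):

    filter {
      grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        timeout_millis => 30000   # abandon a single match attempt after 30s
      }
    }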

Date:

  • Added a stats counter on date matches and failures. This is exposed in the _node/stats/pipeline endpoint.

GeoIP:

  • Added support for the GeoIP2 city database and support for IPv6 lookups (Issue 23).

DNS:

  • Improved performance by caching both successful and failed lookups.
  • Added support for retrying with the :max_retries setting.
  • Lowered the default value of timeout from 2 to 0.5 seconds.
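
A sketch combining the new retry and cache behavior (the field name and cache sizes are illustrative; the cache option names follow the plugin docs):

    filter {
      dns {
        resolve => ["hostname"]     # field containing the name to look up
        action  => "replace"
        max_retries => 3            # retry failed lookups
        timeout => 0.5              # the new, lower default shown explicitly
        hit_cache_size => 1000      # cache successful lookups
        failed_cache_size => 1000   # cache failed lookups as well
      }
    }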

CSV:

  • Added the autodetect_column_names option to read column names from the header.
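
A minimal sketch; with autodetect_column_names enabled, the first line seen is treated as the header row:

    filter {
      csv {
        autodetect_column_names => true
      }
    }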

XML:

  • Breaking Change: Added a new configuration called suppress_empty, which defaults to true. This changes the default behavior of the plugin in favor of avoiding mapping conflicts when the data reaches Elasticsearch (Issue 24).
  • Added a new configuration called force_content. By default, the filter expands attributes differently for content in XML elements. This option allows you to force text content and attributes to always parse to a hash value (Issue 14).
  • Fixed a bug by ensuring that a target is set when storing XML content in the event (store_xml => true).
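
A sketch tying the three changes together (field names are illustrative):

    filter {
      xml {
        source => "message"      # field holding the raw XML
        target => "parsed"       # must be set when store_xml => true
        store_xml => true
        suppress_empty => true   # the new default: skip empty elements
        force_content => true    # parse text content and attributes uniformly
      }
    }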

Output Plugins

Elasticsearch:

  • Breaking Change: The index template for 5.0 has been changed to reflect Elasticsearch’s mapping changes. Most importantly, the subfield for string multi-fields has changed from .raw to .keyword to match Elasticsearch’s default behavior (Issue 386). See Breaking changes for details about how this change affects new and existing users.
  • Added check_connection_timeout parameter, which has a default of 10m.
  • Added the ability for the plugin to choose which default template to use based on the Elasticsearch version (Issue 401).
  • The Elasticsearch output is now fully threadsafe. This means internal resources can be shared among multiple output { elasticsearch {} } instances.
  • Improved sniffing so that existing connections do not have to be closed and reopened after a sniff round.
  • Introduced a connection pool to reuse connections to Elasticsearch backends.
  • Added exponential backoff to connection retries. The first retry waits retry_initial_interval seconds, and the wait grows exponentially on each subsequent failure up to a ceiling of retry_max_interval, the maximum time to wait between retries.
  • Added support for specifying ingest pipelines (Issue 410).
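
A sketch combining the retry and ingest-pipeline settings (the host and pipeline name are illustrative; the interval values shown are the documented defaults):

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        pipeline => "apache-enrich"     # ingest pipeline to run on the Elasticsearch side
        retry_initial_interval => 2     # seconds to wait after the first failure
        retry_max_interval => 64        # ceiling for the exponential backoff
      }
    }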

Tcp:

  • Added SSL/TLS support for certificate-based encryption.
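
A sketch of certificate-based encryption on the tcp output (the host, port, paths, and passphrase are illustrative):

    output {
      tcp {
        host => "logs.example.com"
        port => 5000
        ssl_enable => true
        ssl_cert => "/etc/logstash/tcp.crt"
        ssl_key  => "/etc/logstash/tcp.key"
        ssl_key_passphrase => "changeme"
      }
    }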

Kafka:

  • Made this output a shareable instance across multiple pipeline workers. This ensures efficient use of resources such as broker TCP connections and internal producer buffers.