Logstash plugins
The power of Logstash is in the plugins: inputs, outputs, filters, and codecs.
In Logstash on ECK, you can use the same plugins that you use for other Logstash instances, including Elastic-supported, community-supported, and custom plugins. However, you may have other factors to consider, such as how you configure your Kubernetes resources, how you specify additional resources, and how you scale your Logstash installation.
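This section covers:
- Providing additional resources for plugins
- Scaling Logstash on ECK
- Plugin-specific considerations
- Adding custom plugins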
Providing additional resources for plugins
The plugins in your pipeline can affect how you configure your Kubernetes resources, including the need to specify additional resources in your manifest. The most common resources you need to allow for are:
- Read-only assets, such as private keys, translate dictionaries, or JDBC drivers
- Writable storage to save application state
Read-only assets
Many plugins require or allow read-only assets in order to work correctly. These may be small files stored in ConfigMaps or Secrets, which have a 1 MiB size limit, or larger assets, such as JDBC drivers, that need to be stored in a PersistentVolume.
ConfigMaps and Secrets (1 MiB max)
Each instance of a ConfigMap or Secret has a maximum size of 1 MiB (mebibyte).
For larger read-only assets, check out Larger read-only assets (1 MiB+).
In the plugin documentation, look for configurations that call for a path or an array of paths.
Sensitive assets, such as private keys
Some plugins need access to private keys or certificates in order to access an external resource. Make the keys or certificates available to the Logstash resource in your manifest.
These settings are typically identified by an ssl_ prefix, such as ssl_key, ssl_keystore_path, or ssl_certificate.
To use these in your manifest, create a Secret representing the asset, add a Volume containing that Secret to podTemplate.spec.volumes, and then mount that Volume with a VolumeMount in the podTemplate.spec.containers section of your Logstash resource.
First, create your secrets.
    kubectl create secret generic logstash-crt --from-file=logstash.crt
    kubectl create secret generic logstash-key --from-file=logstash.key
Then, create your Logstash resource.
    spec:
      podTemplate:
        spec:
          volumes:
            - name: logstash-ssl-crt
              secret:
                secretName: logstash-crt
            - name: logstash-ssl-key
              secret:
                secretName: logstash-key
          containers:
            - name: logstash
              volumeMounts:
                # subPath mounts the single Secret key as a file; without it,
                # mountPath would be a directory containing the key
                - name: logstash-ssl-key
                  mountPath: "/usr/share/logstash/data/logstash.key"
                  subPath: logstash.key
                  readOnly: true
                - name: logstash-ssl-crt
                  mountPath: "/usr/share/logstash/data/logstash.crt"
                  subPath: logstash.crt
                  readOnly: true
      pipelines:
        - pipeline.id: main
          config.string: |
            input {
              http {
                port => 8443
                ssl_certificate => "/usr/share/logstash/data/logstash.crt"
                ssl_key => "/usr/share/logstash/data/logstash.key"
              }
            }
Static read-only files
Some plugins require or allow access to small static read-only files, which serve a variety of purposes.
Examples include custom grok patterns for logstash-filter-grok to use for lookup, source code for logstash-filter-ruby, a dictionary for logstash-filter-translate, or the location of a SQL statement for logstash-input-jdbc.
Make these files available to the Logstash resource in your manifest.
In the plugin documentation, these plugin settings are typically identified by path or an array of paths.
To use these in your manifest, create a ConfigMap or Secret representing the asset, add a Volume containing the ConfigMap or Secret to podTemplate.spec.volumes, and mount that Volume with a VolumeMount in the podTemplate.spec.containers section of your Logstash resource.
This example creates a ConfigMap from a Ruby source file and uses it in a logstash-filter-ruby plugin.
First, create the ConfigMap.
kubectl create configmap ruby --from-file=drop_some.rb
Then, create your Logstash resource.
    spec:
      podTemplate:
        spec:
          volumes:
            # Volume names must be valid DNS labels, so no underscores
            - name: ruby-drop
              configMap:
                name: ruby
          containers:
            - name: logstash
              volumeMounts:
                # subPath selects the drop_some.rb key so that mountPath is a file
                - name: ruby-drop
                  mountPath: "/usr/share/logstash/data/drop_percentage.rb"
                  subPath: drop_some.rb
                  readOnly: true
      pipelines:
        - pipeline.id: main
          config.string: |
            input {
              beats {
                port => 5044
              }
            }
            filter {
              ruby {
                path => "/usr/share/logstash/data/drop_percentage.rb"
                script_params => { "percentage" => 0.9 }
              }
            }
Larger read-only assets (1 MiB+)
Some plugins require or allow access to static read-only files that exceed the 1 MiB (mebibyte) limit imposed by ConfigMap and Secret.
For example, you may need JAR files to load drivers when using a JDBC or JMS plugin, or a large logstash-filter-translate dictionary.
You can add files using:
- PersistentVolume populated by an initContainer. Add a volumeClaimTemplate and a volumeMount to your Logstash resource and upload data to that volume, either using an initContainer, or direct upload if your Kubernetes provider supports it. You can use the default logstash-data volumeClaimTemplate, or a custom one depending on your storage needs.
- Custom Docker image. Use a custom Docker image that includes the static content that your Logstash pods will need.
Check out Custom configuration files and plugins for more details on which option might be most suitable for you.
Add files using PersistentVolume populated by an initContainer
This example creates a volumeClaimTemplate called workdir, with volumeMounts that mount it into both the main container and an initContainer. The initContainer downloads a PostgreSQL JDBC driver JAR file and stores it in the mounted volume, which the JDBC input in the pipeline configuration then uses.

    spec:
      podTemplate:
        spec:
          initContainers:
            - name: download-postgres
              command: ["/bin/sh"]
              args: ["-c", "curl -o /data/postgresql.jar -L https://jdbc.postgresql.org/download/postgresql-42.6.0.jar"]
              volumeMounts:
                - name: workdir
                  mountPath: /data
          containers:
            - name: logstash
              volumeMounts:
                - name: workdir
                  mountPath: /usr/share/logstash/jars
      volumeClaimTemplates:
        - metadata:
            name: workdir
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Mi
      pipelines:
        - pipeline.id: main
          config.string: |
            input {
              jdbc {
                jdbc_driver_library => "/usr/share/logstash/jars/postgresql.jar"
                jdbc_driver_class => "org.postgresql.Driver"
              }
            }
Add files using a custom Docker image
This example downloads the same PostgreSQL JDBC driver and adds it to the Logstash classpath in the Docker image.
First, create a Dockerfile based on the Logstash Docker image. Download the JDBC driver, and save it alongside the other JAR files in the Logstash classpath:
    FROM docker.elastic.co/logstash/logstash:8.15.3
    RUN curl -o /usr/share/logstash/logstash-core/lib/jars/postgresql.jar -L https://jdbc.postgresql.org/download/postgresql-42.6.0.jar
Placing the JAR file in this directory, alongside the other Logstash JAR files, adds it to the Logstash classpath.
After you build and deploy the custom image, include it in the Logstash manifest. Check out Create custom images for more details.
    spec:
      count: 1
      version: 8.15.3
      image: <CUSTOM_IMAGE>
      pipelines:
        - pipeline.id: main
          config.string: |
            input {
              jdbc {
                jdbc_driver_class => "org.postgresql.Driver"
                # Remainder of plugin configuration goes here
              }
            }

Notes on this manifest:
- The correct version is required, as ECK reasons about available APIs and capabilities based on the version field.
- When you place the JAR file on the Logstash classpath, you do not need to specify the driver library location in the plugin configuration, which is why this example has no jdbc_driver_library setting.
Writable storage
Some Logstash plugins need access to writable storage.
This could be for checkpointing to keep track of events already processed, a place to temporarily write events before sending a batch of events, or a destination for events written to disk, as with logstash-output-file.
By default, Logstash on ECK supplies a small 1.5 GiB (gibibyte) persistent volume to each pod.
This volume is called logstash-data and is located at /usr/share/logstash/data, which is the default location for most plugin use cases.
This volume is stable across restarts of Logstash pods and is suitable for many use cases.
When plugins use writable storage, each plugin must store its data in a dedicated folder or file to avoid overwriting data.
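As a minimal sketch of the dedicated-folder rule, the following manifest fragment has logstash-output-file write events to its own folder on the default logstash-data volume. The folder name main/file-output and the file name pattern are assumptions chosen for illustration, not values required by ECK.

```yaml
spec:
  pipelines:
    - pipeline.id: main
      config.string: |
        input { beats { port => 5044 } }
        output {
          file {
            # Hypothetical dedicated folder for this plugin on the logstash-data volume
            path => "/usr/share/logstash/data/main/file-output/events-%{+YYYY-MM-dd}.log"
          }
        }
```

Giving each plugin instance its own subfolder keeps its files separate from checkpoint or staging data written by other plugins on the same volume.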
Checkpointing
Some Logstash plugins need to write "checkpoints" to local storage in order to keep track of events that have already been processed. Plugins that retrieve data from external sources need to do this if the external source does not provide any mechanism to track state internally.
In the plugin documentation, look for configurations that call for a path, with settings such as sincedb, sincedb_path, sequence_path, or last_run_metadata_path. Check out specific plugin documentation in the Logstash Reference for details.
    spec:
      pipelines:
        - pipeline.id: main
          config.string: |
            input {
              jdbc {
                jdbc_driver_library => "/usr/share/logstash/jars/postgresql.jar"
                jdbc_driver_class => "org.postgresql.Driver"
                last_run_metadata_path => "/usr/share/logstash/data/main/logstash_jdbc_last_run"
              }
            }
If you are using more than one plugin of the same type, specify a unique location for each plugin to use.
If the default logstash-data volume is insufficient for your needs, see the volume section for details on how to add additional volumes.
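One possible way to get more space is sketched below. It assumes that declaring a volumeClaimTemplate named logstash-data replaces the default 1.5 GiB claim. The 10Gi size is an illustrative value, and you should confirm the exact behavior against the ECK volume documentation for your version.

```yaml
spec:
  volumeClaimTemplates:
    - metadata:
        # Assumption: reusing the default name overrides the default claim
        name: logstash-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi   # illustrative size
```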
Writable staging or temporary data
Some Logstash plugins write data to a staging directory or file before processing it for input or sending it to its final destination. Often these staging folders can be persisted across restarts to avoid duplicate processing of data.
In the plugin documentation, look for names such as tmp_directory, temporary_directory, or staging_directory.
To persist data across pod restarts, set this value to point to the default logstash-data volume or your own PersistentVolumeClaim.
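For instance, logstash-output-s3 exposes a temporary_directory setting. The fragment below is a sketch that points it at a folder on the default logstash-data volume. The bucket name, region, and folder name are hypothetical placeholders.

```yaml
spec:
  pipelines:
    - pipeline.id: main
      config.string: |
        input { beats { port => 5044 } }
        output {
          s3 {
            bucket => "my-example-bucket"   # hypothetical bucket
            region => "us-east-1"           # hypothetical region
            # Staging folder on the persistent logstash-data volume
            temporary_directory => "/usr/share/logstash/data/main/s3-staging"
          }
        }
```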
Scaling Logstash on ECK
The use of autoscalers, such as the HorizontalPodAutoscaler or the VerticalPodAutoscaler, with Logstash on ECK is not yet supported.
Logstash scalability is highly dependent on the plugins in your pipelines. Some plugins can restrict how you can scale out your Logstash deployment, based on the way that the plugins gather or enrich data.
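Plugin categories that require special considerations are:
- Filter plugins: aggregating filters
- Input plugins: events pushed to Logstash
- Input plugins: Logstash maintains state
- Input plugins: external source stores state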
If the pipeline does not contain any plugins from these categories, you can increase the number of Logstash instances by setting the count property in the Logstash resource:

    apiVersion: logstash.k8s.elastic.co/v1alpha1
    kind: Logstash
    metadata:
      name: quickstart
    spec:
      version: 8.15.3
      count: 3
Filter plugins: aggregating filters
Logstash installations that use aggregating filters should be treated with particular care:
- They must specify pipeline.workers=1 for any pipelines that use them.
- The number of pods cannot be scaled above 1.
Examples of aggregating filters include logstash-filter-aggregate, logstash-filter-csv when autodetect_column_names is set to true, and any logstash-filter-ruby implementations that perform aggregations.
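A manifest fragment that follows both constraints might look like the sketch below. The transaction_id field and the counting logic in the aggregate filter are hypothetical and only illustrate where the settings go.

```yaml
spec:
  count: 1                     # aggregating filters: keep a single pod
  pipelines:
    - pipeline.id: main
      pipeline.workers: 1      # a single worker, as required above
      config.string: |
        input { beats { port => 5044 } }
        filter {
          aggregate {
            # Hypothetical correlation field and aggregation code
            task_id => "%{transaction_id}"
            code => "map['event_count'] ||= 0; map['event_count'] += 1"
          }
        }
```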
Input plugins: events pushed to Logstash
Logstash installations with inputs that enable Logstash to receive data should be able to scale freely and have load spread across them horizontally.
These plugins include logstash-input-beats, logstash-input-elastic_agent, logstash-input-tcp, and logstash-input-http.
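As a rough sketch, a push-based pipeline can be scaled out by raising count and exposing the input port through a Service so that senders reach all pods. The services block below reflects one assumed shape of the Logstash resource; the service name and port naming are illustrative and should be checked against the ECK reference for your version.

```yaml
spec:
  count: 3
  pipelines:
    - pipeline.id: main
      config.string: |
        input { beats { port => 5044 } }
  services:
    - name: beats              # illustrative service name
      service:
        spec:
          type: ClusterIP
          ports:
            - name: beats
              port: 5044
              protocol: TCP
              targetPort: 5044
```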
Input plugins: Logstash maintains state
Logstash installations that use input plugins that retrieve data from an external source, and either maintain local checkpoint state or would require some level of coordination between nodes to split up work, can specify pipeline.workers freely, but should keep the pod count at 1 for each Logstash installation.
Note that plugins that retrieve data from external sources, and require some level of coordination between nodes to split up work, are not good candidates for scaling horizontally, and would likely produce some data duplication.
Input plugins that include configuration settings such as sincedb, checkpoint, or sql_last_run_metadata may fall into this category.
Examples of these plugins include logstash-input-jdbc (which has no automatic way to split queries across Logstash instances), logstash-input-s3 (which has no way to split which buckets to read across Logstash instances), and logstash-input-file.
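The sketch below shows this pattern with logstash-input-s3: the pod count stays at 1 while pipeline.workers is set freely, and the sincedb file lives on the persistent logstash-data volume. The bucket name, region, worker count, and sincedb location are hypothetical.

```yaml
spec:
  count: 1                     # one pod, because state is kept locally
  pipelines:
    - pipeline.id: main
      pipeline.workers: 4      # illustrative; workers can be set freely
      config.string: |
        input {
          s3 {
            bucket => "my-example-bucket"   # hypothetical bucket
            region => "us-east-1"           # hypothetical region
            sincedb_path => "/usr/share/logstash/data/main/s3-input.sincedb"
          }
        }
```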
Input plugins: external source stores state
Logstash installations that use input plugins that retrieve data from an external source, and rely on the external source to store state, can scale based on the parameters of the external source.
For example, a Logstash installation that uses a logstash-input-kafka plugin to retrieve data can scale the number of pods up to the number of partitions used, as a partition can have at most one consumer belonging to the same consumer group.
Any pods created beyond that threshold cannot be scheduled to receive data.
Examples of these plugins include logstash-input-kafka, logstash-input-azure_event_hubs, and logstash-input-kinesis.
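To make the partition rule concrete, the fragment below assumes a hypothetical topic named app-logs with three partitions, so count is set to 3. The broker address, topic, and consumer group name are placeholders.

```yaml
spec:
  count: 3                     # matches the assumed 3 partitions of the topic
  pipelines:
    - pipeline.id: main
      config.string: |
        input {
          kafka {
            bootstrap_servers => "kafka.example.svc:9092"   # hypothetical broker
            topics => ["app-logs"]                          # hypothetical topic
            # All pods share one consumer group, so partitions are divided among them
            group_id => "logstash-eck"
          }
        }
```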
Plugin-specific considerations
Some plugins have additional requirements and guidelines for optimal performance in a Logstash ECK environment.
Use these guidelines in addition to the general guidelines provided in Scaling Logstash on ECK.
Logstash integration plugin
When your pipeline uses the Logstash integration plugin, add keepalive=>false to the logstash-output definition to ensure that load balancing works correctly rather than keeping affinity to the same pod.
Elasticsearch output plugin
The elasticsearch output plugin requires certain roles to be configured in order to enable Logstash to communicate with Elasticsearch.
You can customize roles in Elasticsearch. Check out creating custom roles.

    kind: Secret
    apiVersion: v1
    metadata:
      name: my-roles-secret
    stringData:
      roles.yml: |-
        eck_logstash_user_role:
          "cluster": ["monitor", "manage_ilm", "read_ilm", "manage_logstash_pipelines", "manage_index_templates", "cluster:admin/ingest/pipeline/get"]
          "indices": [
            {
              "names": [ "logstash", "logstash-*", "ecs-logstash", "ecs-logstash-*", "logs-*", "metrics-*", "synthetics-*", "traces-*" ],
              "privileges": ["manage", "write", "create_index", "read", "view_index_metadata"]
            }
          ]
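For the role definitions to take effect, the Secret has to be referenced from the Elasticsearch resource. A minimal sketch follows, assuming an Elasticsearch cluster named quickstart managed by ECK and the spec.auth.roles field of the Elasticsearch resource. The cluster name and node set are illustrative.

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart             # hypothetical cluster name
spec:
  version: 8.15.3
  auth:
    roles:
      # Secret that contains the roles.yml entry
      - secretName: my-roles-secret
  nodeSets:
    - name: default
      count: 1
```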
Elastic_integration filter plugin
The elastic_integration filter plugin allows the use of ElasticsearchRef and environment variables.

    elastic_integration {
      pipeline_name => "logstash-pipeline"
      hosts => [ "${ECK_ES_HOSTS}" ]
      username => "${ECK_ES_USER}"
      password => "${ECK_ES_PASSWORD}"
      ssl_certificate_authorities => "${ECK_ES_SSL_CERTIFICATE_AUTHORITY}"
    }

The elastic_integration filter requires certain roles to be configured on the Elasticsearch cluster to enable Logstash to read ingest pipelines.

    # Sample role definition
    kind: Secret
    apiVersion: v1
    metadata:
      name: my-roles-secret
    stringData:
      roles.yml: |-
        eck_logstash_user_role:
          cluster: [ "monitor", "manage_index_templates", "read_pipeline"]
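The ECK_ES_* variables used by the filter come from an elasticsearchRefs entry on the Logstash resource. The sketch below assumes an Elasticsearch cluster named quickstart and a clusterName of eck, on the assumption that ECK derives the variable prefix from the upper-cased clusterName. Both names are illustrative.

```yaml
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  name: quickstart
spec:
  version: 8.15.3
  count: 1
  elasticsearchRefs:
    - name: quickstart         # hypothetical Elasticsearch resource name
      clusterName: eck         # assumed source of the ECK_ES_* variable prefix
  pipelines:
    - pipeline.id: main
      config.string: |
        input { beats { port => 5044 } }
        filter {
          elastic_integration {
            hosts => [ "${ECK_ES_HOSTS}" ]
            username => "${ECK_ES_USER}"
            password => "${ECK_ES_PASSWORD}"
            ssl_certificate_authorities => "${ECK_ES_SSL_CERTIFICATE_AUTHORITY}"
          }
        }
```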
Elastic Agent input and Beats input plugins
When you use the Elastic Agent input or the Beats input, set the ttl value on the Agent or Beat to ensure that load is distributed appropriately.
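On the sending side, a Filebeat configuration along these lines is one way to apply the setting. The host name and the 60s value are assumptions for illustration. The Beats documentation notes that ttl is not supported on asynchronous (pipelined) connections, which is why pipelining is disabled here.

```yaml
# filebeat.yml (sender side), a sketch with placeholder values
output.logstash:
  hosts: ["logstash-beats.example.svc:5044"]   # hypothetical Logstash service address
  ttl: 60s          # re-establish the connection periodically so load can rebalance
  pipelining: 0     # ttl is not supported with pipelining enabled
```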
Adding custom plugins
If you need plugins in addition to those included in the standard Logstash distribution, you can add them.
Create a custom Docker image that includes the installed plugins, using the bin/logstash-plugin install utility to add more plugins to the image so that they can be used by Logstash pods.
This sample Dockerfile installs the logstash-filter-tld plugin to the official Logstash Docker image:

    FROM docker.elastic.co/logstash/logstash:8.15.3
    RUN bin/logstash-plugin install logstash-filter-tld

After you build and deploy the custom image (refer to Create custom images for more details), include it in the Logstash manifest.
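A manifest that uses such an image could look like the following sketch, modeled on the custom image example for the JDBC driver. <CUSTOM_IMAGE> stands for the image you built, and the pipeline contents are illustrative.

```yaml
spec:
  count: 1
  version: 8.15.3              # must match the Logstash version in the custom image
  image: <CUSTOM_IMAGE>
  pipelines:
    - pipeline.id: main
      config.string: |
        input { beats { port => 5044 } }
        filter {
          # Plugin installed in the custom image
          tld { }
        }
```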