Elasticsearch input plugin
- Plugin version: v4.3.3
- Released on: 2019-11-12
- Changelog
For other versions, see the Versioned plugin docs.
Installation
For plugins not bundled by default, install the plugin by running bin/logstash-plugin install logstash-input-elasticsearch. See Working with plugins for more details.
Getting Help
For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue on GitHub. For the list of Elastic supported plugins, consult the Elastic Support Matrix.
Description
Compatibility Note
Starting with Elasticsearch 5.3, there’s an HTTP setting called http.content_type.required. If this option is set to true, and you are using Logstash 2.4 through 5.2, you need to update the Elasticsearch input plugin to version 4.0.2 or higher.
Read from an Elasticsearch cluster, based on search query results.
This is useful for replaying test logs, reindexing, etc.
You can schedule ingestion periodically using a cron syntax (see the schedule setting) or run the query one time to load data into Logstash.
Example:
input {
  # Read all documents from Elasticsearch matching the given query
  elasticsearch {
    hosts => "localhost"
    query => '{ "query": { "match": { "statuscode": 200 } }, "sort": [ "_doc" ] }'
  }
}
This would create an Elasticsearch query with the following format:
curl 'http://localhost:9200/logstash-*/_search?&scroll=1m&size=1000' -d '{ "query": { "match": { "statuscode": 200 } }, "sort": [ "_doc" ] }'
Scheduling
Input from this plugin can be scheduled to run periodically according to a specific schedule. This scheduling syntax is powered by rufus-scheduler. The syntax is cron-like with some extensions specific to Rufus (e.g. timezone support).
Examples:
Schedule | Behavior |
---|---|
"* 5 * 1-3 *" | will execute every minute of 5am every day of January through March. |
"0 * * * *" | will execute on the 0th minute of every hour every day. |
"0 6 * * * America/Chicago" | will execute at 6:00am (UTC/GMT -5) every day. |
Further documentation describing this syntax can be found here.
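As a minimal sketch of a scheduled input, the following assumes a local cluster and reuses the query from the Description example. The cron expression "0 * * * *" (run on the 0th minute of every hour) is an illustrative choice, not a recommendation.

```
input {
  elasticsearch {
    hosts    => "localhost"
    query    => '{ "query": { "match": { "statuscode": 200 } }, "sort": [ "_doc" ] }'
    # Assumed schedule: re-run the query at the top of every hour
    schedule => "0 * * * *"
  }
}
```

Without the schedule setting, the same input would run the query exactly once.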
Elasticsearch Input Configuration Options
This plugin supports the following configuration options plus the Common Options described later.
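Setting | Input type | Required |
---|---|---|
ca_file | a valid filesystem path | No |
docinfo | boolean | No |
docinfo_fields | array | No |
docinfo_target | string | No |
hosts | array | No |
index | string | No |
password | password | No |
query | string | No |
schedule | string | No |
scroll | string | No |
size | number | No |
slices | number | No |
ssl | boolean | No |
user | string | No |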
Also see Common Options for a list of options supported by all input plugins.
ca_file
- Value type is path
- There is no default value for this setting.
SSL Certificate Authority file in PEM-encoded format. It must also include any chain certificates as necessary.
docinfo
- Value type is boolean
- Default value is false
If set, include Elasticsearch document information such as index, type, and the id in the event.
A note on metadata: if you are ingesting documents with the intent to re-index them (or just update them), the action option in the elasticsearch output needs to know how to handle them. It can be assigned dynamically from a field added to the metadata.
Example
input {
  elasticsearch {
    hosts => "es.production.mysite.org"
    index => "mydata-2018.09.*"
    query => '{ "query": { "query_string": { "query": "*" } } }'
    size => 500
    scroll => "5m"
    docinfo => true
  }
}
output {
  elasticsearch {
    index => "copy-of-production.%{[@metadata][_index]}"
    document_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
  }
}
Starting with Logstash 6.0, the document_type option is deprecated due to the removal of types in Logstash 6.0. It will be removed in the next major version of Logstash.
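To illustrate the note about the action option, here is a hedged sketch in which a filter writes the desired action into the metadata and the elasticsearch output reads it back with a sprintf reference. The metadata field name [@metadata][action], the "update" value, and the host names are assumptions chosen for illustration.

```
input {
  elasticsearch {
    hosts   => "localhost"
    docinfo => true
  }
}
filter {
  # Hypothetical rule: mark every event as an update of its source document
  mutate {
    add_field => { "[@metadata][action]" => "update" }
  }
}
output {
  elasticsearch {
    hosts       => "localhost"
    index       => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
    # The action is resolved per event from the metadata field set above
    action      => "%{[@metadata][action]}"
  }
}
```

In a real pipeline the filter would typically set the action conditionally, so that different events can be indexed, updated, or deleted.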
docinfo_fields
- Value type is array
- Default value is ["_index", "_type", "_id"]
If document metadata storage is requested by enabling the docinfo option, this option lists the metadata fields to save in the current event. See Meta-Fields in the Elasticsearch documentation for more information.
docinfo_target
- Value type is string
- Default value is "@metadata"
If document metadata storage is requested by enabling the docinfo option, this option names the field under which to store the metadata fields as subfields.
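The three docinfo settings work together, which a short sketch may make clearer. The target field name es_source and the reduced field list are hypothetical choices, and a local cluster is assumed.

```
input {
  elasticsearch {
    hosts          => "localhost"
    docinfo        => true
    # Hypothetical target: store document metadata under [es_source]
    docinfo_target => "es_source"
    # Keep only the index name and the document id
    docinfo_fields => ["_index", "_id"]
  }
}
output {
  elasticsearch {
    hosts       => "localhost"
    index       => "copy-of-%{[es_source][_index]}"
    document_id => "%{[es_source][_id]}"
  }
}
```

Unlike the default "@metadata" target, a regular field such as es_source is part of the event and is therefore written to the output along with the rest of the document.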
hosts
- Value type is array
- There is no default value for this setting.
List of one or more Elasticsearch hosts to use for querying. Each host can be either IP, HOST, IP:port, or HOST:port. The port defaults to 9200.
index
- Value type is string
- Default value is "logstash-*"
The index or alias to search. See the Multi Indices section of the Elasticsearch documentation for more information on how to reference multiple indices.
password
- Value type is password
- There is no default value for this setting.
The password to use together with the username in the user option when authenticating to the Elasticsearch server. If set to an empty string, authentication will be disabled.
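A hedged sketch of an authenticated connection follows. The host, user name, certificate path, and the ES_PASSWORD environment variable are all placeholders. The ssl setting is a boolean option of this plugin that is not described in detail in this section, and it is assumed here to be required for the CA file to take effect.

```
input {
  elasticsearch {
    hosts    => "es.example.org:9200"          # placeholder host
    user     => "logstash_reader"              # placeholder user name
    # Placeholder: resolved from an environment variable at startup
    password => "${ES_PASSWORD}"
    ssl      => true
    ca_file  => "/etc/logstash/certs/ca.pem"   # placeholder path to a PEM file
  }
}
```

Reading the password from the environment keeps the secret out of the pipeline file itself.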
query
- Value type is string
- Default value is '{ "sort": [ "_doc" ] }'
The query to be executed. Read the Elasticsearch query DSL documentation for more information.
schedule
- Value type is string
- There is no default value for this setting.
Schedule of when to periodically run the query, in cron format. For example, "* * * * *" executes the query every minute, on the minute.
There is no schedule by default. If no schedule is given, the query is run exactly once.
scroll
- Value type is string
- Default value is "1m"
This parameter controls the keepalive time of the scrolling request, given as an Elasticsearch time value (for example, "1m" for one minute), and initiates the scrolling process. The timeout applies per round trip (i.e. from one scroll request to the next).
size
- Value type is number
- Default value is 1000
This allows you to set the maximum number of hits returned per scroll.
slices
- Value type is number
- There is no default value.
- Sensible values range from 2 to about 8.
In some cases, it is possible to improve overall throughput by consuming multiple distinct slices of a query simultaneously using the Sliced Scroll API, especially if the pipeline is spending significant time waiting on Elasticsearch to provide results.
If set, the slices parameter tells the plugin how many slices to divide the work into, and the plugin will produce events from the slices in parallel until all of them are done scrolling.
The Elasticsearch manual indicates that there can be negative performance implications to both the query and the Elasticsearch cluster when a scrolling query uses more slices than shards in the index.
If the slices parameter is left unset, the plugin will not inject slice instructions into the query.
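As an illustration, a sliced read might be configured as in the sketch below. The index pattern is borrowed from the docinfo example, and the value of 4 slices assumes the target indices have at least four shards; both are assumptions to adjust for a real cluster.

```
input {
  elasticsearch {
    hosts  => "localhost"
    index  => "mydata-2018.09.*"
    query  => '{ "query": { "match_all": {} }, "sort": [ "_doc" ] }'
    # Assumed value: four parallel slices, within the suggested 2 to 8 range
    slices => 4
    scroll => "5m"
    size   => 500
  }
}
```

Each slice runs its own scroll, so the scroll and size settings apply to every slice individually.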
Common Options
The following configuration options are supported by all input plugins:
Details
codec
- Value type is codec
- Default value is "json"
The codec used for input data. Input codecs are a convenient method for decoding your data before it enters the input, without needing a separate filter in your Logstash pipeline.
enable_metric
- Value type is boolean
- Default value is true
Disable or enable metric logging for this specific plugin instance. By default, all available metrics are recorded, but you can disable metrics collection for a specific plugin.
id
- Value type is string
- There is no default value for this setting.
Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, if you have two elasticsearch inputs. Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.
input {
  elasticsearch {
    id => "my_plugin_id"
  }
}
tags
- Value type is array
- There is no default value for this setting.
Add any number of arbitrary tags to your event.
This can help with processing later.
type
- Value type is string
- There is no default value for this setting.
Add a type field to all events handled by this input.
Types are used mainly for filter activation.
The type is stored as part of the event itself, so you can also use the type to search for it in Kibana.
If you try to set a type on an event that already has one (for example when you send an event from a shipper to an indexer) then a new input will not override the existing type. A type set at the shipper stays with that event for its life even when sent to another Logstash server.
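The following sketch shows how tags and type might be combined with a conditional filter. The type value es_replay, the tag replayed, and the added field are hypothetical names used only for illustration.

```
input {
  elasticsearch {
    hosts => "localhost"
    # Hypothetical labels attached to every event from this input
    type  => "es_replay"
    tags  => ["replayed"]
  }
}
filter {
  # Filter activation: only events from the input above match this condition
  if [type] == "es_replay" {
    mutate {
      add_field => { "ingest_source" => "elasticsearch" }
    }
  }
}
```

Events that already carry a type when they reach this input keep their original value, so the conditional would not match them.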