Parse data using ingest node pipelines
You can configure APM Server to use an ingest node to pre-process documents before indexing them in Elasticsearch.
A pipeline is a definition of a series of processors that operate on your data. For example, a pipeline can define one processor to remove a field, and another to rename a field.
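As a rough illustration of the remove-and-rename example above, the following Python sketch assembles such a pipeline definition as a dictionary and serializes it to JSON. The helper build_pipeline and the field names (temp_field, old_name, new_name) are hypothetical and chosen only for illustration; remove and rename are Elasticsearch ingest processors.

```python
import json


def build_pipeline(description, processors):
    """Return a pipeline definition: a description plus an ordered list of processors."""
    return {"description": description, "processors": list(processors)}


# Hypothetical field names, chosen only for illustration.
pipeline = build_pipeline(
    "Remove one field, then rename another",
    [
        {"remove": {"field": "temp_field"}},
        {"rename": {"field": "old_name", "target_field": "new_name"}},
    ],
)

# Processors run in the order they are listed.
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

The printed JSON has the same shape as a pipeline definition file: a description and an ordered list of processors.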
Default ingest pipeline
By default, register.ingest.pipeline.enabled is set to true. This loads the default pipeline definition to Elasticsearch on APM Server startup.
The default pipeline is apm. It adds user agent information to events and processes Geo-IP data, which is especially useful for Elastic’s JavaScript RUM Agent.
You can view the pipeline configuration by navigating to the APM Server’s home directory and then viewing ingest/pipeline/definition.json.
To disable the default pipeline, or any other pipeline, set output.elasticsearch.pipeline: _none.
Custom pipelines
Using custom pipelines involves two steps:
- First, you need to register a pipeline in Elasticsearch.
- Then, the pipeline needs to be applied during data ingestion.
Register pipelines in Elasticsearch
To register a pipeline in Elasticsearch, you can either configure APM Server to register pipelines on startup, or you can manually upload a pipeline definition.
Register pipelines on APM Server startup
Automatic pipeline registration requires output.elasticsearch to be enabled and configured.
Navigate to APM Server’s home directory and find the default pipeline configuration at ingest/pipeline/definition.json.
To add, change, or remove pipelines in Elasticsearch, change the definitions in this file and restart your APM Server or run apm-server setup --pipelines.
By default, pipeline registration is enabled.
Manually upload pipeline definitions
You can manually upload pipeline definitions by describing them in a file.
Consider the following sample pipeline in a file named pipeline.json.
This pipeline definition converts the value of beat.name to lowercase before indexing each document.
{
  "description": "Test pipeline",
  "processors": [
    {
      "lowercase": {
        "field": "beat.name"
      }
    }
  ]
}
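To make the effect of this definition concrete, here is a minimal Python sketch that imitates what the lowercase processor does to a single document. It is illustrative only and is not how Elasticsearch implements the processor; the function lowercase_field and the sample document are hypothetical.

```python
import copy


def lowercase_field(document, dotted_path):
    """Return a copy of `document` with the string at `dotted_path` lowercased.

    Illustrative only: this imitates the effect described above and is not
    Elasticsearch's implementation.
    """
    result = copy.deepcopy(document)
    *parents, leaf = dotted_path.split(".")
    node = result
    for key in parents:
        node = node[key]  # raises KeyError if the path is missing
    node[leaf] = node[leaf].lower()
    return result


# Hypothetical document; only the beat.name field matters here.
doc = {"beat": {"name": "APM-Server-01"}, "processor": {"event": "transaction"}}
processed = lowercase_field(doc, "beat.name")
print(processed["beat"]["name"])
```

Other fields in the document are left unchanged, and the input document itself is not modified.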
To register this pipeline, run:
curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_ingest/pipeline/test-pipeline' -d @pipeline.json
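The same upload can be sketched in Python using only the standard library. This is a minimal sketch that assumes Elasticsearch is reachable at http://localhost:9200 without authentication, as in the curl command above; the helper build_put_pipeline_request is hypothetical. The request is built but not sent, so the snippet runs without a live cluster.

```python
import json
import urllib.request


def build_put_pipeline_request(base_url, pipeline_id, definition):
    """Build (but do not send) a PUT request that registers an ingest pipeline."""
    url = f"{base_url.rstrip('/')}/_ingest/pipeline/{pipeline_id}"
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Same definition as the sample pipeline.json above.
definition = {
    "description": "Test pipeline",
    "processors": [{"lowercase": {"field": "beat.name"}}],
}
request = build_put_pipeline_request(
    "http://localhost:9200", "test-pipeline", definition
)
print(request.get_method(), request.full_url)

# To actually send it against a running Elasticsearch node (assumed setup):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching the cluster.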
Apply pipelines during data ingestion
To specify which pipelines to apply during data ingestion, add the pipeline IDs to the pipelines option under output.elasticsearch in the apm-server.yml file:
output.elasticsearch:
  pipelines:
    - pipeline: "test-pipeline"
More information and examples for applying pipelines are available in the Elasticsearch output pipeline documentation.