Parse data using ingest node pipelines

edit

With the Elasticsearch output you can configure the APM Server to use ingest node to pre-process documents before indexing them. A pipeline defines processors that operate on your data. For example, a pipeline can define one processor to remove a field and another to rename a field.

Using pipelines involves two steps. First you need to register a pipeline in Elasticsearch. Then the pipeline needs to be applied during data ingestion.

Registering Pipelines in Elasticsearch

edit

For registering a pipeline in Elasticsearch you can either manually upload a pipeline definition or you can configure APM Server to register pipelines on startup.

Register pipelines on APM Server startup

edit

Automatic pipeline registration requires output.elasticsearch to be enabled and configured, as well as having the required Elasticsearch plugins installed.

APM Server default pipelines require you to have the Ingest user agent plugin installed. If pipelines are enabled but required plugins are missing, the APM Server will not be able to properly connect to the Elasticsearch output. Find the default pipeline configuration at ingest/pipeline/definition.json of your APM Server installation’s home directory. In case you want to add, change or remove pipelines in Elasticsearch, you can simply change the definitions in this file and restart your APM Server or run apm-server setup --pipelines

By default pipeline registration is disabled. See how to configure pipeline registration.

Manually upload pipeline definitions

edit

For example, consider the following pipeline in a file named pipeline.json:

{
    "description": "Test pipeline",
    "processors": [
        {
            "lowercase": {
                "field": "beat.name"
            }
        }
    ]
}

To register the pipeline run:

curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_ingest/pipeline/test-pipeline' -d @pipeline.json

This pipeline definition converts the value of beat.name to lowercase before indexing each document.

Apply pipelines during data ingestion

edit

To specify which pipelines to apply during data ingestion, add the pipeline IDs in the pipelines option under elasticsearch in the apm-server.yml file:

output.elasticsearch:
  pipelines:
  - pipeline: "test-pipeline"

For more information about defining a pre-processing pipeline, see the Ingest Node documentation.