Parse data using ingest node pipelines

You can configure APM Server to use an ingest node to pre-process documents before indexing them in Elasticsearch.

A pipeline defines a series of processors that operate on your data in the order listed. For example, a pipeline can use one processor to remove a field and another to rename a field. Using pipelines involves two steps:

  1. Register the pipeline in Elasticsearch.
  2. Apply the pipeline during data ingestion.
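
As a rough illustration of the remove-and-rename example mentioned above, the following Python sketch builds such a definition and prints it as JSON. The field names internal_debug, hostname, and server_name are hypothetical placeholders, not fields APM Server is known to emit, and the helper name build_example_pipeline is made up for this example; remove and rename are standard Elasticsearch ingest processors.

```python
import json


def build_example_pipeline():
    """Return a pipeline definition with one remove and one rename processor."""
    return {
        "description": "Example: drop one field, rename another",
        "processors": [
            # Processors run in the order they are listed.
            {"remove": {"field": "internal_debug"}},  # hypothetical field
            {"rename": {"field": "hostname", "target_field": "server_name"}},  # hypothetical fields
        ],
    }


if __name__ == "__main__":
    # Print the definition in the JSON form that Elasticsearch expects.
    print(json.dumps(build_example_pipeline(), indent=2))
```

The printed JSON has the same shape as a pipeline definition file and could be saved and registered like any other definition.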

Register pipelines in Elasticsearch

To register a pipeline in Elasticsearch, you can either configure APM Server to register pipelines on startup, or you can manually upload a pipeline definition.

Register pipelines on APM Server startup

Automatic pipeline registration requires output.elasticsearch to be enabled and configured.

Navigate to the APM Server home directory and find the default pipeline configuration at ingest/pipeline/definition.json. To add, change, or remove pipelines in Elasticsearch, edit the definitions in this file, then restart APM Server or run apm-server setup --pipelines.

By default, pipeline registration is disabled. See how to configure pipeline registration.

Manually upload pipeline definitions

You can manually upload pipeline definitions by describing them in a file. Consider the following sample pipeline in a file named pipeline.json. This pipeline definition converts the value of beat.name to lowercase before indexing each document.

{
    "description": "Test pipeline",
    "processors": [
        {
            "lowercase": {
                "field": "beat.name"
            }
        }
    ]
}
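
To make the effect of this definition concrete, here is a small Python sketch that imitates, outside Elasticsearch, what lowercasing the beat.name field does to a document. It is only a local approximation for illustration: the helper name apply_lowercase and the sample document are hypothetical, and the real processor runs inside the ingest node.

```python
import copy


def apply_lowercase(doc, field_path):
    """Return a copy of doc with the string at the dotted field_path lowercased."""
    result = copy.deepcopy(doc)  # leave the caller's document untouched
    *parents, leaf = field_path.split(".")
    node = result
    for key in parents:
        node = node[key]  # raises KeyError if the path is missing
    node[leaf] = node[leaf].lower()
    return result


if __name__ == "__main__":
    sample = {"beat": {"name": "APM-Server-01"}}  # hypothetical document
    print(apply_lowercase(sample, "beat.name"))
    # prints {'beat': {'name': 'apm-server-01'}}
```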

To register this pipeline, run:

curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_ingest/pipeline/test-pipeline' -d @pipeline.json
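
If you prefer to script the upload, the same request can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the function names build_put_pipeline_request and register_pipeline are hypothetical, and the base URL assumes the same local Elasticsearch node as the curl command above.

```python
import json
import urllib.request


def build_put_pipeline_request(pipeline_id, definition, base_url="http://localhost:9200"):
    """Build (but do not send) the PUT request that registers a pipeline."""
    return urllib.request.Request(
        url=f"{base_url}/_ingest/pipeline/{pipeline_id}",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def register_pipeline(pipeline_id, definition, base_url="http://localhost:9200"):
    """Send the request; requires a reachable Elasticsearch node at base_url."""
    request = build_put_pipeline_request(pipeline_id, definition, base_url)
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    # Same content as the sample pipeline.json shown earlier.
    definition = {
        "description": "Test pipeline",
        "processors": [{"lowercase": {"field": "beat.name"}}],
    }
    request = build_put_pipeline_request("test-pipeline", definition)
    # Only inspect the request here; call register_pipeline(...) to actually send it.
    print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it possible to check the method, URL, and body before anything reaches the cluster.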

Apply pipelines during data ingestion

To specify which pipelines to apply during data ingestion, add the pipeline IDs to the pipelines option under output.elasticsearch in the apm-server.yml file:

output.elasticsearch:
  pipelines:
  - pipeline: "test-pipeline"

More information on defining a pre-processing pipeline is available in the ingest node documentation.