Ingest node

Use an ingest node to pre-process documents before they are indexed. The ingest node intercepts bulk and index requests, applies transformations, and then passes the documents back to the index or bulk APIs.

All nodes enable ingest by default, so any node can handle ingest tasks. To create a dedicated ingest node, configure the node.roles setting in elasticsearch.yml as follows:

node.roles: [ ingest ]

To disable ingest for a node, specify the node.roles setting and exclude ingest from the listed roles.

To pre-process documents before indexing, define a pipeline that specifies a series of processors. Each processor transforms the document in a specific way, and the processors run in the order in which they are listed. For example, a pipeline might have one processor that removes a field from the document, followed by another processor that renames a field. Configured pipelines are stored in the cluster state.
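The idea of a pipeline as an ordered series of processors can be illustrated with a small, self-contained Python model. This is only a sketch of the concept, not how Elasticsearch implements it: the `remove`, `rename`, and `run_pipeline` helpers and the field names are hypothetical, and processor options and error handling are left out.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Processor = Callable[[Document], Document]


def remove(field: str) -> Processor:
    """Return a processor that drops `field` from the document, if present."""
    def apply(doc: Document) -> Document:
        return {k: v for k, v in doc.items() if k != field}
    return apply


def rename(field: str, target_field: str) -> Processor:
    """Return a processor that moves the value of `field` to `target_field`."""
    def apply(doc: Document) -> Document:
        out = dict(doc)
        if field in out:
            out[target_field] = out.pop(field)
        return out
    return apply


def run_pipeline(processors: List[Processor], doc: Document) -> Document:
    """Apply each processor in list order, feeding each result to the next."""
    for processor in processors:
        doc = processor(doc)
    return doc


# Hypothetical pipeline: remove one field, then rename another.
pipeline = [remove("debug"), rename("msg", "message")]
result = run_pipeline(pipeline, {"msg": "hello", "debug": True})
print(result)  # {'message': 'hello'}
```

Because each processor receives the output of the previous one, the order of the list matters: renaming a field before removing it by its old name would leave the renamed field in place.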

To use a pipeline, specify the pipeline parameter on an index or bulk request so that the ingest node knows which pipeline to apply.

For example, create a pipeline:

PUT _ingest/pipeline/my_pipeline_id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "new"
      }
    }
  ]
}

Index a document with the defined pipeline:

PUT my-index-00001/_doc/my-id?pipeline=my_pipeline_id
{
  "foo": "bar"
}

Response:

{
  "_index" : "my-index-00001",
  "_type" : "_doc",
  "_id" : "my-id",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
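
The example above uses the index API. The same pipeline parameter can be passed on a bulk request, and the following Python sketch shows one way a client might build such a request with the standard library only. The base URL, the `build_bulk_request` helper, and the sample documents are assumptions for illustration. The request is constructed but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values for illustration; adjust for your cluster.
BASE_URL = "http://localhost:9200"
PIPELINE_ID = "my_pipeline_id"


def build_bulk_request(index: str, docs: list, pipeline: str) -> Request:
    """Build (but do not send) a bulk request that routes docs through `pipeline`."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # source line
    # A bulk body is newline-delimited JSON and must end with a newline.
    body = ("\n".join(lines) + "\n").encode("utf-8")
    url = f"{BASE_URL}/_bulk?{urlencode({'pipeline': pipeline})}"
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/x-ndjson"})


request = build_bulk_request(
    "my-index-00001", [{"foo": "bar"}, {"foo": "baz"}], PIPELINE_ID
)
print(request.full_url)
print(request.data.decode("utf-8"), end="")
```

Passing the resulting object to `urllib.request.urlopen` would send it to a running cluster. Every document in the bulk body would then go through `my_pipeline_id` before being indexed.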

An index can also declare a default pipeline, through the index.default_pipeline setting, that is used when a request does not specify the pipeline parameter.

Finally, an index can declare a final pipeline, through the index.final_pipeline setting, that is executed after the request pipeline or default pipeline, if either applies.
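
The order in which the request, default, and final pipelines apply can be summarized in a short Python sketch. It only restates the rules described above. The `pipelines_to_run` function and the pipeline names are hypothetical, and special values and error cases are not modeled.

```python
from typing import List, Optional


def pipelines_to_run(request_pipeline: Optional[str],
                     default_pipeline: Optional[str],
                     final_pipeline: Optional[str]) -> List[str]:
    """Return pipeline IDs in the order the text describes."""
    order = []
    # The request's pipeline parameter wins; the default applies only when it is absent.
    first = request_pipeline if request_pipeline is not None else default_pipeline
    if first is not None:
        order.append(first)
    # The final pipeline, when declared, always runs last.
    if final_pipeline is not None:
        order.append(final_pipeline)
    return order


print(pipelines_to_run(None, "my_default", "my_final"))
# ['my_default', 'my_final']
print(pipelines_to_run("my_pipeline_id", "my_default", "my_final"))
# ['my_pipeline_id', 'my_final']
print(pipelines_to_run(None, None, None))
# []
```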

See Ingest APIs for more information about creating, adding, and deleting pipelines.