Simulate ingest API
Executes ingest pipelines against a set of provided documents, optionally with substitute pipeline definitions. This API is meant to be used for troubleshooting or pipeline development, as it does not actually index any data into Elasticsearch.
response = client.simulate.ingest(
  body: {
    docs: [
      { _index: 'my-index', _id: 'id', _source: { foo: 'bar' } },
      { _index: 'my-index', _id: 'id', _source: { foo: 'rab' } }
    ],
    pipeline_substitutions: {
      "my-pipeline": {
        processors: [
          { set: { field: 'field3', value: 'value3' } }
        ]
      }
    }
  }
)
puts response
POST /_ingest/_simulate
{
  "docs": [
    { "_index": "my-index", "_id": "id", "_source": { "foo": "bar" } },
    { "_index": "my-index", "_id": "id", "_source": { "foo": "rab" } }
  ],
  "pipeline_substitutions": {
    "my-pipeline": {
      "processors": [
        { "set": { "field": "field3", "value": "value3" } }
      ]
    }
  }
}
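This replaces the existing my-pipeline definition with the substitute supplied in pipeline_substitutions, for this request only.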
Request
POST /_ingest/_simulate
GET /_ingest/_simulate
POST /_ingest/<target>/_simulate
GET /_ingest/<target>/_simulate
Prerequisites
- If the Elasticsearch security features are enabled, you must have the index or create index privileges to use this API.
Description
The simulate ingest API simulates ingesting data into an index. It executes the default and final pipeline for that index against a set of documents provided in the body of the request. If a pipeline contains a reroute processor, the API follows that processor to the new index and executes that index’s pipelines as well, the same way a non-simulated ingest would. No data is indexed into Elasticsearch. Instead, the transformed document is returned, along with the list of pipelines that have been executed and the name of the index where the document would have been indexed if this were not a simulation.
This differs from the simulate pipeline API, where you specify a single pipeline and only that pipeline runs. The simulate pipeline API is more useful for developing a single pipeline, while the simulate ingest API is more useful for troubleshooting the interaction of the various pipelines that get applied when ingesting into an index.
By default, the pipeline definitions that are currently in the system are used. However, you can supply substitute pipeline definitions in the body of the request. These will be used in place of the pipeline definitions that are already in the system. This can be used to replace existing pipeline definitions or to create new ones. The pipeline substitutions are only used within this request.
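The substitution rule described above can be pictured as a request-scoped overlay. The following is only a conceptual sketch in Python, not how Elasticsearch implements it: the helper name effective_pipelines is hypothetical, and the contents of the "system" pipelines are assumptions made up for illustration.

```python
import copy


def effective_pipelines(system_pipelines, pipeline_substitutions=None):
    """Return the pipeline definitions one simulated request would see.

    Substitutes win over system definitions with the same ID, and IDs that
    do not exist in the system are simply added. The inputs are copied, so
    the overlay never leaks outside the request.
    """
    effective = copy.deepcopy(system_pipelines)
    effective.update(copy.deepcopy(pipeline_substitutions or {}))
    return effective


# Assumed system state (illustrative only).
system = {
    "my-pipeline": {"processors": [{"set": {"field": "field1", "value": "value1"}}]},
    "my-final-pipeline": {"processors": [{"set": {"field": "field2", "value": "value2"}}]},
}
# One substitute replaces an existing pipeline, the other creates a new one.
substitutions = {
    "my-pipeline": {"processors": [{"uppercase": {"field": "foo"}}]},
    "brand-new-pipeline": {"processors": []},
}

merged = effective_pipelines(system, substitutions)
print(sorted(merged))
```

The system dictionary is left unchanged after the call, which mirrors the statement that substitutions are only used within the request.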
Path parameters
<target>
- (Optional, string) The index to simulate ingesting into. This can be overridden by specifying an index on each document. If you provide a <target> in the request path, it is used for any documents that don’t explicitly specify an index argument.
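The precedence between <target> and a per-document _index can be sketched as a small Python function. The name resolve_index is hypothetical, and returning None when neither value is present is a choice made for this sketch; the text above does not say how the API treats that case.

```python
def resolve_index(doc, target=None):
    """Pick the index a document would be simulated against.

    A document-level _index takes precedence; otherwise the <target>
    from the request path is used as the fallback.
    """
    return doc.get("_index", target)


docs = [
    {"_index": "other-index", "_id": "1", "_source": {"foo": "bar"}},
    {"_id": "2", "_source": {"foo": "rab"}},
]
resolved = [resolve_index(doc, target="my-index") for doc in docs]
print(resolved)
```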
Query parameters
pipeline
- (Optional, string) Pipeline to use as the default pipeline. This can be used to override the default pipeline of the index being ingested into.
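As a rough illustration of how the optional <target> path parameter and the pipeline query parameter combine into a request URL, here is a Python sketch that uses only the standard library. The helper name simulate_path is hypothetical and is not part of any Elasticsearch client.

```python
from urllib.parse import quote, urlencode


def simulate_path(target=None, pipeline=None):
    """Build the request path for the simulate ingest API."""
    if target is None:
        path = "/_ingest/_simulate"
    else:
        # Percent-encode the index name so it is safe as a path segment.
        path = f"/_ingest/{quote(target, safe='')}/_simulate"
    if pipeline is not None:
        path += "?" + urlencode({"pipeline": pipeline})
    return path


print(simulate_path())
print(simulate_path("my-index"))
print(simulate_path("my-index", pipeline="my-pipeline"))
```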
Request body
docs
- (Required, array of objects) Sample documents to test in the pipeline.
  Properties of docs objects:
  _id
  - (Optional, string) Unique identifier for the document.
  _index
  - (Optional, string) Name of the index that the document will be ingested into.
  _source
  - (Required, object) JSON body for the document.
pipeline_substitutions
- (Optional, map of strings to objects) Map of pipeline IDs to substitute pipeline definition objects.
  Properties of pipeline definition objects:
  description
  - (Optional, string) Description of the ingest pipeline.
  on_failure
  - (Optional, array of processor objects) Processors to run immediately after a processor failure.
    Each processor supports a processor-level on_failure value. If a processor without an on_failure value fails, Elasticsearch uses this pipeline-level parameter as a fallback. The processors in this parameter run sequentially in the order specified. Elasticsearch will not attempt to run the pipeline’s remaining processors.
  processors
  - (Required, array of processor objects) Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.
  version
  - (Optional, integer) Version number used by external systems to track ingest pipelines.
    See the if_version parameter of the create or update pipeline API for how the version attribute is used.
  _meta
  - (Optional, object) Optional metadata about the ingest pipeline. May have any contents. This map is not automatically generated by Elasticsearch.
  deprecated
  - (Optional, boolean) Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.
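To make the required and optional fields above concrete, the following Python sketch assembles a request body and checks the fields marked as required (docs, each document's _source, and each substitute pipeline's processors). The function build_simulate_body is a hypothetical helper and the checks are client-side conveniences assumed for this example; Elasticsearch performs its own validation.

```python
import json


def build_simulate_body(docs, pipeline_substitutions=None):
    """Assemble a simulate ingest request body from plain dictionaries."""
    if not isinstance(docs, list):
        raise ValueError("docs is required and must be a list of objects")
    for position, doc in enumerate(docs):
        if not isinstance(doc.get("_source"), dict):
            raise ValueError(f"docs[{position}] is missing the required _source object")

    body = {"docs": docs}
    if pipeline_substitutions:
        for pipeline_id, definition in pipeline_substitutions.items():
            if not isinstance(definition.get("processors"), list):
                raise ValueError(
                    f"substitute pipeline {pipeline_id!r} is missing the required processors array"
                )
        body["pipeline_substitutions"] = pipeline_substitutions
    return body


body = build_simulate_body(
    docs=[{"_index": "my-index", "_id": "123", "_source": {"foo": "bar"}}],
    pipeline_substitutions={"my-pipeline": {"processors": [{"uppercase": {"field": "foo"}}]}},
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of POST /_ingest/_simulate with whichever HTTP or Elasticsearch client is in use.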
Examples
Use pre-existing pipeline definitions
In this example, the index my-index has a default pipeline called my-pipeline and a final pipeline called my-final-pipeline. Since both documents are being ingested into my-index, both pipelines are executed using the pipeline definitions that are already in the system.
response = client.simulate.ingest(
  body: {
    docs: [
      { _index: 'my-index', _id: '123', _source: { foo: 'bar' } },
      { _index: 'my-index', _id: '456', _source: { foo: 'rab' } }
    ]
  }
)
puts response
POST /_ingest/_simulate
{
  "docs": [
    { "_index": "my-index", "_id": "123", "_source": { "foo": "bar" } },
    { "_index": "my-index", "_id": "456", "_source": { "foo": "rab" } }
  ]
}
The API returns the following response:
{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": { "field1": "value1", "field2": "value2", "foo": "bar" },
        "executed_pipelines": [ "my-pipeline", "my-final-pipeline" ]
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": { "field1": "value1", "field2": "value2", "foo": "rab" },
        "executed_pipelines": [ "my-pipeline", "my-final-pipeline" ]
      }
    }
  ]
}
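When troubleshooting, it can help to reduce a response like the one above to the target index and executed pipelines for each document. This Python sketch does that on a copy of the sample response; the function name summarize is hypothetical, and the code assumes every entry has the doc shape shown above.

```python
# Sample response copied from the example above.
response = {
    "docs": [
        {"doc": {"_id": "123", "_index": "my-index", "_version": -3,
                 "_source": {"field1": "value1", "field2": "value2", "foo": "bar"},
                 "executed_pipelines": ["my-pipeline", "my-final-pipeline"]}},
        {"doc": {"_id": "456", "_index": "my-index", "_version": -3,
                 "_source": {"field1": "value1", "field2": "value2", "foo": "rab"},
                 "executed_pipelines": ["my-pipeline", "my-final-pipeline"]}},
    ]
}


def summarize(response):
    """Return (document ID, target index, executed pipelines) per document."""
    rows = []
    for entry in response["docs"]:
        doc = entry["doc"]
        rows.append((doc["_id"], doc["_index"], list(doc["executed_pipelines"])))
    return rows


for doc_id, index, pipelines in summarize(response):
    print(f"{doc_id} -> {index} via {' > '.join(pipelines)}")
```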
Specify a pipeline substitution in the request body
In this example, the index my-index has a default pipeline called my-pipeline and a final pipeline called my-final-pipeline. A substitute definition of my-pipeline is provided in pipeline_substitutions. The substitute my-pipeline is used in place of the my-pipeline that is in the system, and then the my-final-pipeline that is already defined in the system is executed.
response = client.simulate.ingest(
  body: {
    docs: [
      { _index: 'my-index', _id: '123', _source: { foo: 'bar' } },
      { _index: 'my-index', _id: '456', _source: { foo: 'rab' } }
    ],
    pipeline_substitutions: {
      "my-pipeline": {
        processors: [
          { uppercase: { field: 'foo' } }
        ]
      }
    }
  }
)
puts response
POST /_ingest/_simulate
{
  "docs": [
    { "_index": "my-index", "_id": "123", "_source": { "foo": "bar" } },
    { "_index": "my-index", "_id": "456", "_source": { "foo": "rab" } }
  ],
  "pipeline_substitutions": {
    "my-pipeline": {
      "processors": [
        { "uppercase": { "field": "foo" } }
      ]
    }
  }
}
The API returns the following response:
{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": { "field2": "value2", "foo": "BAR" },
        "executed_pipelines": [ "my-pipeline", "my-final-pipeline" ]
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": { "field2": "value2", "foo": "RAB" },
        "executed_pipelines": [ "my-pipeline", "my-final-pipeline" ]
      }
    }
  ]
}
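One way to see what a substitution changed is to compare the _source returned for the same document in the two examples. The Python sketch below diffs the results for document 123, using values copied from the two responses on this page; diff_sources is a hypothetical helper, not part of any client.

```python
def diff_sources(before, after):
    """Map each differing field to its (before, after) values.

    A field that is absent on one side is reported as None on that side.
    """
    changes = {}
    for field in sorted(set(before) | set(after)):
        if before.get(field) != after.get(field):
            changes[field] = (before.get(field), after.get(field))
    return changes


# _source of document 123 with the system pipelines, then with the substitute.
baseline = {"field1": "value1", "field2": "value2", "foo": "bar"}
substituted = {"field2": "value2", "foo": "BAR"}

changes = diff_sources(baseline, substituted)
print(changes)
```

Here the diff shows that field1 is no longer set and that foo is uppercased, while field2 is unchanged because my-final-pipeline still runs from its system definition.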