Create a behavioral analytics collection Technical preview
Path parameters
- name (string, required): The name of the analytics collection to be created or updated.
curl \
--request PUT http://api.example.com/_application/analytics/{name}
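For example, to create a collection named my_analytics_collection (the name matches the sample response below; any valid collection name works):
curl \
--request PUT http://api.example.com/_application/analytics/my_analytics_collection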
Get behavioral analytics collections Technical preview
curl \
--request GET http://api.example.com/_application/analytics
{
"my_analytics_collection": {
"event_data_stream": {
"name": "behavioral_analytics-events-my_analytics_collection"
}
},
"my_analytics_collection2": {
"event_data_stream": {
"name": "behavioral_analytics-events-my_analytics_collection2"
}
}
}
Get index information
Get high-level information about indices in a cluster, including backing indices for data streams.
Use this request to get the following information for each index in a cluster:
- shard count
- document count
- deleted document count
- primary store size
- total store size of all shards, including shard replicas
These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents. To get an accurate count of Elasticsearch documents, use the cat count or count APIs.
CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use an index endpoint.
Query parameters
- bytes (string): The unit used to display byte values. Values are b, kb, mb, gb, tb, or pb.
- expand_wildcards (string | array[string]): The type of index that wildcard patterns can match.
- health (string): The health status used to limit returned indices. By default, the response includes indices of any health status. Values are green, GREEN, yellow, YELLOW, red, or RED.
- include_unloaded_segments (boolean): If true, the response includes information from segments that are not loaded into memory.
- pri (boolean): If true, the response only includes information from primary shards.
- time (string): The unit used to display time values. Values are nanos, micros, ms, s, m, h, or d.
- master_timeout (string): Period to wait for a connection to the master node.
curl \
--request GET http://api.example.com/_cat/indices
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size dataset.size
yellow open my-index-000001 u8FNjxh8Rfy_awN11oDKYQ 1 1 1200 0 88.1kb 88.1kb 88.1kb
green open my-index-000002 nYFWZEO7TUiOjLQXBaYJpA 1 0 0 0 260b 260b 260b
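A sketch of combining the query parameters documented above; the parameter values are illustrative:
curl \
--request GET "http://api.example.com/_cat/indices?bytes=kb&health=yellow&pri=true"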
Get datafeeds Added in 7.7.0
Get configuration and usage information about datafeeds.
This API returns a maximum of 10,000 datafeeds.
If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.
IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get datafeed statistics API.
Path parameters
- datafeed_id (string): A numerical character string that uniquely identifies the datafeed.
Query parameters
- allow_no_match (boolean): Specifies what to do when the request:
  - Contains wildcard expressions and there are no datafeeds that match.
  - Contains the _all string or no identifiers and there are no matches.
  - Contains wildcard expressions and there are only partial matches.
  If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.
- h (string | array[string]): Comma-separated list of column names to display.
- s (string | array[string]): Comma-separated list of column names or column aliases used to sort the response.
- time (string): The unit used to display time values. Values are nanos, micros, ms, s, m, h, or d.
curl \
--request GET http://api.example.com/_cat/ml/datafeeds/{datafeed_id}
id state buckets.count search.count
datafeed-high_sum_total_sales stopped 743 7
datafeed-low_request_rate stopped 1457 3
datafeed-response_code_rates stopped 1460 18
datafeed-url_scanning stopped 1460 18
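As a sketch, the h and s parameters documented above can select and order columns; the column names here come from the sample output:
curl \
--request GET "http://api.example.com/_cat/ml/datafeeds?h=id,state,buckets.count&s=id"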
Get all connector sync jobs Beta
Get information about all stored connector sync jobs listed by their creation date in ascending order.
Query parameters
- from (number): Starting offset (default: 0).
- size (number): The maximum number of results to return.
- status (string): A sync job status to fetch connector sync jobs for. Values are canceling, canceled, completed, error, in_progress, pending, or suspended.
- connector_id (string): A connector ID to fetch connector sync jobs for.
- job_type (string | array[string]): A comma-separated list of job types to fetch the sync jobs for.
curl \
--request GET http://api.example.com/_connector/_sync_job
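For example, to fetch the first ten pending jobs for a single connector (the connector ID is illustrative):
curl \
--request GET "http://api.example.com/_connector/_sync_job?connector_id=my-connector&status=pending&from=0&size=10"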
Create a connector sync job Beta
Create a connector sync job document in the internal index and initialize its counters and timestamps with default values.
Body Required
- job_type (string): Values are full, incremental, or access_control.
- trigger_method (string): Values are on_demand or scheduled.
curl \
--request POST http://api.example.com/_connector/_sync_job \
--header "Content-Type: application/json" \
--data '"{\n \"id\": \"connector-id\",\n \"job_type\": \"full\",\n \"trigger_method\": \"on_demand\"\n}"'
{
"id": "connector-id",
"job_type": "full",
"trigger_method": "on_demand"
}
Activate the connector draft filter Technical preview
Activates the valid draft filtering for a connector.
Path parameters
- connector_id (string, required): The unique identifier of the connector to be updated.
curl \
--request PUT http://api.example.com/_connector/{connector_id}/_filtering/_activate
Update the connector configuration Beta
Update the configuration field in the connector document.
Path parameters
- connector_id (string, required): The unique identifier of the connector to be updated.
Body Required
- configuration (object)
- values (object)
curl \
--request PUT http://api.example.com/_connector/{connector_id}/_configuration \
--header "Content-Type: application/json" \
--data '"{\n \"values\": {\n \"tenant_id\": \"my-tenant-id\",\n \"tenant_name\": \"my-sharepoint-site\",\n \"client_id\": \"foo\",\n \"secret_value\": \"bar\",\n \"site_collections\": \"*\"\n }\n}"'
{
"values": {
"tenant_id": "my-tenant-id",
"tenant_name": "my-sharepoint-site",
"client_id": "foo",
"secret_value": "bar",
"site_collections": "*"
}
}
{
"values": {
"secret_value": "foo-bar"
}
}
{
"result": "updated"
}
Update the connector is_native flag Beta
Path parameters
- connector_id (string, required): The unique identifier of the connector to be updated.
curl \
--request PUT http://api.example.com/_connector/{connector_id}/_native \
--header "Content-Type: application/json" \
--data '{"is_native":true}'
Update the connector pipeline Beta
When you create a new connector, the configuration of an ingest pipeline is populated with default settings.
Path parameters
- connector_id (string, required): The unique identifier of the connector to be updated.
curl \
--request PUT http://api.example.com/_connector/{connector_id}/_pipeline \
--header "Content-Type: application/json" \
--data '"{\n \"pipeline\": {\n \"extract_binary_content\": true,\n \"name\": \"my-connector-pipeline\",\n \"reduce_whitespace\": true,\n \"run_ml_inference\": true\n }\n}"'
{
"pipeline": {
"extract_binary_content": true,
"name": "my-connector-pipeline",
"reduce_whitespace": true,
"run_ml_inference": true
}
}
{
"result": "updated"
}
Delete documents Added in 5.0.0
Deletes documents that match the specified query.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or alias:
- read
- delete or write
You can specify the query criteria in the request URI or the request body using the same syntax as the search API. When you submit a delete by query request, Elasticsearch gets a snapshot of the data stream or index when it begins processing the request and deletes matching documents using internal versioning. If a document changes between the time that the snapshot is taken and the delete operation is processed, it results in a version conflict and the delete operation fails.
NOTE: Documents with a version equal to 0 cannot be deleted using delete by query because internal versioning does not support 0 as a valid version number.
While processing a delete by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents to delete. A bulk delete request is performed for each batch of matching documents. If a search or bulk request is rejected, the requests are retried up to 10 times, with exponential back off. If the maximum retry limit is reached, processing halts and all failed requests are returned in the response. Any delete requests that completed successfully still stick; they are not rolled back.
You can opt to count version conflicts instead of halting and returning by setting conflicts to proceed. Note that if you opt to count version conflicts, the operation could attempt to delete more documents from the source than max_docs until it has successfully deleted max_docs documents, or it has gone through every document in the source query.
Throttling delete requests
To control the rate at which delete by query issues batches of delete operations, you can set requests_per_second to any positive decimal number. This pads each batch with a wait time to throttle the rate. Set requests_per_second to -1 to disable throttling.
Throttling uses a wait time between batches so that the internal scroll requests can be given a timeout that takes the request padding into account. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:
target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds
Since the batch is issued as a single _bulk request, large batch sizes cause Elasticsearch to create many requests and wait before starting the next set.
This is "bursty" instead of "smooth".
Slicing
Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts.
Setting slices to auto lets Elasticsearch choose the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple source data streams or indices, it will choose the number of slices based on the index or backing index with the smallest number of shards.
Adding slices to the delete by query operation creates sub-requests, which means it has some quirks:
- You can see these requests in the tasks APIs. These sub-requests are "child" tasks of the task for the request with slices.
- Fetching the status of the task for the request with slices only contains the status of completed slices.
- These sub-requests are individually addressable for things like cancellation and rethrottling.
- Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.
- Canceling the request with slices will cancel each sub-request.
- Due to the nature of slices, each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
- Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the earlier point about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being deleted.
- Each sub-request gets a slightly different snapshot of the source data stream or index, though these are all taken at approximately the same time.
If you're slicing manually or otherwise tuning automatic slicing, keep in mind that:
- Query performance is most efficient when the number of slices is equal to the number of shards in the index or backing index. If that number is large (for example, 500), choose a lower number, as too many slices hurts performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.
- Delete performance scales linearly across available resources with the number of slices.
Whether query or delete performance dominates the runtime depends on the documents being deleted and cluster resources.
Cancel a delete by query operation
Any delete by query can be canceled using the task cancel API. For example:
POST _tasks/r1A2WoRbTwKZ516z6NEs5A:36619/_cancel
The task ID can be found by using the get tasks API.
Cancellation should happen quickly but might take a few seconds. The get task status API will continue to list the delete by query task until this task checks that it has been cancelled and terminates itself.
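A running delete by query can also be rethrottled by task ID; a minimal sketch, reusing the task ID from the cancellation example:
POST _delete_by_query/r1A2WoRbTwKZ516z6NEs5A:36619/_rethrottle?requests_per_second=-1
Rethrottling that speeds up the query takes effect immediately; rethrottling that slows it down takes effect after completing the current batch, which prevents scroll timeouts.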
Path parameters
- index (string | array[string], required): A comma-separated list of data streams, indices, and aliases to search. It supports wildcards (*). To search all data streams or indices, omit this parameter or use * or _all.
Query parameters
- allow_no_indices (boolean): If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.
- analyzer (string): Analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.
- analyze_wildcard (boolean): If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.
- conflicts (string): What to do if delete by query hits version conflicts: abort or proceed. Values are abort or proceed.
- default_operator (string): The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified. Values are and, AND, or, or OR.
- df (string): The field to use as default where no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.
- expand_wildcards (string | array[string]): The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.
- from (number): Starting offset (default: 0).
- lenient (boolean): If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.
- max_docs (number): The maximum number of documents to process. Defaults to all documents. When set to a value less than or equal to scroll_size, a scroll will not be used to retrieve the results for the operation.
- preference (string): The node or shard the operation should be performed on. It is random by default.
- refresh (boolean): If true, Elasticsearch refreshes all shards involved in the delete by query after the request completes. This is different than the delete API's refresh parameter, which causes just the shard that received the delete request to be refreshed. Unlike the delete API, it does not support wait_for.
- request_cache (boolean): If true, the request cache is used for this request. Defaults to the index-level setting.
- requests_per_second (number): The throttle for this request in sub-requests per second.
- routing (string): A custom value used to route operations to a specific shard.
- q (string): A query in the Lucene query string syntax.
- scroll (string): The period to retain the search context for scrolling.
- scroll_size (number): The size of the scroll request that powers the operation.
- search_timeout (string): The explicit timeout for each search request. It defaults to no timeout.
- search_type (string): The type of the search operation. Values are query_then_fetch or dfs_query_then_fetch.
- slices (number | string): The number of slices this task should be divided into.
- sort (array[string]): A comma-separated list of <field>:<direction> pairs.
- stats (array[string]): The specific tag of the request for logging and statistical purposes.
- terminate_after (number): The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting. Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.
- timeout (string): The period each deletion request waits for active shards.
- version (boolean): If true, returns the document version as part of a hit.
- wait_for_active_shards (number | string): The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The timeout value controls how long each write request waits for unavailable shards to become available.
- wait_for_completion (boolean): If true, the request blocks until the operation is complete. If false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at .tasks/task/${taskId}. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.
Body Required
curl \
--request POST http://api.example.com/{index}/_delete_by_query \
--header "Content-Type: application/json" \
--data '"{\n \"query\": {\n \"match_all\": {}\n }\n}"'
{
"query": {
"match_all": {}
}
}
{
"query": {
"term": {
"user.id": "kimchy"
}
},
"max_docs": 1
}
{
"slice": {
"id": 0,
"max": 2
},
"query": {
"range": {
"http.response.bytes": {
"lt": 2000000
}
}
}
}
{
"query": {
"range": {
"http.response.bytes": {
"lt": 2000000
}
}
}
}
{
"took" : 147,
"timed_out": false,
"total": 119,
"deleted": 119,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1.0,
"throttled_until_millis": 0,
"failures" : [ ]
}
Create or update a document in an index
Add a JSON document to the specified data stream or index and make it searchable. If the target is an index and the document already exists, the request updates the document and increments its version.
NOTE: You cannot use this API to send update requests for existing documents in a data stream.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:
- To add or overwrite a document using the PUT /<target>/_doc/<_id> request format, you must have the create, index, or write index privilege.
- To add a document using the POST /<target>/_doc/ request format, you must have the create_doc, create, index, or write index privilege.
- To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.
Automatic data stream creation requires a matching index template with data stream enabled.
NOTE: Replica shards might not all be started when an indexing operation returns successfully.
By default, only the primary is required. Set wait_for_active_shards to change this default behavior.
Automatically create data streams and indices
If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.
If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.
NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to the index pattern documentation.
If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.
Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns, or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow, or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behavior is to disallow.
NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.
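As a sketch, the setting can be changed dynamically with the cluster settings API; the pattern list here is illustrative:
PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "my-index-*,-forbidden-*"
  }
}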
Optimistic concurrency control
Index operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the if_seq_no and if_primary_term parameters. If a mismatch is detected, the operation will result in a VersionConflictException and a status code of 409.
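A minimal sketch of a conditional index operation; the sequence number and primary term are illustrative and would normally come from a previous read or write response:
PUT my-index-000001/_doc/1?if_seq_no=10&if_primary_term=2
{
  "user": {
    "id": "elkbee"
  }
}
If another write has bumped the document's sequence number in the meantime, this request fails with a 409.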
Routing
By default, shard placement (routing) is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.
When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.
NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.
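For example, to supply an explicit routing value on an index operation (the value user1 is illustrative):
POST my-index-000001/_doc?routing=user1
{
  "message": "routed using the user1 value instead of the document ID"
}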
Distributed
The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.
Active shards
To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation.
If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs.
By default, write operations only wait for the primary shards to be active before proceeding (that is to say, wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.
Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.
For example, suppose you have a cluster of three nodes, A, B, and C, and you create an index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data.
If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed, as you do not have all 4 copies of each shard active in the index. The operation will time out unless a new node is brought up in the cluster to host the fourth copy of the shard.
It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts.
After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary.
The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.
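A sketch of raising the check for a single operation, continuing the three-node example above:
PUT my-index-000001/_doc/1?wait_for_active_shards=3&timeout=1m
{
  "user": {
    "id": "elkbee"
  }
}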
No operation (noop) updates
When updating a document by using this API, a new version of the document is always created even if the document hasn't changed. If this isn't acceptable, use the _update API with detect_noop set to true. The detect_noop option isn't available on this API because it doesn't fetch the old source and isn't able to compare it against the new source.
There isn't a definitive rule for when noop updates aren't acceptable. It's a combination of lots of factors like how frequently your data source sends updates that are actually noops and how many queries per second Elasticsearch runs on the shard receiving the updates.
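A minimal sketch of the alternative mentioned above, using the _update API; the document ID and field are illustrative:
POST my-index-000001/_update/1
{
  "doc": {
    "name": "new_name"
  },
  "detect_noop": true
}
If the stored document already contains "name": "new_name", the response reports "result": "noop" and no new version is created.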
Versioning
Each indexed document is given a version number.
By default, internal versioning is used that starts at 1 and increments with each update, deletes included.
Optionally, the version number can be set to an external value (for example, if maintained in a database).
To enable this functionality, version_type should be set to external. The value provided must be a numeric, long value greater than or equal to 0, and less than around 9.2e+18.
NOTE: Versioning is completely real time, and is not affected by the near real time aspects of search operations. If no version is provided, the operation runs without any version checks.
When using the external version type, the system checks to see if the version number passed to the index request is greater than the version of the currently stored document. If true, the document will be indexed and the new version number used. If the value provided is less than or equal to the stored document's version number, a version conflict will occur and the index operation will fail. For example:
PUT my-index-000001/_doc/1?version=2&version_type=external
{
"user": {
"id": "elkbee"
}
}
In this example, the operation will succeed since the supplied version of 2 is higher than the current document version of 1.
If the document was already updated and its version was set to 2 or higher, the indexing command will fail and result in a conflict (409 HTTP status code).
A nice side effect is that there is no need to maintain strict ordering of async indexing operations run as a result of changes to a source database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from a database is simplified if external versioning is used, as only the latest version will be used if the index operations arrive out of order.
Path parameters
- index (string, required): The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn't match a data stream template, this request creates the index. You can check for existing targets with the resolve index API.
Query parameters
- if_primary_term (number): Only perform the operation if the document has this primary term.
- if_seq_no (number): Only perform the operation if the document has this sequence number.
- include_source_on_error (boolean): If true, the document source is included in the error message in case of parsing errors.
- op_type (string): Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this parameter defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required. Values are index or create.
- pipeline (string): The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to _none disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter.
- refresh (string): If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes. Values are true, false, or wait_for.
- routing (string): A custom value that is used to route operations to a specific shard.
- timeout (string): The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards. This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.
- version (number): An explicit version number for concurrency control. It must be a non-negative long number.
- version_type (string): The version type. Values are internal, external, external_gte, or force.
- wait_for_active_shards (number | string): The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.
- require_alias (boolean): If true, the destination must be an index alias.
curl \
--request POST http://api.example.com/{index}/_doc \
--header "Content-Type: application/json" \
--data '"{\n \"@timestamp\": \"2099-11-15T13:12:00\",\n \"message\": \"GET /search HTTP/1.1 200 1070000\",\n \"user\": {\n \"id\": \"kimchy\"\n }\n}"'
{
"@timestamp": "2099-11-15T13:12:00",
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "kimchy"
}
}
{
"@timestamp": "2099-11-15T13:12:00",
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "kimchy"
}
}
{
"_shards": {
"total": 2,
"failed": 0,
"successful": 2
},
"_index": "my-index-000001",
"_id": "W0tpsmIBdwcYyG50zbta",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"result": "created"
}
{
"_shards": {
"total": 2,
"failed": 0,
"successful": 2
},
"_index": "my-index-000001",
"_id": "1",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"result": "created"
}
Get multiple documents Added in 1.3.0
Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail. The _mget endpoint accepts both GET and POST requests.
Filter source fields
By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.
Get stored fields
Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.
Query parameters
- preference (string): Specifies the node or shard the operation should be performed on. Random by default.
- realtime (boolean): If true, the request is real-time as opposed to near-real-time.
- refresh (boolean): If true, the request refreshes relevant shards before retrieving documents.
- routing (string): Custom value used to route operations to a specific shard.
- _source (boolean | string | array[string]): True or false to return the _source field or not, or a list of fields to return.
- _source_excludes (string | array[string]): A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in the _source_includes query parameter.
- _source_includes (string | array[string]): A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.
- stored_fields (string | array[string]): If true, retrieves the document fields stored in the index rather than the document _source.
Body Required
curl \
--request GET http://api.example.com/_mget \
--header "Content-Type: application/json" \
--data '"{\n \"docs\": [\n {\n \"_id\": \"1\"\n },\n {\n \"_id\": \"2\"\n }\n ]\n}"'
{
"docs": [
{
"_id": "1"
},
{
"_id": "2"
}
]
}
{
"docs": [
{
"_index": "test",
"_id": "1",
"_source": false
},
{
"_index": "test",
"_id": "2",
"_source": [ "field3", "field4" ]
},
{
"_index": "test",
"_id": "3",
"_source": {
"include": [ "user" ],
"exclude": [ "user.location" ]
}
}
]
}
{
"docs": [
{
"_index": "test",
"_id": "1",
"stored_fields": [ "field1", "field2" ]
},
{
"_index": "test",
"_id": "2",
"stored_fields": [ "field3", "field4" ]
}
]
}
{
"docs": [
{
"_index": "test",
"_id": "1",
"routing": "key2"
},
{
"_index": "test",
"_id": "2"
}
]
}
Get multiple term vectors
Get multiple term vectors with a single request.
You can specify existing documents by index and ID or provide artificial documents in the body of the request.
You can specify the index in the request body or request URI.
The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.
Artificial documents
You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.
Query parameters
- ids (array[string]): A comma-separated list of document IDs. You must define ids as a parameter or set "ids" or "docs" in the request body.
- fields (string | array[string]): A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.
- field_statistics (boolean): If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.
- offsets (boolean): If true, the response includes term offsets.
- payloads (boolean): If true, the response includes term payloads.
- positions (boolean): If true, the response includes term positions.
- preference (string): The node or shard the operation should be performed on. It is random by default.
- realtime (boolean): If true, the request is real-time as opposed to near-real-time.
- routing (string): A custom value used to route operations to a specific shard.
- term_statistics (boolean): If true, the response includes term frequency and document frequency.
- version (number): If true, returns the document version as part of a hit.
- version_type (string): The version type. Values are internal, external, external_gte, or force.
curl \
--request GET http://api.example.com/_mtermvectors \
--header "Content-Type: application/json" \
--data '"{\n \"docs\": [\n {\n \"_id\": \"2\",\n \"fields\": [\n \"message\"\n ],\n \"term_statistics\": true\n },\n {\n \"_id\": \"1\"\n }\n ]\n}"'
{
"docs": [
{
"_id": "2",
"fields": [
"message"
],
"term_statistics": true
},
{
"_id": "1"
}
]
}
{
"ids": [ "1", "2" ],
"parameters": {
"fields": [
"message"
],
"term_statistics": true
}
}
{
"docs": [
{
"_index": "my-index-000001",
"doc" : {
"message" : "test test test"
}
},
{
"_index": "my-index-000001",
"doc" : {
"message" : "Another test ..."
}
}
]
}
Get multiple term vectors
Get multiple term vectors with a single request.
You can specify existing documents by index and ID or provide artificial documents in the body of the request.
You can specify the index in the request body or request URI.
The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.
Artificial documents
You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.
Path parameters
- index (string, required): The name of the index that contains the documents.
Query parameters
- ids (array[string]): A comma-separated list of document IDs. You must define ids as a parameter or set "ids" or "docs" in the request body.
- fields (string | array[string]): A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.
- field_statistics (boolean): If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.
- offsets (boolean): If true, the response includes term offsets.
- payloads (boolean): If true, the response includes term payloads.
- positions (boolean): If true, the response includes term positions.
- preference (string): The node or shard the operation should be performed on. It is random by default.
- realtime (boolean): If true, the request is real-time as opposed to near-real-time.
- routing (string): A custom value used to route operations to a specific shard.
- term_statistics (boolean): If true, the response includes term frequency and document frequency.
- version (number): If true, returns the document version as part of a hit.
- version_type (string): The version type. Values are internal, external, external_gte, or force.
curl \
--request GET http://api.example.com/{index}/_mtermvectors \
--header "Content-Type: application/json" \
--data '"{\n \"docs\": [\n {\n \"_id\": \"2\",\n \"fields\": [\n \"message\"\n ],\n \"term_statistics\": true\n },\n {\n \"_id\": \"1\"\n }\n ]\n}"'
{
"docs": [
{
"_id": "2",
"fields": [
"message"
],
"term_statistics": true
},
{
"_id": "1"
}
]
}
{
"ids": [ "1", "2" ],
"parameters": {
"fields": [
"message"
],
"term_statistics": true
}
}
{
"docs": [
{
"_index": "my-index-000001",
"doc" : {
"message" : "test test test"
}
},
{
"_index": "my-index-000001",
"doc" : {
"message" : "Another test ..."
}
}
]
}
Reindex documents Added in 2.3.0
Copy documents from a source to a destination. You can copy all documents to the destination index or reindex a subset of the documents. The source can be any existing index, alias, or data stream. The destination must differ from the source. For example, you cannot reindex a data stream into itself.
IMPORTANT: Reindex requires _source to be enabled for all documents in the source.
The destination should be configured as wanted before calling the reindex API. Reindex does not copy the settings from the source or its associated template. Mappings, shard counts, and replicas, for example, must be configured ahead of time.
If the Elasticsearch security features are enabled, you must have the following security privileges:
- The read index privilege for the source data stream, index, or alias.
- The write index privilege for the destination data stream, index, or index alias.
- To automatically create a data stream or index with a reindex API request, you must have the auto_configure, create_index, or manage index privilege for the destination data stream, index, or alias.
- If reindexing from a remote cluster, the source.remote.user must have the monitor cluster privilege and the read index privilege for the source data stream, index, or alias.
If reindexing from a remote cluster, you must explicitly allow the remote host in the reindex.remote.whitelist setting.
Automatic data stream creation requires a matching index template with data stream enabled.
The dest element can be configured like the index API to control optimistic concurrency control. Omitting version_type or setting it to internal causes Elasticsearch to blindly dump documents into the destination, overwriting any that happen to have the same ID.
Setting version_type to external causes Elasticsearch to preserve the version from the source, create any documents that are missing, and update any documents that have an older version in the destination than they do in the source.
Setting op_type to create causes the reindex API to create only missing documents in the destination. All existing documents will cause a version conflict.
IMPORTANT: Because data streams are append-only, any reindex request to a destination data stream must have an op_type of create. A reindex can only add new documents to a destination data stream. It cannot update existing documents in a destination data stream.
By default, version conflicts abort the reindex process. To continue reindexing if there are conflicts, set the conflicts request body property to proceed. In this case, the response includes a count of the version conflicts that were encountered. Note that the handling of other error types is unaffected by the conflicts property. Additionally, if you opt to count version conflicts, the operation could attempt to reindex more documents from the source than max_docs until it has successfully indexed max_docs documents into the target or it has gone through every document in the source query.
NOTE: The reindex API makes no effort to handle ID collisions. The last document written will "win" but the order isn't usually predictable so it is not a good idea to rely on this behavior. Instead, make sure that IDs are unique by using a script.
Running reindex asynchronously
If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at _tasks/<task_id>.
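A sketch of checking on such a task; the task ID is a placeholder:
GET _tasks/r1A2WoRbTwKZ516z6NEs5A:36619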
Reindex from multiple sources
If you have many sources to reindex it is generally better to reindex them one at a time rather than using a glob pattern to pick up multiple sources. That way you can resume the process if there are any errors by removing the partially completed source and starting over. It also makes parallelizing the process fairly simple: split the list of sources to reindex and run each list in parallel.
For example, you can use a bash script like this:
for index in i1 i2 i3 i4 i5; do
curl -HContent-Type:application/json -XPOST localhost:9200/_reindex?pretty -d'{
"source": {
"index": "'$index'"
},
"dest": {
"index": "'$index'-reindexed"
}
}'
done
Throttling
Set requests_per_second to any positive decimal number (1.4, 6, 1000, for example) to throttle the rate at which reindex issues batches of index operations. Requests are throttled by padding each batch with a wait time. To turn off throttling, set requests_per_second to -1.
The throttling is done by waiting between batches so that the scroll that reindex uses internally can be given a timeout that takes into account the padding. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:
target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds
Since the batch is issued as a single bulk request, large batch sizes cause Elasticsearch to create many requests and then wait for a while before starting the next set. This is "bursty" instead of "smooth".
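As with delete by query, a running reindex can be rethrottled by task ID; a sketch with a placeholder task ID:
POST _reindex/r1A2WoRbTwKZ516z6NEs5A:36619/_rethrottle?requests_per_second=-1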
Slicing
Reindex supports sliced scroll to parallelize the reindexing process. This parallelization can improve efficiency and provide a convenient way to break the request down into smaller parts.
NOTE: Reindexing from remote clusters does not support manual or automatic slicing.
You can slice a reindex request manually by providing a slice ID and total number of slices to each request.
You can also let reindex automatically parallelize by using sliced scroll to slice on _id. The slices parameter specifies the number of slices to use. Adding slices to the reindex request just automates the manual process, creating sub-requests, which means it has some quirks:
- You can see these requests in the tasks API. These sub-requests are "child" tasks of the task for the request with slices.
- Fetching the status of the task for the request with slices only contains the status of completed slices.
- These sub-requests are individually addressable for things like cancellation and rethrottling.
- Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.
- Canceling the request with slices will cancel each sub-request.
- Due to the nature of slices, each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
- Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the previous point about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being reindexed.
- Each sub-request gets a slightly different snapshot of the source, though these are all taken at approximately the same time.
If slicing automatically, setting slices to auto will choose a reasonable number for most indices.
If slicing manually or otherwise tuning automatic slicing, use the following guidelines.
Query performance is most efficient when the number of slices is equal to the number of shards in the index. If that number is large (for example, 500), choose a lower number as too many slices will hurt performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.
Indexing performance scales linearly across available resources with the number of slices.
Whether query or indexing performance dominates the runtime depends on the documents being reindexed and cluster resources.
Modify documents during reindexing
Like _update_by_query, reindex operations support a script that modifies the document. Unlike _update_by_query, the script is allowed to modify the document's metadata. Just as in _update_by_query, you can set ctx.op to change the operation that is run on the destination.
For example, set ctx.op to noop if your script decides that the document doesn't have to be indexed in the destination. This "no operation" will be reported in the noop counter in the response body. Set ctx.op to delete if your script decides that the document must be deleted from the destination. The deletion will be reported in the deleted counter in the response body. Setting ctx.op to anything else will return an error, as will setting any other field in ctx.
Think of the possibilities! Just be careful; you are able to change:
- _id
- _index
- _version
- _routing
Setting _version to null or clearing it from the ctx map is just like not sending the version in an indexing request. It will cause the document to be overwritten in the destination regardless of the version on the target or the version type you use in the reindex API.
Reindex from remote
Reindex supports reindexing from a remote Elasticsearch cluster.
The host parameter must contain a scheme, host, port, and optional path. The username and password parameters are optional and when they are present the reindex operation will connect to the remote Elasticsearch node using basic authentication. Be sure to use HTTPS when using basic authentication or the password will be sent in plain text. There are a range of settings available to configure the behavior of the HTTPS connection.
When using Elastic Cloud, it is also possible to authenticate against the remote cluster through the use of a valid API key.
Remote hosts must be explicitly allowed with the reindex.remote.whitelist setting. It can be set to a comma-delimited list of allowed remote host and port combinations. Scheme is ignored; only the host and port are used. For example:
reindex.remote.whitelist: [otherhost:9200, another:9200, 127.0.10.*:9200, localhost:*]
The list of allowed hosts must be configured on any nodes that will coordinate the reindex. This feature should work with remote clusters of any version of Elasticsearch. This should enable you to upgrade from any version of Elasticsearch to the current version by reindexing from a cluster of the old version.
WARNING: Elasticsearch does not support forward compatibility across major versions. For example, you cannot reindex from a 7.x cluster into a 6.x cluster.
To enable queries sent to older versions of Elasticsearch, the query parameter is sent directly to the remote host without validation or modification.
NOTE: Reindexing from remote clusters does not support manual or automatic slicing.
Reindexing from a remote server uses an on-heap buffer that defaults to a maximum size of 100mb. If the remote index includes very large documents, you'll need to use a smaller batch size. It is also possible to set the socket read timeout on the remote connection with the socket_timeout field and the connection timeout with the connect_timeout field. Both default to 30 seconds.
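A minimal sketch of a remote reindex request, assuming the remote host is already listed in reindex.remote.whitelist; the host, credentials, and index names are illustrative:
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "username": "user",
      "password": "pass",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}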
Configuring SSL parameters
Reindex from remote supports configurable SSL settings.
These must be specified in the elasticsearch.yml file, with the exception of the secure settings, which you add in the Elasticsearch keystore.
It is not possible to configure SSL in the body of the reindex request.
Query parameters
- refresh (boolean): If true, the request refreshes affected shards to make this operation visible to search.
- requests_per_second (number): The throttle for this request in sub-requests per second. By default, there is no throttle.
- scroll (string): The period of time that a consistent view of the index should be maintained for scrolled search.
- slices (number | string): The number of slices this task should be divided into. It defaults to one slice, which means the task isn't sliced into subtasks. Reindex supports sliced scroll to parallelize the reindexing process. This parallelization can improve efficiency and provide a convenient way to break the request down into smaller parts. NOTE: Reindexing from remote clusters does not support manual or automatic slicing. If set to auto, Elasticsearch chooses the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple sources, it will choose the number of slices based on the index or backing index with the smallest number of shards.
- timeout (string): The period each indexing operation waits for automatic index creation, dynamic mapping updates, and active shards. By default, Elasticsearch waits for at least one minute before failing. The actual wait time could be longer, particularly when multiple waits occur.
- wait_for_active_shards (number | string): The number of shard copies that must be active before proceeding with the operation. Set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value is one, which means it waits for each primary shard to be active.
- wait_for_completion (boolean): If true, the request blocks until the operation is complete.
- require_alias (boolean): If true, the destination must be an index alias.
Body Required
- conflicts (string): Values are abort or proceed.
- dest (object): Additional properties are allowed.
- max_docs (number): The maximum number of documents to reindex. By default, all documents are reindexed. If it is a value less than or equal to scroll_size, a scroll will not be used to retrieve the results for the operation. If conflicts is set to proceed, the reindex operation could attempt to reindex more documents from the source than max_docs until it has successfully indexed max_docs documents into the target or it has gone through every document in the source query.
- script (object): Additional properties are allowed.
- size (number)
- source (object): Additional properties are allowed.
curl \
--request POST http://api.example.com/_reindex \
--header "Content-Type: application/json" \
--data '"{\n \"source\": {\n \"index\": [\"my-index-000001\", \"my-index-000002\"]\n },\n \"dest\": {\n \"index\": \"my-new-index-000002\"\n }\n}"'
{
"source": {
"index": ["my-index-000001", "my-index-000002"]
},
"dest": {
"index": "my-new-index-000002"
}
}
{
"source": {
"index": "my-index-000001",
"slice": {
"id": 0,
"max": 2
}
},
"dest": {
"index": "my-new-index-000001"
}
}
{
"source": {
"index": "my-index-000001"
},
"dest": {
"index": "my-new-index-000001"
}
}
{
"source": {
"index": "source",
"query": {
"match": {
"company": "cat"
}
}
},
"dest": {
"index": "dest",
"routing": "=cat"
}
}
{
"source": {
"index": "source"
},
"dest": {
"index": "dest",
"pipeline": "some_ingest_pipeline"
}
}
{
"source": {
"index": "my-index-000001",
"query": {
"term": {
"user.id": "kimchy"
}
}
},
"dest": {
"index": "my-new-index-000001"
}
}
{
"max_docs": 1,
"source": {
"index": "my-index-000001"
},
"dest": {
"index": "my-new-index-000001"
}
}
{
"source": {
"index": "my-index-000001",
"_source": ["user.id", "_doc"]
},
"dest": {
"index": "my-new-index-000001"
}
}
{
"source": {
"index": "my-index-000001"
},
"dest": {
"index": "my-new-index-000001"
},
"script": {
"source": "ctx._source.tag = ctx._source.remove(\"flag\")"
}
}
{
"source": {
"index": "metricbeat-*"
},
"dest": {
"index": "metricbeat"
},
"script": {
"lang": "painless",
"source": "ctx._index = 'metricbeat-' + (ctx._index.substring('metricbeat-'.length(), ctx._index.length())) + '-1'"
}
}
{
"max_docs": 10,
"source": {
"index": "my-index-000001",
"query": {
"function_score" : {
"random_score" : {},
"min_score" : 0.9
}
}
},
"dest": {
"index": "my-new-index-000001"
}
}
{
"source": {
"index": "my-index-000001"
},
"dest": {
"index": "my-new-index-000001",
"version_type": "external"
},
"script": {
"source": "if (ctx._source.foo == 'bar') {ctx._version++; ctx._source.remove('foo')}",
"lang": "painless"
}
}
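One of the examples above slices manually with `source.slice`; a minimal sketch of automatic slicing instead uses the `slices` query parameter described earlier (the index names are illustrative):
curl \
--request POST "http://api.example.com/_reindex?slices=5&refresh" \
--header "Content-Type: application/json" \
--data '{"source":{"index":"my-index-000001"},"dest":{"index":"my-new-index-000001"}}'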
Get term vector information
Get information and statistics about terms in the fields of a particular document.
You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request.
You can specify the fields you are interested in through the `fields` parameter or by adding the fields to the request body.
For example:
GET /my-index-000001/_termvectors/1?fields=message
Fields can be specified using wildcards, similar to the multi match query.
Term vectors are real-time by default, not near real-time.
This can be changed by setting the `realtime` parameter to `false`.
You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.
Term information
- term frequency in the field (always returned)
- term positions (`positions: true`)
- start and end offsets (`offsets: true`)
- term payloads (`payloads: true`), as base64 encoded bytes
If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors can be computed for documents that don't exist in the index but are instead provided by the user.
Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.
Behaviour
The term and field statistics are not accurate.
Deleted documents are not taken into account.
The information is only retrieved for the shard the requested document resides in.
The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context.
By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected.
Use `routing` only to hit a particular shard.
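As a minimal sketch of this artificial-document case (the index name, routing value, and document are illustrative):
curl \
--request POST "http://api.example.com/my-index-000001/_termvectors?routing=user1" \
--header "Content-Type: application/json" \
--data '{"doc":{"fullname":"John Doe","text":"test test test"}}'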
Path parameters
-
The name of the index that contains the document.
-
A unique identifier for the document.
Query parameters
-
fields string | array[string]
A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the `completion_fields` or `fielddata_fields` parameters.
-
field_statistics boolean
If `true`, the response includes:
- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
-
offsets boolean
If `true`, the response includes term offsets.
-
payloads boolean
If `true`, the response includes term payloads.
-
positions boolean
If `true`, the response includes term positions.
-
preference string
The node or shard the operation should be performed on. It is random by default.
-
realtime boolean
If true, the request is real-time as opposed to near-real-time.
-
routing string
A custom value that is used to route operations to a specific shard.
-
term_statistics boolean
If `true`, the response includes:
- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
-
version number
If `true`, returns the document version as part of a hit.
-
version_type string
The version type.
Values are `internal`, `external`, `external_gte`, or `force`.
Body
-
doc object
An artificial document (a document not present in the index) for which you want to retrieve term vectors.
Additional properties are allowed.
-
filter object
Additional properties are allowed.
-
per_field_analyzer object
Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.
curl \
--request POST http://api.example.com/{index}/_termvectors/{id} \
--header "Content-Type: application/json" \
--data '"{\n \"fields\" : [\"text\"],\n \"offsets\" : true,\n \"payloads\" : true,\n \"positions\" : true,\n \"term_statistics\" : true,\n \"field_statistics\" : true\n}"'
{
"fields" : ["text"],
"offsets" : true,
"payloads" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
{
"doc" : {
"fullname" : "John Doe",
"text" : "test test test"
},
"fields": ["fullname"],
"per_field_analyzer" : {
"fullname": "keyword"
}
}
{
"doc": {
"plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
},
"term_statistics": true,
"field_statistics": true,
"positions": false,
"offsets": false,
"filter": {
"max_num_terms": 3,
"min_term_freq": 1,
"min_doc_freq": 1
}
}
{
"fields" : ["text", "some_field_without_term_vectors"],
"offsets" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
{
"doc" : {
"fullname" : "John Doe",
"text" : "test test test"
}
}
{
"_index": "my-index-000001",
"_id": "1",
"_version": 1,
"found": true,
"took": 6,
"term_vectors": {
"text": {
"field_statistics": {
"sum_doc_freq": 4,
"doc_count": 2,
"sum_ttf": 6
},
"terms": {
"test": {
"doc_freq": 2,
"ttf": 4,
"term_freq": 3,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 4,
"payload": "d29yZA=="
},
{
"position": 1,
"start_offset": 5,
"end_offset": 9,
"payload": "d29yZA=="
},
{
"position": 2,
"start_offset": 10,
"end_offset": 14,
"payload": "d29yZA=="
}
]
}
}
}
}
}
{
"_index": "my-index-000001",
"_version": 0,
"found": true,
"took": 6,
"term_vectors": {
"fullname": {
"field_statistics": {
"sum_doc_freq": 2,
"doc_count": 4,
"sum_ttf": 4
},
"terms": {
"John Doe": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 8
}
]
}
}
}
}
}
{
"_index": "imdb",
"_version": 0,
"found": true,
"term_vectors": {
"plot": {
"field_statistics": {
"sum_doc_freq": 3384269,
"doc_count": 176214,
"sum_ttf": 3753460
},
"terms": {
"armored": {
"doc_freq": 27,
"ttf": 27,
"term_freq": 1,
"score": 9.74725
},
"industrialist": {
"doc_freq": 88,
"ttf": 88,
"term_freq": 1,
"score": 8.590818
},
"stark": {
"doc_freq": 44,
"ttf": 47,
"term_freq": 1,
"score": 9.272792
}
}
}
}
}
Get an enrich policy
Query parameters
-
master_timeout string
Period to wait for a connection to the master node.
curl \
--request GET http://api.example.com/_enrich/policy
Delete an async EQL search Added in 7.9.0
Delete an async EQL search or a stored synchronous EQL search. The API also deletes results for the search.
Path parameters
-
Identifier for the search to delete. A search ID is provided in the EQL search API's response for an async search. A search ID is also provided if the request’s `keep_on_completion` parameter is `true`.
curl \
--request DELETE http://api.example.com/_eql/search/{id}
Graph explore
The graph explore API enables you to extract and summarize information about the documents and terms in an Elasticsearch data stream or index.
Explore graph analytics
Extract and summarize information about the documents and terms in an Elasticsearch data stream or index.
The easiest way to understand the behavior of this API is to use the Graph UI to explore connections.
An initial request to the `_explore` API contains a seed query that identifies the documents of interest and specifies the fields that define the vertices and connections you want to include in the graph.
Subsequent requests enable you to spider out from one or more vertices of interest.
You can exclude vertices that have already been returned.
Path parameters
-
Name of the index.
Body
-
connections object
Additional properties are allowed.
-
controls object
Additional properties are allowed.
-
query object
An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.
Additional properties are allowed.
-
vertices array[object]
Specifies one or more fields that contain the terms you want to include in the graph as vertices.
curl \
--request GET http://api.example.com/{index}/_graph/explore \
--header "Content-Type: application/json" \
--data '"{\n \"query\": {\n \"match\": {\n \"query.raw\": \"midi\"\n }\n },\n \"vertices\": [\n {\n \"field\": \"product\"\n }\n ],\n \"connections\": {\n \"vertices\": [\n {\n \"field\": \"query.raw\"\n }\n ]\n }\n}"'
{
"query": {
"match": {
"query.raw": "midi"
}
},
"vertices": [
{
"field": "product"
}
],
"connections": {
"vertices": [
{
"field": "query.raw"
}
]
}
}
Create or update a component template Added in 7.8.0
Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
An index template can be composed of multiple component templates.
To use a component template, specify it in an index template’s `composed_of` list.
Component templates are only applied to new data streams and indices as part of a matching index template.
Settings and mappings specified directly in the index template or the create index request override any settings or mappings specified in a component template.
Component templates are only used during index creation. For data streams, this includes data stream creation and the creation of a stream’s backing indices. Changes to component templates do not affect existing indices, including a stream’s backing indices.
You can use C-style `/* */` block comments in component templates.
You can include comments anywhere in the request body except before the opening curly bracket.
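For example, a minimal sketch of a request body carrying a block comment (the template content is illustrative):
{
  /* This comment sits after the opening curly bracket, as required. */
  "template": {
    "settings": {
      "number_of_shards": 1
    }
  }
}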
Applying component templates
You cannot directly apply a component template to a data stream or index.
To be applied, a component template must be included in an index template's `composed_of` list.
Path parameters
-
Name of the component template to create. Elasticsearch includes the following built-in component templates: `logs-mappings`, `logs-settings`, `metrics-mappings`, `metrics-settings`, `synthetics-mappings`, and `synthetics-settings`. Elastic Agent uses these templates to configure backing indices for its data streams. If you use Elastic Agent and want to overwrite one of these templates, set the `version` for your replacement template higher than the current version. If you don’t use Elastic Agent and want to disable all built-in component and index templates, set `stack.templates.enabled` to `false` using the cluster update settings API.
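As a minimal sketch, disabling the built-in templates through the cluster update settings API might look like this:
curl \
--request PUT http://api.example.com/_cluster/settings \
--header "Content-Type: application/json" \
--data '{"persistent":{"stack.templates.enabled":false}}'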
Query parameters
-
create boolean
If `true`, this request cannot replace or update existing component templates.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
Body Required
-
Additional properties are allowed.
-
version number
-
_meta object
-
deprecated boolean
Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.
curl \
--request PUT http://api.example.com/_component_template/{name} \
--header "Content-Type: application/json" \
--data '{"mappings":{"_source":{"enabled":false},"properties":{"host_name":{"type":"keyword"},"created_at":{"type":"date","format":"EEE MMM dd HH:mm:ss Z yyyy"}}},"settings":{"number_of_shards":1},"template":null}'
{
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "mappings": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "host_name": {
          "type": "keyword"
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z yyyy"
        }
      }
    }
  }
}
{
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "aliases": {
      "alias1": {},
      "alias2": {
        "filter": {
          "term": {
            "user.id": "kimchy"
          }
        },
        "routing": "shard-1"
      },
      "{index}-alias": {}
    }
  }
}
Delete component templates Added in 7.8.0
Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
Path parameters
-
Comma-separated list or wildcard expression of component template names used to limit the request.
Query parameters
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout string
Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request DELETE http://api.example.com/_component_template/{name}
Get component templates Added in 7.8.0
Get information about component templates.
Query parameters
-
flat_settings boolean
If `true`, returns settings in flat format.
-
include_defaults boolean
Return all default configurations for the component template (default: false)
-
local boolean
If `true`, the request retrieves information from the local node only. If `false`, information is retrieved from the master node.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request GET http://api.example.com/_component_template
Get tokens from text analysis
The analyze API performs analysis on a text string and returns the resulting tokens.
Generating an excessive amount of tokens may cause a node to run out of memory.
The `index.analyze.max_token_count` setting enables you to limit the number of tokens that can be produced.
If more tokens than this limit are generated, an error occurs.
The `_analyze` endpoint without a specified index will always use `10000` as its limit.
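A minimal sketch of raising that limit for a single index (the index name and value are illustrative; `index.analyze.max_token_count` is a dynamic index setting):
curl \
--request PUT http://api.example.com/my-index-000001/_settings \
--header "Content-Type: application/json" \
--data '{"index.analyze.max_token_count":20000}'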
Path parameters
-
Index used to derive the analyzer. If specified, the `analyzer` or `field` parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.
Body
-
analyzer string
The name of the analyzer that should be applied to the provided `text`. This could be a built-in analyzer, or an analyzer that’s been configured in the index.
-
attributes array[string]
Array of token attributes used to filter the output of the `explain` parameter.
-
char_filter array
Array of character filters used to preprocess characters before the tokenizer.
-
explain boolean
If `true`, the response includes token attributes and additional details.
-
field string
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
-
filter array
Array of token filters to apply after the tokenizer.
-
normalizer string
Normalizer to use to convert text into a single token.
text string | array[string]
curl \
--request POST http://api.example.com/{index}/_analyze \
--header "Content-Type: application/json" \
--data '{"text":"this is a test","analyzer":"standard"}'
{
"text": "this is a test",
"analyzer": "standard"
}
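A tokenizer and token filters can also be supplied inline instead of a named analyzer; a minimal sketch (the text is illustrative):
curl \
--request POST http://api.example.com/_analyze \
--header "Content-Type: application/json" \
--data '{"tokenizer":"standard","filter":["lowercase"],"text":"this is a TEST"}'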
Check indices exist
Path parameters
-
Comma-separated list of data streams, indices, and aliases. Supports wildcards (`*`).
Query parameters
-
allow_no_indices boolean
If `false`, the request returns an error if any wildcard expression, index alias, or `_all` value targets only missing or closed indices. This behavior applies even if the request targets other open indices.
-
expand_wildcards string | array[string]
Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as `open,hidden`. Valid values are: `all`, `open`, `closed`, `hidden`, `none`.
-
flat_settings boolean
If `true`, returns settings in flat format.
-
include_defaults boolean
If `true`, return all default settings in the response.
-
local boolean
If `true`, the request retrieves information from the local node only.
curl \
--request HEAD http://api.example.com/{index}
Create or update an alias
Path parameters
-
Comma-separated list of data streams or indices to add. Supports wildcards (`*`). Wildcard patterns that match both data streams and indices return an error.
-
Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.
Query parameters
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout string
Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Body
-
filter object
An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.
Additional properties are allowed.
-
index_routing string
-
is_write_index boolean
If `true`, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and `is_write_index` isn’t set, the alias rejects write requests. If an index alias points to one index and `is_write_index` isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.
-
routing string
-
search_routing string
curl \
--request PUT http://api.example.com/{index}/_alias/{name} \
--header "Content-Type: application/json" \
--data '{"filter":{},"index_routing":"string","is_write_index":true,"routing":"string","search_routing":"string"}'
Create or update an alias
Path parameters
-
Comma-separated list of data streams or indices to add. Supports wildcards (`*`). Wildcard patterns that match both data streams and indices return an error.
-
Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.
Query parameters
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout string
Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Body
-
filter object
An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.
Additional properties are allowed.
-
index_routing string
-
is_write_index boolean
If `true`, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and `is_write_index` isn’t set, the alias rejects write requests. If an index alias points to one index and `is_write_index` isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.
-
routing string
-
search_routing string
curl \
--request POST http://api.example.com/{index}/_aliases/{name} \
--header "Content-Type: application/json" \
--data '{"filter":{},"index_routing":"string","is_write_index":true,"routing":"string","search_routing":"string"}'
Delete an alias
Path parameters
-
Comma-separated list of data streams or indices used to limit the request. Supports wildcards (`*`).
-
Comma-separated list of aliases to remove. Supports wildcards (`*`). To remove all aliases, use `*` or `_all`.
Query parameters
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout string
Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request DELETE http://api.example.com/{index}/_aliases/{name}
Get index templates Added in 7.9.0
Get information about one or more index templates.
Path parameters
-
Comma-separated list of index template names used to limit the request. Wildcard (*) expressions are supported.
Query parameters
-
local boolean
If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.
-
flat_settings boolean
If true, returns settings in flat format.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
include_defaults boolean
If true, returns all relevant default configurations for the index template.
curl \
--request GET http://api.example.com/_index_template/{name}
Get index templates Added in 7.9.0
Get information about one or more index templates.
Query parameters
-
local boolean
If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.
-
flat_settings boolean
If true, returns settings in flat format.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
include_defaults boolean
If true, returns all relevant default configurations for the index template.
curl \
--request GET http://api.example.com/_index_template
Get index settings
Get setting information for one or more indices. For data streams, it returns setting information for the stream's backing indices.
Path parameters
-
Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (`*`). To target all data streams and indices, omit this parameter or use `*` or `_all`.
Query parameters
-
allow_no_indices boolean
If `false`, the request returns an error if any wildcard expression, index alias, or `_all` value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting `foo*,bar*` returns an error if an index starts with `foo` but no index starts with `bar`.
-
expand_wildcards string | array[string]
Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as `open,hidden`.
-
flat_settings boolean
If `true`, returns settings in flat format.
-
include_defaults boolean
If `true`, return all default settings in the response.
-
local boolean
If `true`, the request retrieves information from the local node only. If `false`, information is retrieved from the master node.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request GET http://api.example.com/{index}/_settings
Update index settings
Changes dynamic index settings in real time. For data streams, index setting changes are applied to all backing indices by default.
To revert a setting to the default value, use a null value.
The list of per-index settings that can be updated dynamically on live indices can be found in index module documentation.
To preserve existing settings from being updated, set the `preserve_existing` parameter to `true`.
NOTE: You can only define new analyzers on closed indices. To add an analyzer, you must close the index, define the analyzer, and reopen the index. You cannot close the write index of a data stream. To update the analyzer for a data stream's write index and future backing indices, update the analyzer in the index template used by the stream. Then roll over the data stream to apply the new analyzer to the stream's write index and future backing indices. This affects searches and any new data added to the stream after the rollover. However, it does not affect the data stream's backing indices or their existing data. To change the analyzer for existing backing indices, you must create a new data stream and reindex your data into it.
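As a minimal sketch of that close-define-reopen sequence (the index and analyzer names are illustrative):
curl --request POST http://api.example.com/my-index-000001/_close
curl \
--request PUT http://api.example.com/my-index-000001/_settings \
--header "Content-Type: application/json" \
--data '{"analysis":{"analyzer":{"content":{"type":"custom","tokenizer":"whitespace"}}}}'
curl --request POST http://api.example.com/my-index-000001/_open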
Path parameters
-
Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (`*`). To target all data streams and indices, omit this parameter or use `*` or `_all`.
Query parameters
-
allow_no_indices boolean
If `false`, the request returns an error if any wildcard expression, index alias, or `_all` value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting `foo*,bar*` returns an error if an index starts with `foo` but no index starts with `bar`.
-
expand_wildcards string | array[string]
Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as `open,hidden`.
-
flat_settings boolean
If `true`, returns settings in flat format.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
preserve_existing boolean
If `true`, existing index settings remain unchanged.
-
timeout string
Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Body Required
-
Additional properties are allowed.
-
mode string
routing_path string | array[string]
-
soft_deletes object
Additional properties are allowed.
-
sort object
Additional properties are allowed.
number_of_shards number | string
number_of_replicas number | string
-
number_of_routing_shards number
-
check_on_startup string
Values are `true`, `false`, or `checksum`.
-
codec string
routing_partition_size number | string
Some APIs will return values such as numbers also as a string (notably epoch timestamps). This union type captures that behavior while keeping the semantics of the field type.
Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.
-
auto_expand_replicas string
-
merge object
Additional properties are allowed.
-
search object
Additional properties are allowed.
-
refresh_interval string
A duration. Units can be `nanos`, `micros`, `ms` (milliseconds), `s` (seconds), `m` (minutes), `h` (hours) and `d` (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.
-
max_result_window number
-
max_inner_result_window number
-
max_rescore_window number
-
max_docvalue_fields_search number
-
max_script_fields number
-
max_ngram_diff number
-
max_shingle_diff number
-
blocks object
Additional properties are allowed.
-
max_refresh_listeners number
-
analyze object
Additional properties are allowed.
-
highlight object
Additional properties are allowed.
-
max_terms_count number
-
max_regex_length number
-
routing object
Additional properties are allowed.
-
gc_deletes string
A duration. Units can be `nanos`, `micros`, `ms` (milliseconds), `s` (seconds), `m` (minutes), `h` (hours) and `d` (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.
-
default_pipeline string
-
final_pipeline string
-
lifecycle object
Additional properties are allowed.
-
provided_name string
creation_date number | string
Some APIs will return values such as numbers also as a string (notably epoch timestamps). This union type captures that behavior while keeping the semantics of the field type.
Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.
One of: Time unit for milliseconds
creation_date_string string | number
A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.
-
uuid string
-
version object
Additional properties are allowed.
verified_before_close boolean | string
format string | number
-
max_slices_per_scroll number
-
translog object
Additional properties are allowed.
-
query_string object
Additional properties are allowed.
priority number | string
-
top_metrics_max_size number
-
analysis object
Additional properties are allowed.
-
Additional properties are allowed.
-
time_series object
Additional properties are allowed.
-
queries object
Additional properties are allowed.
-
similarity object
Configure custom similarity settings to customize how search results are scored.
-
mapping object
Additional properties are allowed.
-
indexing.slowlog object
Additional properties are allowed.
-
indexing_pressure object
Additional properties are allowed.
-
store object
Additional properties are allowed.
curl \
--request PUT http://api.example.com/{index}/_settings \
--header "Content-Type: application/json" \
--data '"{\n \"index\" : {\n \"number_of_replicas\" : 2\n }\n}"'
{
"index" : {
"number_of_replicas" : 2
}
}
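To revert a setting to its default, as noted above, set it to `null`; a minimal sketch (the index name is illustrative):
curl \
--request PUT http://api.example.com/my-index-000001/_settings \
--header "Content-Type: application/json" \
--data '{"index":{"refresh_interval":null}}'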
Get index settings
Get setting information for one or more indices. For data streams, it returns setting information for the stream's backing indices.
Path parameters
-
Comma-separated list or wildcard expression of settings to retrieve.
Query parameters
-
allow_no_indices boolean
If `false`, the request returns an error if any wildcard expression, index alias, or `_all` value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting `foo*,bar*` returns an error if an index starts with `foo` but no index starts with `bar`.
-
expand_wildcards string | array[string]
Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as `open,hidden`.
-
flat_settings boolean
If `true`, returns settings in flat format.
-
include_defaults boolean
If `true`, return all default settings in the response.
-
local boolean
If `true`, the request retrieves information from the local node only. If `false`, information is retrieved from the master node.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request GET http://api.example.com/_settings/{name}
Roll over to a new index Added in 5.0.0
TIP: It is recommended to use the index lifecycle rollover action to automate rollovers.
The rollover API creates a new index for a data stream or index alias. The API behavior depends on the rollover target.
Roll over a data stream
If you roll over a data stream, the API creates a new write index for the stream. The stream's previous write index becomes a regular backing index. A rollover also increments the data stream's generation.
Roll over an index alias with a write index
TIP: Prior to Elasticsearch 7.9, you'd typically use an index alias with a write index to manage time series data. Data streams replace this functionality, require less maintenance, and automatically integrate with data tiers.
If an index alias points to multiple indices, one of the indices must be a write index.
The rollover API creates a new write index for the alias with `is_write_index` set to `true`.
The API also sets `is_write_index` to `false` for the previous write index.
Roll over an index alias with one index
If you roll over an index alias that points to only one index, the API creates a new index for the alias and removes the original index from the alias.
NOTE: A rollover creates a new index and is subject to the `wait_for_active_shards` setting.
Increment index names for an alias
When you roll over an index alias, you can specify a name for the new index.
If you don't specify a name and the current index ends with `-` and a number, such as `my-index-000001` or `my-index-3`, the new index name increments that number.
For example, if you roll over an alias with a current index of `my-index-000001`, the rollover creates a new index named `my-index-000002`.
This number is always six characters and zero-padded, regardless of the previous index's name.
If you use an index alias for time series data, you can use date math in the index name to track the rollover date.
For example, you can create an alias that points to an index named `<my-index-{now/d}-000001>`.
If you create the index on May 6, 2099, the index's name is `my-index-2099.05.06-000001`.
If you roll over the alias on May 7, 2099, the new index's name is `my-index-2099.05.07-000002`.
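A minimal sketch of creating such an index (the date-math name must be URI-encoded in the request path; the alias name is illustrative):
curl \
--request PUT "http://api.example.com/%3Cmy-index-%7Bnow%2Fd%7D-000001%3E" \
--header "Content-Type: application/json" \
--data '{"aliases":{"my-alias":{"is_write_index":true}}}'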
Path parameters
-
Name of the data stream or index alias to roll over.
Query parameters
-
dry_run boolean
If `true`, checks whether the current index satisfies the specified conditions but does not perform a rollover.
-
master_timeout string
Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout string
Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
-
wait_for_active_shards number | string
The number of shard copies that must be active before proceeding with the operation. Set to `all` or any positive integer up to the total number of shards in the index (`number_of_replicas+1`).
Body
-
aliases object
Aliases for the target index. Data streams do not support this parameter.
-
conditions object
Additional properties are allowed.
-
mappings object
Additional properties are allowed.
-
settings object
Configuration options for the index. Data streams do not support this parameter.
curl \
--request POST http://api.example.com/{alias}/_rollover \
--header "Content-Type: application/json" \
--data '"{\n \"conditions\": {\n \"max_age\": \"7d\",\n \"max_docs\": 1000,\n \"max_primary_shard_size\": \"50gb\",\n \"max_primary_shard_docs\": \"2000\"\n }\n}"'
{
"conditions": {
"max_age": "7d",
"max_docs": 1000,
"max_primary_shard_size": "50gb",
"max_primary_shard_docs": "2000"
}
}
{
"_shards": {},
"indices": {
"test": {
"shards": {
"0": [
{
"routing": {
"node": "zDC_RorJQCao9xf9pg3Fvw",
"state": "STARTED",
"primary": true
},
"segments": {
"_0": {
"search": true,
"version": "7.0.0",
"compound": true,
"num_docs": 1,
"committed": false,
"attributes": {},
"generation": 0,
"deleted_docs": 0,
"size_in_bytes": 3800
}
},
"num_search_segments": 1,
"num_committed_segments": 0
}
]
}
}
}
}
Validate a query
Query parameters
-
allow_no_indices boolean
If `false`, the request returns an error if any wildcard expression, index alias, or `_all` value targets only missing or closed indices. This behavior applies even if the request targets other open indices.
-
all_shards boolean
If `true`, the validation is executed on all shards instead of one random shard per index.
-
analyzer string
Analyzer to use for the query string. This parameter can only be used when the `q` query string parameter is specified.
-
analyze_wildcard boolean
If `true`, wildcard and prefix queries are analyzed.
-
default_operator string
The default operator for query string query: `AND` or `OR`.
Values are `and`, `AND`, `or`, or `OR`.
-
df string
Field to use as default where no field prefix is given in the query string. This parameter can only be used when the `q` query string parameter is specified.
-
expand_wildcards string | array[string]
Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as `open,hidden`. Valid values are: `all`, `open`, `closed`, `hidden`, `none`.
-
explain boolean
If `true`, the response returns detailed information if an error has occurred.
-
lenient boolean
If `true`, format-based query failures (such as providing text to a numeric field) in the query string will be ignored.
-
rewrite boolean
If `true`, returns a more detailed explanation showing the actual Lucene query that will be executed.
-
q string
Query in the Lucene query string syntax.
curl \
--request POST http://api.example.com/_validate/query \
--header "Content-Type: application/json" \
--data '{"query":{}}'
Perform inference on the service using the Unified Schema Added in 8.18.0
Path parameters
-
The task type
Values are `sparse_embedding`, `text_embedding`, `rerank`, or `completion`.
-
The inference ID
Query parameters
-
timeout string
Specifies the amount of time to wait for the inference request to complete.
Body
-
A list of objects representing the conversation.
-
model string
The ID of the model to use.
-
max_completion_tokens number
The upper bound limit for the number of tokens that can be generated for a completion request.
-
stop array[string]
A sequence of strings to control when the model should stop generating additional tokens.
-
temperature number
The sampling temperature to use.
-
tools array[object]
A list of tools that the model can call.
-
top_p number
Nucleus sampling, an alternative to sampling with temperature.
curl \
--request POST http://api.example.com/_inference/{task_type}/{inference_id}/_unified \
--header "Content-Type: application/json" \
--data '{"messages":[{"":"string","role":"string","tool_call_id":"string","tool_calls":[{"id":"string","function":{"arguments":"string","name":"string"},"type":"string"}]}],"model":"string","max_completion_tokens":42.0,"stop":["string"],"temperature":42.0,"":"string","tools":[{"type":"string","function":{"description":"string","name":"string","parameters":{},"strict":true}}],"top_p":42.0}'
Ingest
Ingest APIs enable you to manage tasks and resources related to ingest pipelines and processors.