Get shard recovery information

GET /_cat/recovery

Get information about ongoing and completed shard recoveries. Shard recovery is the process of initializing a shard copy, such as restoring a primary shard from a snapshot or syncing a replica shard from a primary shard. When a shard recovery completes, the recovered shard is available for search and indexing. For data streams, the API returns information about the stream’s backing indices.

IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the index recovery API.

Query parameters

  • active_only boolean

    If true, the response only includes ongoing shard recoveries.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • detailed boolean

    If true, the response includes detailed information about shard recoveries.

  • index string | array[string]

    Comma-separated list or wildcard expression of index names used to limit the returned information.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.
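
The query parameters above can be combined in a single request. The following is a minimal sketch, not an official client: the helper name `cat_recovery_url` is hypothetical, the base URL is the placeholder host used in this page's curl examples, and the column names are taken from the response examples on this page. It only builds the URL and does not send a request.

```python
from urllib.parse import urlencode


def cat_recovery_url(base_url, **params):
    """Build a GET /_cat/recovery URL from the query parameters listed above.

    Hypothetical helper for illustration; it is not part of any client library.
    """
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # Boolean parameters are sent as lowercase true/false.
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # List parameters (index, h, s) are sent comma-separated.
            value = ",".join(value)
        query[key] = value
    url = base_url.rstrip("/") + "/_cat/recovery"
    if not query:
        return url
    # safe=",:*" keeps column lists, :asc/:desc suffixes and wildcards readable.
    return url + "?" + urlencode(query, safe=",:*")


print(cat_recovery_url(
    "http://api.example.com",
    active_only=True,
    bytes="mb",
    h=["index", "shard", "stage", "bytes_percent"],
    s="index:desc",
))
```

The `s="index:desc"` argument uses the `:desc` suffix described for the `s` parameter to reverse the default ascending sort.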

Responses

GET /_cat/recovery
curl \
 --request GET 'http://api.example.com/_cat/recovery' \
 --header "Authorization: $API_KEY"
A successful response from `GET _cat/recovery?v=true&format=json`. In this example, the source and target nodes are the same because the recovery type is `store`, meaning the shard was read from local storage on node start.
[
  {
    "index": "my-index-000001",
    "shard": "0",
    "time": "13ms",
    "type": "store",
    "stage": "done",
    "source_host": "n/a",
    "source_node": "n/a",
    "target_host": "127.0.0.1",
    "target_node": "node-0",
    "repository": "n/a",
    "snapshot": "n/a",
    "files": "0",
    "files_recovered": "0",
    "files_percent": "100.0%",
    "files_total": "13",
    "bytes": "0b",
    "bytes_recovered": "0b",
    "bytes_percent": "100.0%",
    "bytes_total": "9928b",
    "translog_ops": "0",
    "translog_ops_recovered": "0",
    "translog_ops_percent": "100.0%"
  }
]
A successful response from `GET _cat/recovery?v=true&h=i,s,t,ty,st,shost,thost,f,fp,b,bp&format=json`. You can retrieve information about an ongoing recovery, for example, when you increase the replica count of an index and bring another node online to host the replicas. In this example, the recovery type is `peer`, meaning the shard recovered from another node. The `files` and `bytes` values are real-time measurements.
[
  {
    "i": "my-index-000001",
    "s": "0",
    "t": "1252ms",
    "ty": "peer",
    "st": "done",
    "shost": "192.168.1.1",
    "thost": "192.168.1.1",
    "f": "0",
    "fp": "100.0%",
    "b": "0b",
    "bp": "100.0%"
  }
]
A successful response from `GET _cat/recovery?v=true&h=i,s,t,ty,st,rep,snap,f,fp,b,bp&format=json`. You can restore backups of an index using the snapshot and restore API. You can use the cat recovery API to get information about a snapshot recovery.
[
  {
    "i": "my-index-000001",
    "s": "0",
    "t": "1978ms",
    "ty": "snapshot",
    "st": "done",
    "rep": "my-repo",
    "snap": "snap-1",
    "f": "79",
    "fp": "8.0%",
    "b": "12086",
    "bp": "9.0%"
  }
]
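
For ad hoc inspection, the percentage columns in these responses can be used to pick out recoveries that have not finished copying files or bytes. The sketch below assumes the short column aliases (`i`, `s`, `ty`, `fp`, `bp`) requested in the examples above; the function names are hypothetical, and the sample rows are trimmed copies of the example responses. For application use, this page recommends the index recovery API instead.

```python
import json


def percent(value):
    """Convert a cat-style percentage string such as '8.0%' to a float."""
    return float(value.rstrip("%"))


def incomplete_recoveries(rows, files_key="fp", bytes_key="bp"):
    """Return rows whose file or byte progress is below 100 percent."""
    return [
        row for row in rows
        if percent(row[files_key]) < 100.0 or percent(row[bytes_key]) < 100.0
    ]


# Sample rows trimmed from the peer and snapshot examples above.
sample = json.loads("""
[
  {"i": "my-index-000001", "s": "0", "ty": "peer", "fp": "100.0%", "bp": "100.0%"},
  {"i": "my-index-000001", "s": "0", "ty": "snapshot", "fp": "8.0%", "bp": "9.0%"}
]
""")

for row in incomplete_recoveries(sample):
    print(row["i"], row["s"], row["ty"], row["fp"], row["bp"])
```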

Get node information Added in 1.3.0

GET /_nodes

By default, the API returns all attributes and core settings for cluster nodes.

Query parameters

  • flat_settings boolean

    If true, returns settings in flat format.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_nodes
curl \
 --request GET 'http://api.example.com/_nodes' \
 --header "Authorization: $API_KEY"
Response examples (200)
An abbreviated response when requesting cluster nodes information.
{
    "_nodes": {},
    "cluster_name": "elasticsearch",
    "nodes": {
      "USpTGYaBSIKbgSUJR2Z9lg": {
        "name": "node-0",
        "transport_address": "192.168.17:9300",
        "host": "node-0.elastic.co",
        "ip": "192.168.17",
        "version": "{version}",
        "transport_version": 100000298,
        "index_version": 100000074,
        "component_versions": {
          "ml_config_version": 100000162,
          "transform_config_version": 100000096
        },
        "build_flavor": "default",
        "build_type": "{build_type}",
        "build_hash": "587409e",
        "roles": [
          "master",
          "data",
          "ingest"
        ],
        "attributes": {},
        "plugins": [
          {
            "name": "analysis-icu",
            "version": "{version}",
            "description": "The ICU Analysis plugin integrates Lucene ICU module into elasticsearch, adding ICU relates analysis components.",
            "classname": "org.elasticsearch.plugin.analysis.icu.AnalysisICUPlugin",
            "has_native_controller": false
          }
        ],
        "modules": [
          {
            "name": "lang-painless",
            "version": "{version}",
            "description": "An easy, safe and fast scripting language for Elasticsearch",
            "classname": "org.elasticsearch.painless.PainlessPlugin",
            "has_native_controller": false
          }
        ]
      }
    }
}
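
A response shaped like the one above can be summarized by role. This is a minimal sketch under the assumption that each entry under `nodes` carries the `name` and `roles` fields shown in the abbreviated example; the function name `nodes_by_role` is hypothetical.

```python
def nodes_by_role(response):
    """Map each role to a sorted list of the node names that hold it."""
    result = {}
    for node in response.get("nodes", {}).values():
        for role in node.get("roles", []):
            result.setdefault(role, []).append(node["name"])
    return {role: sorted(names) for role, names in result.items()}


# Trimmed copy of the abbreviated response above.
sample = {
    "cluster_name": "elasticsearch",
    "nodes": {
        "USpTGYaBSIKbgSUJR2Z9lg": {
            "name": "node-0",
            "roles": ["master", "data", "ingest"],
        }
    },
}

print(nodes_by_role(sample))
```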

Get node statistics

GET /_nodes/stats/{metric}/{index_metric}

Get statistics for nodes in a cluster. By default, all stats are returned. You can limit the returned information by using metrics.

Path parameters

  • metric string | array[string] Required

    Limit the information returned to the specified metrics.

  • index_metric string | array[string] Required

    Limit the information returned for the indices metric to the specified index metrics. This can be used only if the indices (or all) metric is specified.

Query parameters

  • completion_fields string | array[string]

    Comma-separated list or wildcard expressions of fields to include in fielddata and suggest statistics.

  • fielddata_fields string | array[string]

    Comma-separated list or wildcard expressions of fields to include in fielddata statistics.

  • fields string | array[string]

    Comma-separated list or wildcard expressions of fields to include in the statistics.

  • groups boolean

    Comma-separated list of search groups to include in the search statistics.

  • include_segment_file_sizes boolean

    If true, the call reports the aggregated disk usage of each one of the Lucene index files (only applies if segment stats are requested).

  • level string

    Indicates whether statistics are aggregated at the cluster, index, or shard level.

    Values are cluster, indices, or shards.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • types array[string]

    A comma-separated list of document types for the indexing index metric.

  • include_unloaded_segments boolean

    If true, the response includes information from segments that are not loaded into memory.
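
The two path parameters are both comma-separated lists, and `index_metric` only makes sense alongside the indices metric. The sketch below builds the request path under those rules. The function name is hypothetical, the metric names (`indices`, `jvm`, `docs`, `search`) are taken from field names in the response on this page, and treating `_all` as the spelling of the catch-all metric is an assumption.

```python
def _join(value):
    """Accept one name or a list of names and return a comma-separated string."""
    return value if isinstance(value, str) else ",".join(value)


def nodes_stats_path(metric=None, index_metric=None):
    """Build the path for GET /_nodes/stats/{metric}/{index_metric}."""
    path = "/_nodes/stats"
    if index_metric and not metric:
        raise ValueError("index_metric requires a metric")
    if metric:
        metrics = _join(metric)
        path += "/" + metrics
        if index_metric:
            # Assumption: the catch-all metric is spelled "_all".
            if not {"indices", "_all"} & set(metrics.split(",")):
                raise ValueError("index_metric requires the indices metric")
            path += "/" + _join(index_metric)
    return path


print(nodes_stats_path(["indices", "jvm"], ["docs", "search"]))
```

Validating the combination on the client side is a design choice for this sketch; it surfaces a mistake before any request is sent.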

Responses

  • 200 application/json
    • _nodes object
      • failures array[object]
      • total number Required

        Total number of nodes selected by the request.

      • successful number Required

        Number of nodes that responded successfully to the request.

      • failed number Required

        Number of nodes that rejected the request or failed to respond. If this value is not 0, a reason for the rejection or failure is included in the response.

    • nodes object Required
      • * object Additional properties
        • adaptive_selection object

          Statistics about adaptive replica selection.

          • * object Additional properties
            • avg_queue_size number

              The exponentially weighted moving average queue size of search requests on the keyed node.

            • avg_response_time string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • avg_response_time_ns number

              The exponentially weighted moving average response time, in nanoseconds, of search requests on the keyed node.

            • avg_service_time string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • avg_service_time_ns number

              The exponentially weighted moving average service time, in nanoseconds, of search requests on the keyed node.

            • outgoing_searches number

              The number of outstanding search requests to the keyed node from the node these stats are for.

            • rank string

              The rank of this node; used for shard selection when routing search requests.

        • breakers object

          Statistics about the field data circuit breaker.

          • * object Additional properties
            • estimated_size string

              Estimated memory used for the operation.

            • estimated_size_in_bytes number

              Estimated memory used, in bytes, for the operation.

            • limit_size string

              Memory limit for the circuit breaker.

            • limit_size_in_bytes number

              Memory limit, in bytes, for the circuit breaker.

            • overhead number

              A constant that all estimates for the circuit breaker are multiplied by to calculate a final estimate.

            • tripped number

              Total number of times the circuit breaker has been triggered and prevented an out of memory error.

        • fs object
          • data array[object]

            List of all file stores.

          • timestamp number

            Last time the file store statistics were refreshed. Recorded in milliseconds since the Unix Epoch.

          • total object
            • available string

              Total disk space available to this Java virtual machine on all file stores. Depending on OS or process level restrictions, this might appear less than free. This is the actual amount of free disk space the Elasticsearch node can utilise.

            • available_in_bytes number

              Total number of bytes available to this Java virtual machine on all file stores. Depending on OS or process level restrictions, this might appear less than free_in_bytes. This is the actual amount of free disk space the Elasticsearch node can utilise.

            • free string

              Total unallocated disk space in all file stores.

            • free_in_bytes number

              Total number of unallocated bytes in all file stores.

            • total string

              Total size of all file stores.

            • total_in_bytes number

              Total size of all file stores in bytes.

          • io_stats object
            • devices array[object]

              Array of disk metrics for each device that is backing an Elasticsearch data path. These disk metrics are probed periodically and averages between the last probe and the current probe are computed.

            • total object
        • host string
        • http object
          • current_open number

            Current number of open HTTP connections for the node.

          • total_opened number

            Total number of HTTP connections opened for the node.

          • clients array[object]

            Information on current and recently-closed HTTP client connections. Clients that have been closed longer than the http.client_stats.closed_channels.max_age setting will not be represented here.

          • routes object Required Added in 8.12.0

            Detailed HTTP stats broken down by route.

            • * object Additional properties
        • ingest object
          • pipelines object

            Contains statistics about ingest pipelines for the node.

            • * object Additional properties
          • total object
            • count number Required

              Total number of documents ingested during the lifetime of this node.

            • current number Required

              Total number of documents currently being ingested.

            • failed number Required

              Total number of failed ingest operations during the lifetime of this node.

        • ip string | array[string]

          IP address and port for the node.

        • jvm object
          • buffer_pools object

            Contains statistics about JVM buffer pools for the node.

            • * object Additional properties
          • classes object
          • gc object
            • collectors object

              Contains statistics about JVM garbage collectors for the node.

          • mem object
          • threads object
            • count number

              Number of active threads in use by JVM.

            • peak_count number

              Highest number of threads used by JVM.

          • timestamp number

            Last time JVM statistics were refreshed.

          • uptime string

            Human-readable JVM uptime. Only returned if the human query parameter is true.

          • uptime_in_millis number

            JVM uptime in milliseconds.

        • name string
        • os object
          • cpu object
            • percent number
            • sys string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • total string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • user string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • swap object
          • cgroup object
        • process object
          • cpu object
            • percent number
            • sys string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • total string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • user string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • mem object
          • open_file_descriptors number

            Number of opened file descriptors associated with the current process, or -1 if not supported.

          • max_file_descriptors number

            Maximum number of file descriptors allowed on the system, or -1 if not supported.

          • timestamp number

            Last time the statistics were refreshed. Recorded in milliseconds since the Unix Epoch.

        • roles array[string]

          Values are master, data, data_cold, data_content, data_frozen, data_hot, data_warm, client, ingest, ml, voting_only, transform, remote_cluster_client, or coordinating_only.

        • script object
          • cache_evictions number

            Total number of times the script cache has evicted old data.

          • compilations number

            Total number of inline script compilations performed by the node.

          • compilations_history object

            Contains the recent history of script compilations.

            • * number Additional properties
          • compilation_limit_triggered number

            Total number of times the script compilation circuit breaker has limited inline script compilations.

          • contexts array[object]
        • thread_pool object

          Statistics about each thread pool, including current size, queue and rejected tasks.

          • * object Additional properties
            • active number

              Number of active threads in the thread pool.

            • completed number

              Number of tasks completed by the thread pool executor.

            • largest number

              Highest number of active threads in the thread pool.

            • queue number

              Number of tasks in queue for the thread pool.

            • rejected number

              Number of tasks rejected by the thread pool executor.

            • threads number

              Number of threads in the thread pool.

        • transport object
          • inbound_handling_time_histogram array[object]

            The distribution of the time spent handling each inbound message on a transport thread, represented as a histogram.

          • outbound_handling_time_histogram array[object]

            The distribution of the time spent sending each outbound transport message on a transport thread, represented as a histogram.

          • rx_count number

            Total number of RX (receive) packets received by the node during internal cluster communication.

          • rx_size string

            Size of RX packets received by the node during internal cluster communication.

          • rx_size_in_bytes number

            Size, in bytes, of RX packets received by the node during internal cluster communication.

          • server_open number

            Current number of inbound TCP connections used for internal communication between nodes.

          • tx_count number

            Total number of TX (transmit) packets sent by the node during internal cluster communication.

          • tx_size string

            Size of TX packets sent by the node during internal cluster communication.

          • tx_size_in_bytes number

            Size, in bytes, of TX packets sent by the node during internal cluster communication.

          • total_outbound_connections number

            The cumulative number of outbound transport connections that this node has opened since it started. Each transport connection may comprise multiple TCP connections but is only counted once in this statistic. Transport connections are typically long-lived so this statistic should remain constant in a stable cluster.

        • attributes object

          Contains a list of attributes for the node.

          • * string Additional properties
        • discovery object
          • cluster_state_queue object
            • total number

              Total number of cluster states in queue.

            • pending number

              Number of pending cluster states in queue.

            • committed number

              Number of committed cluster states in queue.

          • published_cluster_states object
          • cluster_state_update object

            Contains low-level statistics about how long various activities took during cluster state updates while the node was the elected master. Omitted if the node is not master-eligible. Every field whose name ends in _time within this object is also represented as a raw number of milliseconds in a field whose name ends in _time_millis. The human-readable fields with a _time suffix are only returned if requested with the ?human=true query parameter.

            • * object Additional properties
          • serialized_cluster_states object
          • cluster_applier_stats object
        • indexing_pressure object
          • memory object
        • indices object
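
Several of the response attributes above lend themselves to simple checks: `_nodes.failed` reports nodes that did not answer, `thread_pool.*.rejected` counts rejected tasks, and `breakers.*.tripped` counts circuit breaker trips. The following is a minimal sketch of such checks on an already-parsed response. The function name `stats_warnings` is hypothetical, and the sample values and the pool and breaker names (`write`, `search`, `parent`) are illustrative assumptions, not output from a real cluster.

```python
def stats_warnings(response):
    """Collect human-readable warnings from a node statistics response."""
    warnings = []
    header = response.get("_nodes", {})
    if header.get("failed", 0):
        warnings.append(
            f"{header['failed']} of {header.get('total')} nodes failed to respond"
        )
    for node_id, node in response.get("nodes", {}).items():
        name = node.get("name", node_id)
        for pool, stats in node.get("thread_pool", {}).items():
            if stats.get("rejected", 0) > 0:
                warnings.append(
                    f"{name}: thread pool '{pool}' rejected {stats['rejected']} tasks"
                )
        for breaker, stats in node.get("breakers", {}).items():
            if stats.get("tripped", 0) > 0:
                warnings.append(
                    f"{name}: breaker '{breaker}' tripped {stats['tripped']} times"
                )
    return warnings


# Illustrative values only; keys follow the response attributes listed above.
sample = {
    "_nodes": {"total": 2, "successful": 1, "failed": 1},
    "nodes": {
        "node-id-1": {
            "name": "node-0",
            "thread_pool": {
                "write": {"queue": 3, "rejected": 5},
                "search": {"queue": 0, "rejected": 0},
            },
            "breakers": {"parent": {"tripped": 0}},
        }
    },
}

for line in stats_warnings(sample):
    print(line)
```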
GET /_nodes/stats/{metric}/{index_metric}
curl \
 --request GET 'http://api.example.com/_nodes/stats/{metric}/{index_metric}' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "_nodes": {
    "failures": [
      {
        "type": "string",
        "reason": "string",
        "stack_trace": "string",
        "caused_by": {},
        "root_cause": [
          {}
        ],
        "suppressed": [
          {}
        ]
      }
    ],
    "total": 42.0,
    "successful": 42.0,
    "failed": 42.0
  },
  "cluster_name": "string",
  "nodes": {
    "additionalProperty1": {
      "adaptive_selection": {
        "additionalProperty1": {
          "avg_queue_size": 42.0,
          "avg_response_time": "string",
          "avg_response_time_ns": 42.0,
          "avg_service_time": "string",
          "avg_service_time_ns": 42.0,
          "outgoing_searches": 42.0,
          "rank": "string"
        },
        "additionalProperty2": {
          "avg_queue_size": 42.0,
          "avg_response_time": "string",
          "avg_response_time_ns": 42.0,
          "avg_service_time": "string",
          "avg_service_time_ns": 42.0,
          "outgoing_searches": 42.0,
          "rank": "string"
        }
      },
      "breakers": {
        "additionalProperty1": {
          "estimated_size": "string",
          "estimated_size_in_bytes": 42.0,
          "limit_size": "string",
          "limit_size_in_bytes": 42.0,
          "overhead": 42.0,
          "tripped": 42.0
        },
        "additionalProperty2": {
          "estimated_size": "string",
          "estimated_size_in_bytes": 42.0,
          "limit_size": "string",
          "limit_size_in_bytes": 42.0,
          "overhead": 42.0,
          "tripped": 42.0
        }
      },
      "fs": {
        "data": [
          {}
        ],
        "timestamp": 42.0,
        "total": {
          "available": "string",
          "available_in_bytes": 42.0,
          "free": "string",
          "free_in_bytes": 42.0,
          "total": "string",
          "total_in_bytes": 42.0
        },
        "io_stats": {
          "devices": [
            {}
          ],
          "total": {}
        }
      },
      "host": "string",
      "http": {
        "current_open": 42.0,
        "total_opened": 42.0,
        "clients": [
          {}
        ],
        "routes": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        }
      },
      "ingest": {
        "pipelines": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "total": {
          "count": 42.0,
          "current": 42.0,
          "failed": 42.0
        }
      },
      "ip": "string",
      "jvm": {
        "buffer_pools": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "classes": {
          "current_loaded_count": 42.0,
          "total_loaded_count": 42.0,
          "total_unloaded_count": 42.0
        },
        "gc": {
          "collectors": {}
        },
        "mem": {
          "heap_used_in_bytes": 42.0,
          "heap_used_percent": 42.0,
          "heap_committed_in_bytes": 42.0,
          "heap_max_in_bytes": 42.0,
          "non_heap_used_in_bytes": 42.0,
          "non_heap_committed_in_bytes": 42.0,
          "pools": {}
        },
        "threads": {
          "count": 42.0,
          "peak_count": 42.0
        },
        "timestamp": 42.0,
        "uptime": "string",
        "uptime_in_millis": 42.0
      },
      "name": "string",
      "os": {
        "cpu": {
          "percent": 42.0,
          "sys": "string",
          "total": "string",
          "user": "string",
          "load_average": {}
        },
        "": {},
        "swap": {
          "adjusted_total_in_bytes": 42.0,
          "resident": "string",
          "resident_in_bytes": 42.0,
          "share": "string",
          "share_in_bytes": 42.0,
          "total_virtual": "string",
          "total_virtual_in_bytes": 42.0,
          "total_in_bytes": 42.0,
          "free_in_bytes": 42.0,
          "used_in_bytes": 42.0
        },
        "cgroup": {
          "cpuacct": {},
          "cpu": {},
          "memory": {}
        },
        "timestamp": 42.0
      },
      "process": {
        "cpu": {
          "percent": 42.0,
          "sys": "string",
          "total": "string",
          "user": "string",
          "load_average": {}
        },
        "mem": {
          "adjusted_total_in_bytes": 42.0,
          "resident": "string",
          "resident_in_bytes": 42.0,
          "share": "string",
          "share_in_bytes": 42.0,
          "total_virtual": "string",
          "total_virtual_in_bytes": 42.0,
          "total_in_bytes": 42.0,
          "free_in_bytes": 42.0,
          "used_in_bytes": 42.0
        },
        "open_file_descriptors": 42.0,
        "max_file_descriptors": 42.0,
        "timestamp": 42.0
      },
      "roles": [
        "master"
      ],
      "script": {
        "cache_evictions": 42.0,
        "compilations": 42.0,
        "compilations_history": {
          "additionalProperty1": 42.0,
          "additionalProperty2": 42.0
        },
        "compilation_limit_triggered": 42.0,
        "contexts": [
          {}
        ]
      },
      "script_cache": {},
      "thread_pool": {
        "additionalProperty1": {
          "active": 42.0,
          "completed": 42.0,
          "largest": 42.0,
          "queue": 42.0,
          "rejected": 42.0,
          "threads": 42.0
        },
        "additionalProperty2": {
          "active": 42.0,
          "completed": 42.0,
          "largest": 42.0,
          "queue": 42.0,
          "rejected": 42.0,
          "threads": 42.0
        }
      },
      "timestamp": 42.0,
      "transport": {
        "inbound_handling_time_histogram": [
          {}
        ],
        "outbound_handling_time_histogram": [
          {}
        ],
        "rx_count": 42.0,
        "rx_size": "string",
        "rx_size_in_bytes": 42.0,
        "server_open": 42.0,
        "tx_count": 42.0,
        "tx_size": "string",
        "tx_size_in_bytes": 42.0,
        "total_outbound_connections": 42.0
      },
      "transport_address": "string",
      "attributes": {
        "additionalProperty1": "string",
        "additionalProperty2": "string"
      },
      "discovery": {
        "cluster_state_queue": {
          "total": 42.0,
          "pending": 42.0,
          "committed": 42.0
        },
        "published_cluster_states": {
          "full_states": 42.0,
          "incompatible_diffs": 42.0,
          "compatible_diffs": 42.0
        },
        "cluster_state_update": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "serialized_cluster_states": {
          "full_states": {},
          "diffs": {}
        },
        "cluster_applier_stats": {
          "recordings": [
            {}
          ]
        }
      },
      "indexing_pressure": {
        "memory": {
          "limit_in_bytes": 42.0,
          "current": {},
          "total": {}
        }
      },
      "indices": {
        "commit": {
          "generation": 42.0,
          "id": "string",
          "num_docs": 42.0,
          "user_data": {}
        },
        "completion": {
          "size_in_bytes": 42.0,
          "fields": {}
        },
        "docs": {
          "count": 42.0,
          "deleted": 42.0
        },
        "fielddata": {
          "evictions": 42.0,
          "memory_size_in_bytes": 42.0,
          "fields": {}
        },
        "flush": {
          "periodic": 42.0,
          "total": 42.0,
          "total_time": "string"
        },
        "get": {
          "current": 42.0,
          "exists_time": "string",
          "exists_total": 42.0,
          "missing_time": "string",
          "missing_total": 42.0,
          "time": "string",
          "total": 42.0
        },
        "indexing": {
          "index_current": 42.0,
          "delete_current": 42.0,
          "delete_time": "string",
          "delete_total": 42.0,
          "is_throttled": true,
          "noop_update_total": 42.0,
          "throttle_time": "string",
          "index_time": "string",
          "index_total": 42.0,
          "index_failed": 42.0,
          "types": {},
          "write_load": 42.0,
          "recent_write_load": 42.0,
          "peak_write_load": 42.0
        },
        "mappings": {
          "total_count": 42.0,
          "total_estimated_overhead_in_bytes": 42.0
        },
        "merges": {
          "current": 42.0,
          "current_docs": 42.0,
          "current_size": "string",
          "current_size_in_bytes": 42.0,
          "total": 42.0,
          "total_auto_throttle": "string",
          "total_auto_throttle_in_bytes": 42.0,
          "total_docs": 42.0,
          "total_size": "string",
          "total_size_in_bytes": 42.0,
          "total_stopped_time": "string",
          "total_throttled_time": "string",
          "total_time": "string"
        },
        "shard_path": {
          "data_path": "string",
          "is_custom_data_path": true,
          "state_path": "string"
        },
        "query_cache": {
          "cache_count": 42.0,
          "cache_size": 42.0,
          "evictions": 42.0,
          "hit_count": 42.0,
          "memory_size_in_bytes": 42.0,
          "miss_count": 42.0,
          "total_count": 42.0
        },
        "recovery": {
          "current_as_source": 42.0,
          "current_as_target": 42.0,
          "throttle_time": "string"
        },
        "refresh": {
          "external_total": 42.0,
          "listeners": 42.0,
          "total": 42.0,
          "total_time": "string"
        },
        "request_cache": {
          "evictions": 42.0,
          "hit_count": 42.0,
          "memory_size": "string",
          "memory_size_in_bytes": 42.0,
          "miss_count": 42.0
        },
        "retention_leases": {
          "primary_term": 42.0,
          "version": 42.0,
          "leases": [
            {}
          ]
        },
        "routing": {
          "node": "string",
          "primary": true,
          "state": "UNASSIGNED"
        },
        "search": {
          "fetch_current": 42.0,
          "fetch_time": "string",
          "fetch_total": 42.0,
          "open_contexts": 42.0,
          "query_current": 42.0,
          "query_time": "string",
          "query_total": 42.0,
          "scroll_current": 42.0,
          "scroll_time": "string",
          "scroll_total": 42.0,
          "suggest_current": 42.0,
          "suggest_time": "string",
          "suggest_total": 42.0,
          "groups": {}
        },
        "segments": {
          "count": 42.0,
          "doc_values_memory_in_bytes": 42.0,
          "file_sizes": {},
          "fixed_bit_set_memory_in_bytes": 42.0,
          "index_writer_max_memory_in_bytes": 42.0,
          "index_writer_memory_in_bytes": 42.0,
          "max_unsafe_auto_id_timestamp": 42.0,
          "memory_in_bytes": 42.0,
          "norms_memory_in_bytes": 42.0,
          "points_memory_in_bytes": 42.0,
          "stored_fields_memory_in_bytes": 42.0,
          "terms_memory_in_bytes": 42.0,
          "term_vectors_memory_in_bytes": 42.0,
          "version_map_memory_in_bytes": 42.0
        },
        "seq_no": {
          "global_checkpoint": 42.0,
          "local_checkpoint": 42.0,
          "max_seq_no": 42.0
        },
        "store": {
          "size_in_bytes": 42.0,
          "reserved_in_bytes": 42.0,
          "total_data_set_size_in_bytes": 42.0
        },
        "translog": {
          "earliest_last_modified_age": 42.0,
          "operations": 42.0,
          "size": "string",
          "size_in_bytes": 42.0,
          "uncommitted_operations": 42.0,
          "uncommitted_size": "string",
          "uncommitted_size_in_bytes": 42.0
        },
        "warmer": {
          "current": 42.0,
          "total": 42.0,
          "total_time": "string"
        },
        "bulk": {
          "total_operations": 42.0,
          "total_time": "string",
          "total_size_in_bytes": 42.0,
          "avg_time": "string",
          "avg_size_in_bytes": 42.0
        },
        "shards": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "shard_stats": {
          "total_count": 42.0
        },
        "additionalProperty1": {
          "primaries": {},
          "shards": {},
          "total": {},
          "uuid": "string",
          "health": "green",
          "status": "open"
        },
        "additionalProperty2": {
          "primaries": {},
          "shards": {},
          "total": {},
          "uuid": "string",
          "health": "green",
          "status": "open"
        }
      }
    },
    "additionalProperty2": {
      "adaptive_selection": {
        "additionalProperty1": {
          "avg_queue_size": 42.0,
          "avg_response_time": "string",
          "avg_response_time_ns": 42.0,
          "avg_service_time": "string",
          "avg_service_time_ns": 42.0,
          "outgoing_searches": 42.0,
          "rank": "string"
        },
        "additionalProperty2": {
          "avg_queue_size": 42.0,
          "avg_response_time": "string",
          "avg_response_time_ns": 42.0,
          "avg_service_time": "string",
          "avg_service_time_ns": 42.0,
          "outgoing_searches": 42.0,
          "rank": "string"
        }
      },
      "breakers": {
        "additionalProperty1": {
          "estimated_size": "string",
          "estimated_size_in_bytes": 42.0,
          "limit_size": "string",
          "limit_size_in_bytes": 42.0,
          "overhead": 42.0,
          "tripped": 42.0
        },
        "additionalProperty2": {
          "estimated_size": "string",
          "estimated_size_in_bytes": 42.0,
          "limit_size": "string",
          "limit_size_in_bytes": 42.0,
          "overhead": 42.0,
          "tripped": 42.0
        }
      },
      "fs": {
        "data": [
          {}
        ],
        "timestamp": 42.0,
        "total": {
          "available": "string",
          "available_in_bytes": 42.0,
          "free": "string",
          "free_in_bytes": 42.0,
          "total": "string",
          "total_in_bytes": 42.0
        },
        "io_stats": {
          "devices": [
            {}
          ],
          "total": {}
        }
      },
      "host": "string",
      "http": {
        "current_open": 42.0,
        "total_opened": 42.0,
        "clients": [
          {}
        ],
        "routes": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        }
      },
      "ingest": {
        "pipelines": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "total": {
          "count": 42.0,
          "current": 42.0,
          "failed": 42.0
        }
      },
      "ip": "string",
      "jvm": {
        "buffer_pools": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "classes": {
          "current_loaded_count": 42.0,
          "total_loaded_count": 42.0,
          "total_unloaded_count": 42.0
        },
        "gc": {
          "collectors": {}
        },
        "mem": {
          "heap_used_in_bytes": 42.0,
          "heap_used_percent": 42.0,
          "heap_committed_in_bytes": 42.0,
          "heap_max_in_bytes": 42.0,
          "non_heap_used_in_bytes": 42.0,
          "non_heap_committed_in_bytes": 42.0,
          "pools": {}
        },
        "threads": {
          "count": 42.0,
          "peak_count": 42.0
        },
        "timestamp": 42.0,
        "uptime": "string",
        "uptime_in_millis": 42.0
      },
      "name": "string",
      "os": {
        "cpu": {
          "percent": 42.0,
          "sys": "string",
          "total": "string",
          "user": "string",
          "load_average": {}
        },
        "": {},
        "swap": {
          "adjusted_total_in_bytes": 42.0,
          "resident": "string",
          "resident_in_bytes": 42.0,
          "share": "string",
          "share_in_bytes": 42.0,
          "total_virtual": "string",
          "total_virtual_in_bytes": 42.0,
          "total_in_bytes": 42.0,
          "free_in_bytes": 42.0,
          "used_in_bytes": 42.0
        },
        "cgroup": {
          "cpuacct": {},
          "cpu": {},
          "memory": {}
        },
        "timestamp": 42.0
      },
      "process": {
        "cpu": {
          "percent": 42.0,
          "sys": "string",
          "total": "string",
          "user": "string",
          "load_average": {}
        },
        "mem": {
          "adjusted_total_in_bytes": 42.0,
          "resident": "string",
          "resident_in_bytes": 42.0,
          "share": "string",
          "share_in_bytes": 42.0,
          "total_virtual": "string",
          "total_virtual_in_bytes": 42.0,
          "total_in_bytes": 42.0,
          "free_in_bytes": 42.0,
          "used_in_bytes": 42.0
        },
        "open_file_descriptors": 42.0,
        "max_file_descriptors": 42.0,
        "timestamp": 42.0
      },
      "roles": [
        "master"
      ],
      "script": {
        "cache_evictions": 42.0,
        "compilations": 42.0,
        "compilations_history": {
          "additionalProperty1": 42.0,
          "additionalProperty2": 42.0
        },
        "compilation_limit_triggered": 42.0,
        "contexts": [
          {}
        ]
      },
      "script_cache": {},
      "thread_pool": {
        "additionalProperty1": {
          "active": 42.0,
          "completed": 42.0,
          "largest": 42.0,
          "queue": 42.0,
          "rejected": 42.0,
          "threads": 42.0
        },
        "additionalProperty2": {
          "active": 42.0,
          "completed": 42.0,
          "largest": 42.0,
          "queue": 42.0,
          "rejected": 42.0,
          "threads": 42.0
        }
      },
      "timestamp": 42.0,
      "transport": {
        "inbound_handling_time_histogram": [
          {}
        ],
        "outbound_handling_time_histogram": [
          {}
        ],
        "rx_count": 42.0,
        "rx_size": "string",
        "rx_size_in_bytes": 42.0,
        "server_open": 42.0,
        "tx_count": 42.0,
        "tx_size": "string",
        "tx_size_in_bytes": 42.0,
        "total_outbound_connections": 42.0
      },
      "transport_address": "string",
      "attributes": {
        "additionalProperty1": "string",
        "additionalProperty2": "string"
      },
      "discovery": {
        "cluster_state_queue": {
          "total": 42.0,
          "pending": 42.0,
          "committed": 42.0
        },
        "published_cluster_states": {
          "full_states": 42.0,
          "incompatible_diffs": 42.0,
          "compatible_diffs": 42.0
        },
        "cluster_state_update": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "serialized_cluster_states": {
          "full_states": {},
          "diffs": {}
        },
        "cluster_applier_stats": {
          "recordings": [
            {}
          ]
        }
      },
      "indexing_pressure": {
        "memory": {
          "limit_in_bytes": 42.0,
          "current": {},
          "total": {}
        }
      },
      "indices": {
        "commit": {
          "generation": 42.0,
          "id": "string",
          "num_docs": 42.0,
          "user_data": {}
        },
        "completion": {
          "size_in_bytes": 42.0,
          "fields": {}
        },
        "docs": {
          "count": 42.0,
          "deleted": 42.0
        },
        "fielddata": {
          "evictions": 42.0,
          "memory_size_in_bytes": 42.0,
          "fields": {}
        },
        "flush": {
          "periodic": 42.0,
          "total": 42.0,
          "total_time": "string"
        },
        "get": {
          "current": 42.0,
          "exists_time": "string",
          "exists_total": 42.0,
          "missing_time": "string",
          "missing_total": 42.0,
          "time": "string",
          "total": 42.0
        },
        "indexing": {
          "index_current": 42.0,
          "delete_current": 42.0,
          "delete_time": "string",
          "delete_total": 42.0,
          "is_throttled": true,
          "noop_update_total": 42.0,
          "throttle_time": "string",
          "index_time": "string",
          "index_total": 42.0,
          "index_failed": 42.0,
          "types": {},
          "write_load": 42.0,
          "recent_write_load": 42.0,
          "peak_write_load": 42.0
        },
        "mappings": {
          "total_count": 42.0,
          "total_estimated_overhead_in_bytes": 42.0
        },
        "merges": {
          "current": 42.0,
          "current_docs": 42.0,
          "current_size": "string",
          "current_size_in_bytes": 42.0,
          "total": 42.0,
          "total_auto_throttle": "string",
          "total_auto_throttle_in_bytes": 42.0,
          "total_docs": 42.0,
          "total_size": "string",
          "total_size_in_bytes": 42.0,
          "total_stopped_time": "string",
          "total_throttled_time": "string",
          "total_time": "string"
        },
        "shard_path": {
          "data_path": "string",
          "is_custom_data_path": true,
          "state_path": "string"
        },
        "query_cache": {
          "cache_count": 42.0,
          "cache_size": 42.0,
          "evictions": 42.0,
          "hit_count": 42.0,
          "memory_size_in_bytes": 42.0,
          "miss_count": 42.0,
          "total_count": 42.0
        },
        "recovery": {
          "current_as_source": 42.0,
          "current_as_target": 42.0,
          "throttle_time": "string"
        },
        "refresh": {
          "external_total": 42.0,
          "listeners": 42.0,
          "total": 42.0,
          "total_time": "string"
        },
        "request_cache": {
          "evictions": 42.0,
          "hit_count": 42.0,
          "memory_size": "string",
          "memory_size_in_bytes": 42.0,
          "miss_count": 42.0
        },
        "retention_leases": {
          "primary_term": 42.0,
          "version": 42.0,
          "leases": [
            {}
          ]
        },
        "routing": {
          "node": "string",
          "primary": true,
          "state": "UNASSIGNED"
        },
        "search": {
          "fetch_current": 42.0,
          "fetch_time": "string",
          "fetch_total": 42.0,
          "open_contexts": 42.0,
          "query_current": 42.0,
          "query_time": "string",
          "query_total": 42.0,
          "scroll_current": 42.0,
          "scroll_time": "string",
          "scroll_total": 42.0,
          "suggest_current": 42.0,
          "suggest_time": "string",
          "suggest_total": 42.0,
          "groups": {}
        },
        "segments": {
          "count": 42.0,
          "doc_values_memory_in_bytes": 42.0,
          "file_sizes": {},
          "fixed_bit_set_memory_in_bytes": 42.0,
          "index_writer_max_memory_in_bytes": 42.0,
          "index_writer_memory_in_bytes": 42.0,
          "max_unsafe_auto_id_timestamp": 42.0,
          "memory_in_bytes": 42.0,
          "norms_memory_in_bytes": 42.0,
          "points_memory_in_bytes": 42.0,
          "stored_fields_memory_in_bytes": 42.0,
          "terms_memory_in_bytes": 42.0,
          "term_vectors_memory_in_bytes": 42.0,
          "version_map_memory_in_bytes": 42.0
        },
        "seq_no": {
          "global_checkpoint": 42.0,
          "local_checkpoint": 42.0,
          "max_seq_no": 42.0
        },
        "store": {
          "size_in_bytes": 42.0,
          "reserved_in_bytes": 42.0,
          "total_data_set_size_in_bytes": 42.0
        },
        "translog": {
          "earliest_last_modified_age": 42.0,
          "operations": 42.0,
          "size": "string",
          "size_in_bytes": 42.0,
          "uncommitted_operations": 42.0,
          "uncommitted_size": "string",
          "uncommitted_size_in_bytes": 42.0
        },
        "warmer": {
          "current": 42.0,
          "total": 42.0,
          "total_time": "string"
        },
        "bulk": {
          "total_operations": 42.0,
          "total_time": "string",
          "total_size_in_bytes": 42.0,
          "avg_time": "string",
          "avg_size_in_bytes": 42.0
        },
        "shards": {
          "additionalProperty1": {},
          "additionalProperty2": {}
        },
        "shard_stats": {
          "total_count": 42.0
        },
        "additionalProperty1": {
          "primaries": {},
          "shards": {},
          "total": {},
          "uuid": "string",
          "health": "green",
          "status": "open"
        },
        "additionalProperty2": {
          "primaries": {},
          "shards": {},
          "total": {},
          "uuid": "string",
          "health": "green",
          "status": "open"
        }
      }
    }
  }
}

Bulk index or delete documents

PUT /{index}/_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target. An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.
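
As a minimal sketch of the format described above, the following Python helper serializes a list of actions into an NDJSON body. The function name `build_bulk_body` and the `(action, source)` tuple shape are assumptions made for this illustration and are not part of any Elasticsearch client; only the Python standard library is used.

```python
import json


def build_bulk_body(operations):
    """Serialize (action_and_meta_data, optional_source) pairs as NDJSON.

    `operations` is a hypothetical input shape chosen for this sketch:
    a list of (action, source) tuples, where source is None for actions
    that take no source line (such as delete).
    """
    lines = []
    for action, source in operations:
        # Without an indent argument, json.dumps keeps each object on one
        # line, so the output is never pretty printed.
        lines.append(json.dumps(action, separators=(",", ":")))
        if source is not None:
            lines.append(json.dumps(source, separators=(",", ":")))
    # Every line, including the final one, ends with a newline character.
    return "".join(line + "\n" for line in lines)


body = build_bulk_body([
    ({"index": {"_index": "test", "_id": "1"}}, {"field1": "value1"}),
    ({"delete": {"_index": "test", "_id": "2"}}, None),
])
```

The resulting string can be sent as the request body with a Content-Type header of application/x-ndjson.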

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. Because some of the actions are redirected to other shards on other nodes, only the action_and_meta_data line is parsed on the receiving node.

Client libraries using this protocol should try to do something similar on the client side and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of an HTTP request to 100mb by default, so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.
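
One way a client might stay under the request size limit is to split its operations into several bulk bodies. The sketch below is only an illustration under stated assumptions: the function name `chunk_operations`, the `(action, source)` tuple shape, and the `max_bytes` parameter are hypothetical, and the caller picks the limit (for example, something comfortably below the configured HTTP request maximum).

```python
import json


def chunk_operations(operations, max_bytes):
    """Yield NDJSON bodies whose UTF-8 size stays at or under max_bytes.

    Raises ValueError if one operation alone is larger than the limit,
    because such a document cannot be sent in any bulk request.
    """
    batch, batch_size = [], 0
    for action, source in operations:
        encoded = json.dumps(action) + "\n"
        if source is not None:
            encoded += json.dumps(source) + "\n"
        size = len(encoded.encode("utf-8"))
        if size > max_bytes:
            raise ValueError("single operation exceeds the size limit")
        # Start a new body when adding this operation would cross the limit.
        if batch and batch_size + size > max_bytes:
            yield "".join(batch)
            batch, batch_size = [], 0
        batch.append(encoded)
        batch_size += size
    if batch:
        yield "".join(batch)


ops = [({"index": {"_id": str(i)}}, {"n": i}) for i in range(5)]
bodies = list(chunk_operations(ops, max_bytes=80))
```

An action line and its source line are always kept in the same body, so each yielded string is a valid bulk payload on its own.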

Client support for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.
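
As a small illustration of where these parameters go, the following Python sketch builds the action and metadata line for a conditional index action. The helper name `conditional_index_action` and the example values are assumptions for this sketch; in practice the sequence number and primary term would come from an earlier read of the document.

```python
import json


def conditional_index_action(index, doc_id, seq_no, primary_term):
    """Build an index action line that carries if_seq_no and
    if_primary_term alongside the usual _index and _id metadata."""
    return {
        "index": {
            "_index": index,
            "_id": doc_id,
            "if_seq_no": seq_no,
            "if_primary_term": primary_term,
        }
    }


# Hypothetical values, chosen only for illustration.
action_line = json.dumps(conditional_index_action("test", "1", 3, 4))
```

The parameters belong in the action line itself, not in the source line that follows it.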

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also supports the version_type parameter.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

Path parameters

  • index string Required

    The name of the data stream, index, or index alias to perform bulk actions on.

Query parameters

  • Whether to include the document source in the error message in case of parsing errors (true or false).

  • If true, the response includes the ingest pipelines that were run for each index or create action.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, Elasticsearch waits for a refresh to make this operation visible to search. If false, Elasticsearch does nothing with refreshes.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of copies of each shard in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

  • If true, the request's actions must target an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • create object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • update object
  • delete object

Responses

  • 200 application/json
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

PUT /{index}/_bulk
curl \
 --request PUT 'http://api.example.com/{index}/_bulk' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data-binary $'{ "index" : { "_index" : "test", "_id" : "1" } }\n{ "field1" : "value1" }\n{ "delete" : { "_index" : "test", "_id" : "2" } }\n{ "create" : { "_index" : "test", "_id" : "3" } }\n{ "field1" : "value3" }\n{ "update" : {"_id" : "1", "_index" : "test"} }\n{ "doc" : {"field2" : "value2"} }\n'
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields, `work_location` and `home_location`, with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules. In this case it becomes a text field, because it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value of `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}
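A client usually needs to pick the failed operations out of a bulk response rather than read the whole payload. The following is a minimal Python sketch, not an official client API: the `failed_operations` helper and the trimmed `sample` payload are illustrative names, and the field names are the ones that appear in the responses above.

```python
import json

def failed_operations(bulk_response: dict) -> list:
    """Return one summary dict per failed item in a bulk response."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item has a single key naming the action (create, update, ...).
        for action, result in item.items():
            error = result.get("error")
            if error is not None:
                failures.append({
                    "position": position,
                    "action": action,
                    "id": result.get("_id"),
                    "status": result.get("status"),
                    "type": error.get("type"),
                    "reason": error.get("reason"),
                })
    return failures

# Trimmed copy of the response shown above (illustrative only).
sample = json.loads("""
{
  "took": 486,
  "errors": true,
  "items": [
    {"update": {"_index": "index1", "_id": "5", "status": 404,
                "error": {"type": "document_missing_exception",
                          "reason": "[5]: document missing"}}},
    {"create": {"_index": "index1", "_id": "7", "status": 201,
                "result": "created"}}
  ]
}
""")

for failure in failed_operations(sample):
    print(failure["position"], failure["action"], failure["id"], failure["type"])
```

When the response was requested with `filter_path=items.*.error`, the `_id` and `status` keys are absent from each item, so this helper reports them as `None`.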

Update field mappings

POST /{index}/_mapping

Add new fields to an existing data stream or index. You can also use this API to change the search settings of existing fields and add new properties to existing object fields. For data streams, these changes are applied to all backing indices by default.

Add multi-fields to an existing field

Multi-fields let you index the same field in different ways. You can use this API to update the fields mapping parameter and enable multi-fields for an existing field. WARNING: If an index (or data stream) contains documents when you add a multi-field, those documents will not have values for the new multi-field. You can populate the new multi-field with the update by query API.
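As a rough illustration of the request body for this case, the Python sketch below builds a mapping update that adds a multi-field through the `fields` parameter. The helper name `add_multi_field` and the field names `city` and `raw` are hypothetical; the field type passed in is assumed to match the type the field already has.

```python
import json

def add_multi_field(field: str, field_type: str, sub_name: str, sub_type: str) -> dict:
    """Build an update-mapping body that adds one multi-field to an existing field."""
    return {
        "properties": {
            field: {
                "type": field_type,  # assumed to be the field's existing type
                "fields": {sub_name: {"type": sub_type}},
            }
        }
    }

# Hypothetical example: add a keyword sub-field "raw" to a text field "city".
body = add_multi_field("city", "text", "raw", "keyword")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of the mapping request for the target index.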

Change supported mapping parameters for an existing field

The documentation for each mapping parameter indicates whether you can update it for an existing field using this API. For example, you can use the update mapping API to update the ignore_above parameter.

Change the mapping of an existing field

Except for supported mapping parameters, you can't change the mapping or field type of an existing field. Changing an existing field could invalidate data that's already indexed.

If you need to change the mapping of a field in a data stream's backing indices, refer to documentation about modifying data streams. If you need to change the mapping of a field in other indices, create a new index with the correct mapping and reindex your data into that index.

Rename a field

Renaming a field would invalidate data already indexed under the old field name. Instead, add an alias field to create an alternate field name.
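A minimal Python sketch of the alias approach follows. It builds a mapping body that adds a field of type `alias` pointing at the existing field through `path`. The helper name and the field names `user_identifier` and `user_id` are hypothetical.

```python
import json

def alias_mapping(alias_name: str, target_path: str) -> dict:
    """Build an update-mapping body that adds an alias for an existing field."""
    return {
        "properties": {
            alias_name: {
                "type": "alias",
                "path": target_path,  # full path of the existing field
            }
        }
    }

# Hypothetical example: expose the existing field "user_identifier" as "user_id".
body = alias_mapping("user_id", "user_identifier")
print(json.dumps(body, indent=2))
```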

Path parameters

  • index string | array[string] Required

    A comma-separated list of index names the mapping should be added to (supports wildcards); use _all or omit to add the mapping on all indices.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, the mappings are applied only to the current write index for the target.

application/json

Body Required

  • Controls whether dynamic date detection is enabled.

  • dynamic string

    Values are strict, runtime, true, or false.

  • If date detection is enabled, new string fields are checked against dynamic_date_formats. If a value matches one of the formats, the field is added as a date field instead of a string field.

  • dynamic_templates array[object]

    Specify dynamic templates for the mapping.

  • _field_names object
  • _meta object
    • * object Additional properties
  • Automatically map strings into numeric data types for all fields.

  • Mapping for a field. For new fields, this mapping can include:

    • Field name
    • Field data type
    • Mapping parameters
  • _routing object
  • _source object
  • runtime object
    • * object Additional properties
      • fields object

        For type composite

        • * object Additional properties
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        • field string Required

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • script object
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

Responses

  • 200 application/json
POST /{index}/_mapping
curl \
 --request POST 'http://api.example.com/{index}/_mapping' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"properties":{"user":{"properties":{"name":{"type":"keyword"}}}}}'
Request example
The update mapping API can be applied to multiple data streams or indices with a single request. For example, run `PUT /my-index-000001,my-index-000002/_mapping` to update mappings for the `my-index-000001` and `my-index-000002` indices at the same time.
{
  "properties": {
    "user": {
      "properties": {
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}
Response examples (200)
{
  "acknowledged": true,
  "_shards": {
    "failed": 42.0,
    "successful": 42.0,
    "total": 42.0,
    "failures": [
      {
        "index": "string",
        "node": "string",
        "reason": {
          "type": "string",
          "reason": "string",
          "stack_trace": "string",
          "caused_by": {},
          "root_cause": [
            {}
          ],
          "suppressed": [
            {}
          ]
        },
        "shard": 42.0,
        "status": "string"
      }
    ],
    "skipped": 42.0
  }
}

Create or update an alias Added in 1.3.0

POST /_aliases

Adds a data stream or index to an alias.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body Required

Responses

  • 200 application/json
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_aliases
curl \
 --request POST 'http://api.example.com/_aliases' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"actions":[{"add":{"alias":"string","aliases":"string","filter":{},"index":"string","indices":"string","index_routing":"string","is_hidden":true,"is_write_index":true,"routing":"string","search_routing":"string","must_exist":true},"remove":{"alias":"string","aliases":"string","index":"string","indices":"string","must_exist":true},"remove_index":{"index":"string","indices":"string","must_exist":true}}]}'
Request examples
{
  "actions": [
    {
      "add": {
        "alias": "string",
        "aliases": "string",
        "filter": {},
        "index": "string",
        "indices": "string",
        "index_routing": "string",
        "is_hidden": true,
        "is_write_index": true,
        "routing": "string",
        "search_routing": "string",
        "must_exist": true
      },
      "remove": {
        "alias": "string",
        "aliases": "string",
        "index": "string",
        "indices": "string",
        "must_exist": true
      },
      "remove_index": {
        "index": "string",
        "indices": "string",
        "must_exist": true
      }
    }
  ]
}
Response examples (200)
{
  "acknowledged": true
}
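The request example above is a schema listing with placeholder values. As a more concrete illustration, the Python sketch below builds an `actions` list that moves an alias from one index to another, with each action in its own object. The helper name and the alias and index names are hypothetical; only the `add`, `remove`, `index`, and `alias` keys come from the schema above.

```python
import json

def swap_alias(alias: str, old_index: str, new_index: str) -> dict:
    """Build a body that removes an alias from one index and adds it to another."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

# Hypothetical names, for illustration only.
body = swap_alias("logs", "logs-000001", "logs-000002")
print(json.dumps(body, indent=2))
```

Keeping both actions in one request means the alias is never left pointing at neither index between two separate calls.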

Create an Azure OpenAI inference endpoint Added in 8.14.0

PUT /_inference/{task_type}/{azureopenai_inference_id}

Create an inference endpoint to perform an inference task with the azureopenai service.

The list of chat completion models that you can choose from in your Azure OpenAI deployment include:

The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
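The readiness check described above can be sketched in Python as follows. This is an illustrative helper, not part of any client library. It assumes the caller has already extracted, from the trained model statistics response, the object that holds `state`, `allocation_count`, and `target_allocation_count`; where that object sits in the response is not shown in this section.

```python
def deployment_ready(allocation_status: dict) -> bool:
    """Return True when the deployment is fully allocated and counts match."""
    count = allocation_status.get("allocation_count")
    target = allocation_status.get("target_allocation_count")
    return (
        allocation_status.get("state") == "fully_allocated"
        and count is not None
        and count == target
    )

# Hypothetical status objects, for illustration only.
print(deployment_ready({"state": "fully_allocated",
                        "allocation_count": 2,
                        "target_allocation_count": 2}))
print(deployment_ready({"state": "starting",
                        "allocation_count": 1,
                        "target_allocation_count": 2}))
```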

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform. NOTE: The chat_completion task type only supports streaming and only through the _stream API.

    Values are completion or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • chunking_settings object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is azureopenai.

  • service_settings object Required
    • api_key string

      A valid API key for your Azure OpenAI account. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

    • api_version string Required

      The Azure API version ID to use. It is recommended to use the latest supported non-preview version.

    • deployment_id string Required

      The deployment name of your deployed models. Your Azure OpenAI deployments can be found through the Azure OpenAI Studio portal that is linked to your subscription.

    • entra_id string

      A valid Microsoft Entra token. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

    • rate_limit object
    • resource_name string Required

      The name of your Azure OpenAI resource. You can find this from the list of resources in the Azure Portal for your subscription.

  • task_settings object
    • user string

      For a completion or text_embedding task, specify the user issuing the request. This information can be used for abuse detection.
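The rules above for `service_settings` (three required settings, and exactly one of `api_key` or `entra_id`) can be checked client-side before sending the request. The Python sketch below is a hypothetical pre-flight check, not how the server validates the body; the function name and messages are illustrative.

```python
REQUIRED_SETTINGS = ("api_version", "deployment_id", "resource_name")

def validate_service_settings(settings: dict) -> list:
    """Return a list of problems found in an azureopenai service_settings object."""
    problems = [
        f"missing required setting: {name}"
        for name in REQUIRED_SETTINGS
        if not settings.get(name)
    ]
    has_api_key = bool(settings.get("api_key"))
    has_entra_id = bool(settings.get("entra_id"))
    # Providing both credentials or neither is an error.
    if has_api_key == has_entra_id:
        problems.append("specify exactly one of api_key or entra_id")
    return problems

# Placeholder values taken from the request examples in this section.
settings = {
    "api_key": "Api-Key",
    "resource_name": "Resource-name",
    "deployment_id": "Deployment-id",
    "api_version": "2024-02-01",
}
print(validate_service_settings(settings))
```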

Responses

  • 200 application/json
    • chunking_settings object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference ID

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.
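The limits stated for `chunking_settings` can be expressed as a small check. The Python sketch below is illustrative only: the function name is hypothetical, the key names `max_chunk_size`, `overlap`, `sentence_overlap`, and `strategy` follow the response example in this section, and it assumes `strategy` is given explicitly because default values are not stated here.

```python
def validate_chunking(settings: dict) -> list:
    """Return a list of problems found in a chunking_settings object."""
    strategy = settings.get("strategy")
    if strategy not in ("sentence", "word"):
        return [f"strategy must be 'sentence' or 'word', got {strategy!r}"]

    problems = []
    size = settings.get("max_chunk_size")
    minimum = 20 if strategy == "sentence" else 10
    if size is not None and not (minimum <= size <= 300):
        problems.append(f"max_chunk_size must be between {minimum} and 300")

    if strategy == "word":
        overlap = settings.get("overlap")
        # overlap may not exceed half of max_chunk_size
        if overlap is not None and size is not None and overlap > size / 2:
            problems.append("overlap cannot be higher than half of max_chunk_size")
    else:
        sentence_overlap = settings.get("sentence_overlap")
        if sentence_overlap is not None and sentence_overlap not in (0, 1):
            problems.append("sentence_overlap must be 0 or 1")
    return problems

print(validate_chunking({"strategy": "word", "max_chunk_size": 100, "overlap": 60}))
```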

PUT /_inference/{task_type}/{azureopenai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{azureopenai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"service":"azureopenai","service_settings":{"api_key":"Api-Key","resource_name":"Resource-name","deployment_id":"Deployment-id","api_version":"2024-02-01"}}'
Request examples
Run `PUT _inference/text_embedding/azure_openai_embeddings` to create an inference endpoint that performs a `text_embedding` task. You do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}
Run `PUT _inference/completion/azure_openai_completion` to create an inference endpoint that performs a `completion` task.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}
Response examples (200)
{
  "chunking_settings": {
    "max_chunk_size": 42.0,
    "overlap": 42.0,
    "sentence_overlap": 42.0,
    "strategy": "string"
  },
  "service": "string",
  "service_settings": {},
  "task_settings": {},
  "inference_id": "string",
  "task_type": "sparse_embedding"
}

Perform text embedding inference on the service Added in 8.11.0

POST /_inference/text_embedding/{inference_id}

Path parameters

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.

application/json

Body

Responses

  • 200 application/json
    • text_embedding_bytes array[object]
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

POST /_inference/text_embedding/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/text_embedding/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"input":"The sky above the port was the color of television tuned to a dead channel.","task_settings":{"input_type":"ingest"}}'
Request example
Run `POST _inference/text_embedding/my-cohere-endpoint` to perform text embedding on the example sentence using the Cohere integration.
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "input_type": "ingest"
  }
}
Response examples (200)
An abbreviated response from `POST _inference/text_embedding/my-cohere-endpoint`.
{
  "text_embedding": [
    {
      "embedding": [
        0.018569946,
        -0.036895752,
        0.01486969,
        -0.0045204163,
        -0.04385376,
        0.0075950623,
        0.04260254,
        -0.004005432,
        0.007865906,
        0.030792236,
        -0.050476074,
        0.011795044,
        -0.011642456,
        -0.010070801
      ]
    }
  ]
}
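According to the response schema above, each entry in `text_embedding` carries an `embedding` array of floats. The Python sketch below pulls those vectors out of a response of that shape. The helper name is hypothetical, and the sample holds only the first three values of the abbreviated response.

```python
def embedding_vectors(response: dict) -> list:
    """Return the float vectors from a text_embedding inference response."""
    return [entry["embedding"] for entry in response.get("text_embedding", [])]

# Shortened sample using the first values of the abbreviated response.
sample = {
    "text_embedding": [
        {"embedding": [0.018569946, -0.036895752, 0.01486969]}
    ]
}

vectors = embedding_vectors(sample)
print(len(vectors), "vector(s) of dimension", len(vectors[0]))
```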

Get machine learning memory usage info Added in 8.2.0

GET /_ml/memory/_stats

Get information about how machine learning jobs and trained models are using memory on each node, both within the JVM heap and natively, outside of the JVM.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_ml/memory/_stats
curl \
 --request GET 'http://api.example.com/_ml/memory/_stats' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "_nodes": {
    "failures": [
      {
        "type": "string",
        "reason": "string",
        "stack_trace": "string",
        "caused_by": {},
        "root_cause": [
          {}
        ],
        "suppressed": [
          {}
        ]
      }
    ],
    "total": 42.0,
    "successful": 42.0,
    "failed": 42.0
  },
  "cluster_name": "string",
  "nodes": {
    "additionalProperty1": {
      "attributes": {
        "additionalProperty1": "string",
        "additionalProperty2": "string"
      },
      "jvm": {
        "": 42.0,
        "heap_max_in_bytes": 42.0,
        "java_inference_in_bytes": 42.0,
        "java_inference_max_in_bytes": 42.0
      },
      "mem": {
        "": 42.0,
        "adjusted_total_in_bytes": 42.0,
        "total_in_bytes": 42.0,
        "ml": {
          "": 42.0,
          "anomaly_detectors_in_bytes": 42.0,
          "data_frame_analytics_in_bytes": 42.0,
          "max_in_bytes": 42.0,
          "native_code_overhead_in_bytes": 42.0,
          "native_inference_in_bytes": 42.0
        }
      },
      "name": "string",
      "roles": [
        "string"
      ],
      "transport_address": "string",
      "ephemeral_id": "string"
    },
    "additionalProperty2": {
      "attributes": {
        "additionalProperty1": "string",
        "additionalProperty2": "string"
      },
      "jvm": {
        "": 42.0,
        "heap_max_in_bytes": 42.0,
        "java_inference_in_bytes": 42.0,
        "java_inference_max_in_bytes": 42.0
      },
      "mem": {
        "": 42.0,
        "adjusted_total_in_bytes": 42.0,
        "total_in_bytes": 42.0,
        "ml": {
          "": 42.0,
          "anomaly_detectors_in_bytes": 42.0,
          "data_frame_analytics_in_bytes": 42.0,
          "max_in_bytes": 42.0,
          "native_code_overhead_in_bytes": 42.0,
          "native_inference_in_bytes": 42.0
        }
      },
      "name": "string",
      "roles": [
        "string"
      ],
      "transport_address": "string",
      "ephemeral_id": "string"
    }
  }
}
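To read a response like the one above, it can help to reduce each node to a few numbers. The Python sketch below is a hypothetical helper, not part of any client. The key names come from the response example, while adding the four per-category values into a single "used" figure is an assumption made here for illustration, not something the API defines.

```python
USAGE_KEYS = (
    "anomaly_detectors_in_bytes",
    "data_frame_analytics_in_bytes",
    "native_code_overhead_in_bytes",
    "native_inference_in_bytes",
)

def ml_memory_by_node(stats: dict) -> dict:
    """Map each node name to its summed ML native usage and its ML memory limit."""
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        ml = node.get("mem", {}).get("ml", {})
        summary[node.get("name", node_id)] = {
            # Assumption: treat the sum of these categories as "used".
            "used_in_bytes": sum(ml.get(key, 0) for key in USAGE_KEYS),
            "max_in_bytes": ml.get("max_in_bytes", 0),
        }
    return summary

# Hypothetical values, for illustration only.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "ml-node-1",
            "mem": {"ml": {
                "anomaly_detectors_in_bytes": 100,
                "data_frame_analytics_in_bytes": 200,
                "native_code_overhead_in_bytes": 50,
                "native_inference_in_bytes": 150,
                "max_in_bytes": 1000,
            }},
        }
    }
}
print(ml_memory_by_node(sample))
```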

Set upgrade_mode for ML indices Added in 6.7.0

POST /_ml/set_upgrade_mode

Sets a cluster-wide upgrade_mode setting that prepares machine learning indices for an upgrade. When upgrading your cluster, in some circumstances you must restart your nodes and reindex your machine learning indices. In those circumstances, there must be no machine learning jobs running. You can close the machine learning jobs, do the upgrade, then open all the jobs again. Alternatively, you can use this API to temporarily halt tasks associated with the jobs and datafeeds and prevent new jobs from opening. You can also use this API during upgrades that do not require you to reindex your machine learning indices, though stopping jobs is not a requirement in that case. You can see the current value for the upgrade_mode setting by using the get machine learning info API.

Query parameters

  • enabled boolean

    When true, it enables upgrade_mode which temporarily halts all job and datafeed tasks and prohibits new job and datafeed tasks from starting.

  • timeout string

    The time to wait for the request to be completed.

Responses

  • 200 application/json
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ml/set_upgrade_mode
curl \
 --request POST 'http://api.example.com/_ml/set_upgrade_mode' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "acknowledged": true
}
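The curl example above sends no query parameters. The Python sketch below shows one way to build the request URL with the `enabled` and `timeout` parameters described in this section. The helper name is hypothetical, the host is the placeholder used in the curl example, and the `10m` timeout is an arbitrary illustrative value.

```python
from typing import Optional
from urllib.parse import urlencode

def upgrade_mode_url(base_url: str, enabled: bool, timeout: Optional[str] = None) -> str:
    """Build the set_upgrade_mode URL with its query parameters."""
    params = {"enabled": "true" if enabled else "false"}
    if timeout is not None:
        params["timeout"] = timeout
    return f"{base_url.rstrip('/')}/_ml/set_upgrade_mode?{urlencode(params)}"

print(upgrade_mode_url("http://api.example.com", True, "10m"))
print(upgrade_mode_url("http://api.example.com", False))
```

A typical sequence would be one POST with `enabled=true` before the upgrade and one with `enabled=false` afterwards.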

Delete anomaly jobs from a calendar Added in 6.2.0

DELETE /_ml/calendars/{calendar_id}/jobs/{job_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

  • job_id string | array[string] Required

    An identifier for the anomaly detection jobs. It can be a job identifier, a group name, or a comma-separated list of jobs or groups.

Responses

DELETE /_ml/calendars/{calendar_id}/jobs/{job_id}
curl \
 --request DELETE 'http://api.example.com/_ml/calendars/{calendar_id}/jobs/{job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting an anomaly detection job from a calendar.
{
  "calendar_id": "planned-outages",
  "job_ids": []
}

Delete an anomaly detection job Added in 5.4.0

DELETE /_ml/anomaly_detectors/{job_id}

All job configuration, model state, and results are deleted. It is not currently possible to delete multiple jobs using wildcards or a comma-separated list. If you delete a job that has a datafeed, the request first tries to delete the datafeed. This behavior is equivalent to calling the delete datafeed API with the same timeout and force parameters as the delete job request.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • force boolean

    Use this parameter to forcefully delete an open job; this is quicker than closing the job and then deleting it.

  • Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.

  • Specifies whether the request should return immediately or wait until the job deletion completes.

Responses

  • 200 application/json
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/anomaly_detectors/{job_id}
curl \
 --request DELETE 'http://api.example.com/_ml/anomaly_detectors/{job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting an anomaly detection job.
{
  "acknowledged": true
}
A successful response when deleting an anomaly detection job asynchronously. When the `wait_for_completion` query parameter is set to `false`, the response contains an identifier for the job deletion task.
{
  "task": "oTUltX4IQMOUUVeiohTt8A:39"
}
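A caller has to handle both response shapes shown above. The Python sketch below is a hypothetical helper that returns `None` for the synchronous `acknowledged` response and otherwise splits the task identifier at its last colon, on the assumption that the identifier has the form node ID, colon, task number, as in the example.

```python
from typing import Optional, Tuple

def deletion_task(response: dict) -> Optional[Tuple[str, int]]:
    """Return (node_id, task_number) for an asynchronous deletion, else None."""
    task = response.get("task")
    if task is None:
        return None  # synchronous response such as {"acknowledged": true}
    node_id, _, number = task.rpartition(":")
    return node_id, int(number)

print(deletion_task({"task": "oTUltX4IQMOUUVeiohTt8A:39"}))
print(deletion_task({"acknowledged": True}))
```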

Get datafeeds configuration info Added in 5.5.0

GET /_ml/datafeeds

You can get information for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get information for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. This API returns a maximum of 10,000 datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    • count number Required
    • datafeeds array[object] Required
      • authorization object
        • api_key object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

      • chunking_config object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • datafeed_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices array[string] Required
      • indexes array[string]
      • job_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • script_fields object
        • * object Additional properties
          • script object Required
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
      • delayed_data_check_config object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • runtime_mappings object
        • * object Additional properties
          • fields object

            For type composite

            • * object Additional properties
          • fetch_fields array[object]

            For type lookup

          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • script object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • indices_options object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • query object Required

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

        Query DSL
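
The allow_no_indices behavior listed under indices_options (the foo*,bar* case) can be illustrated with a small Python sketch. This is not how Elasticsearch resolves index expressions: the helper name `unmatched_expressions` and the sample index names are hypothetical, and Python's `fnmatch` stands in for Elasticsearch wildcard matching.

```python
from fnmatch import fnmatchcase


def unmatched_expressions(expressions, open_indices):
    """Return the wildcard expressions that match no open index.

    Hypothetical helper: with allow_no_indices=false, a non-empty result
    corresponds to the case in which the request returns an error.
    """
    return [
        expr
        for expr in expressions
        if not any(fnmatchcase(name, expr) for name in open_indices)
    ]


# An index starts with "foo", but no index starts with "bar".
open_indices = ["foo-2024", "logs-1"]
print(unmatched_expressions(["foo*", "bar*"], open_indices))  # ['bar*']
```

Because `bar*` matches nothing, a request with allow_no_indices set to false would fail even though `foo*` resolves to an open index.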
GET /_ml/datafeeds
curl \
 --request GET 'http://api.example.com/_ml/datafeeds' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "count": 42.0,
  "datafeeds": [
    {
      "aggregations": {},
      "authorization": {
        "api_key": {
          "id": "string",
          "name": "string"
        },
        "roles": [
          "string"
        ],
        "service_account": "string"
      },
      "chunking_config": {
        "mode": "auto",
        "time_span": "string"
      },
      "datafeed_id": "string",
      "frequency": "string",
      "indices": [
        "string"
      ],
      "indexes": [
        "string"
      ],
      "job_id": "string",
      "max_empty_searches": 42.0,
      "query_delay": "string",
      "script_fields": {
        "additionalProperty1": {
          "script": {
            "id": "string",
            "params": {},
            "options": {}
          },
          "ignore_failure": true
        },
        "additionalProperty2": {
          "script": {
            "id": "string",
            "params": {},
            "options": {}
          },
          "ignore_failure": true
        }
      },
      "scroll_size": 42.0,
      "delayed_data_check_config": {
        "check_window": "string",
        "enabled": true
      },
      "runtime_mappings": {
        "additionalProperty1": {
          "fields": {
            "additionalProperty1": {},
            "additionalProperty2": {}
          },
          "fetch_fields": [
            {}
          ],
          "format": "string",
          "input_field": "string",
          "target_field": "string",
          "target_index": "string",
          "script": {
            "id": "string",
            "params": {},
            "options": {}
          },
          "type": "boolean"
        },
        "additionalProperty2": {
          "fields": {
            "additionalProperty1": {},
            "additionalProperty2": {}
          },
          "fetch_fields": [
            {}
          ],
          "format": "string",
          "input_field": "string",
          "target_field": "string",
          "target_index": "string",
          "script": {
            "id": "string",
            "params": {},
            "options": {}
          },
          "type": "boolean"
        }
      },
      "indices_options": {
        "allow_no_indices": true,
        "expand_wildcards": "string",
        "ignore_unavailable": true,
        "ignore_throttled": true
      },
      "query": {}
    }
  ]
}

Explain data frame analytics config Added in 7.3.0

POST /_ml/data_frame/analytics/{id}/_explain

Explain a data frame analytics config that either already exists or has not been created yet. The API provides the following explanations:

  • which fields are included in or excluded from the analysis, and why,
  • how much memory the analysis is estimated to require. You can use this estimate later when choosing a value for the model_memory_limit setting.

If you have object fields or fields that are excluded via source filtering, they are not included in the explanation.

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

application/json

Body

  • source object
    • index string | array[string] Required
    • runtime_mappings object
      • * object Additional properties
        • fields object

          For type composite

          • * object Additional properties
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • input_field string

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • target_field string

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • script object
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • _source object
      • includes array[string]

        An array of strings that defines the fields that will be included in the analysis.

      • excludes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes; these fields are excluded from the analysis automatically.

    • query object

      The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

      Query DSL
  • dest object
    • index string Required
    • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

  • analysis object
    • classification object
      • alpha number

        Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

      • dependent_variable string Required

        Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

      • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

      • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

      • eta number

        Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

      • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

      • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

      • feature_processors array[object]

        Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

        • frequency_encoding object
          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • frequency_map object Required

            The resulting frequency map for the field value. If the field value is missing from the frequency_map, the resulting value is 0.

        • multi_encoding object
          • processors array[number] Required

            The ordered array of custom processors to execute. Must be more than 1.

        • n_gram_encoding object
          • The feature name prefix. Defaults to ngram__.

          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • length number

            Specifies the length of the n-gram substring. Defaults to 50. Must be greater than 0.

          • n_grams array[number] Required

            Specifies which n-grams to gather. It’s an array of integer values where the minimum value is 1 and the maximum value is 5.

          • start number

            Specifies the zero-indexed start of the n-gram substring. Negative values are allowed for encoding n-grams of string suffixes. Defaults to 0.

          • custom boolean
        • one_hot_encoding object
          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • hot_map string Required

            The one hot map mapping the field value with the column name.

        • target_mean_encoding object
          • default_value number Required

            The default value if the field value is not found in the target_map.

          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • target_map object Required

            The field value to target mean transition map.

      • gamma number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • lambda number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

      • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

      • soft_tree_depth_limit number

        Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

      • soft_tree_depth_tolerance number

        Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

      • num_top_classes number

        Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

    • outlier_detection object
      • Specifies whether the feature influence calculation is enabled.

      • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

      • method string

        The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

      • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

      • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

      • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

    • regression object
      • alpha number

        Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

      • dependent_variable string Required

        Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

      • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

      • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

      • eta number

        Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

      • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

      • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

      • feature_processors array[object]

        Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

        • frequency_encoding object
          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • frequency_map object Required

            The resulting frequency map for the field value. If the field value is missing from the frequency_map, the resulting value is 0.

        • multi_encoding object
          • processors array[number] Required

            The ordered array of custom processors to execute. Must be more than 1.

        • n_gram_encoding object
          • The feature name prefix. Defaults to ngram__.

          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • length number

            Specifies the length of the n-gram substring. Defaults to 50. Must be greater than 0.

          • n_grams array[number] Required

            Specifies which n-grams to gather. It’s an array of integer values where the minimum value is 1 and the maximum value is 5.

          • start number

            Specifies the zero-indexed start of the n-gram substring. Negative values are allowed for encoding n-grams of string suffixes. Defaults to 0.

          • custom boolean
        • one_hot_encoding object
          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • hot_map string Required

            The one hot map mapping the field value with the column name.

        • target_mean_encoding object
          • default_value number Required

            The default value if the field value is not found in the target_map.

          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

          • target_map object Required

            The field value to target mean transition map.

      • gamma number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • lambda number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

      • Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

      • soft_tree_depth_limit number

        Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

      • soft_tree_depth_tolerance number

        Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

      • loss_function string

        The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), and huber (Pseudo-Huber loss).

      • A positive number that is used as a parameter to the loss_function.

  • A description of the job.

  • model_memory_limit string

    The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.

  • The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.

  • analyzed_fields object
    • includes array[string]

      An array of strings that defines the fields that will be included in the analysis.

    • excludes array[string]

      An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes; these fields are excluded from the analysis automatically.

  • Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node.
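
The n_gram_encoding attributes above (start, length, and n_grams) can be read as "take a substring, then slide windows of each requested size over it". The following Python sketch illustrates that reading only. It is not the Elasticsearch implementation: the function name `ngram_features`, the returned dictionary layout, and the handling of a negative start that reaches past the beginning of the string are assumptions made for the example.

```python
def ngram_features(value, n_grams, start=0, length=50):
    """Collect n-grams from a substring of `value` (illustrative sketch).

    start  : zero-indexed start of the substring; a negative value counts
             from the end of the string, so suffixes can be encoded.
    length : length of the substring; must be greater than 0.
    n_grams: n-gram sizes to gather, each between 1 and 5.
    """
    if length <= 0:
        raise ValueError("length must be greater than 0")
    # Assumption: a negative start is resolved relative to the string end
    # and clamped at the first character.
    begin = max(len(value) + start, 0) if start < 0 else start
    window = value[begin:begin + length]

    features = {}
    for n in n_grams:
        if not 1 <= n <= 5:
            raise ValueError("n-gram sizes must be between 1 and 5")
        # Every contiguous run of n characters inside the window.
        features[n] = [window[i:i + n] for i in range(len(window) - n + 1)]
    return features


print(ngram_features("elastic", [1, 2], start=0, length=3))
# {1: ['e', 'l', 'a'], 2: ['el', 'la']}
print(ngram_features("elastic", [2], start=-3, length=3))
# {2: ['ti', 'ic']}
```

The second call shows why negative start values are useful: it encodes n-grams of the last three characters of the string.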

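The standardization step listed under the outlier_detection attributes, (x_i - mean(x_i)) / sd(x_i), is a per-column z-score. The sketch below shows it for a single column. The function name is hypothetical, and two details are assumptions because the text above does not specify them: the population standard deviation is used, and a constant column is mapped to zeros.

```python
from statistics import mean, pstdev


def standardize_column(values):
    """Apply (x_i - mean(x_i)) / sd(x_i) to one column of numbers."""
    mu = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        # Assumption: a constant column carries no signal, so return zeros
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - mu) / sd for x in values]


column = [1.0, 2.0, 3.0]
scaled = standardize_column(column)
print([round(x, 4) for x in scaled])  # [-1.2247, 0.0, 1.2247]
```

After standardization the column has mean 0 and standard deviation 1, so columns measured on very different scales contribute comparably to distance-based outlier scores.
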
Responses

  • 200 application/json
    • field_selection array[object] Required

      An array of objects that explain selection for each field, sorted by the field names.

      • is_included boolean Required

        Whether the field is selected to be included in the analysis.

      • is_required boolean Required

        Whether the field is required.

      • feature_type string

        The feature type of this field for the analysis. May be categorical or numerical.

      • mapping_types array[string] Required

        The mapping types of the field.

      • name string Required

        Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • reason string

        The reason a field is not selected to be included in the analysis.

    • memory_estimation object Required
      • expected_memory_with_disk string

        Estimated memory usage under the assumption that overflowing to disk is allowed during data frame analytics. expected_memory_with_disk is usually smaller than expected_memory_without_disk because using disk limits the main memory needed to perform data frame analytics.

      • expected_memory_without_disk string

        Estimated memory usage under the assumption that the whole data frame analytics should happen in memory (i.e. without overflowing to disk).

POST /_ml/data_frame/analytics/{id}/_explain
curl \
 --request POST 'http://api.example.com/_ml/data_frame/analytics/{id}/_explain' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"source":{"index":"houses_sold_last_10_yrs"},"analysis":{"regression":{"dependent_variable":"price"}}}'
Request example
Run `POST _ml/data_frame/analytics/_explain` to explain a data frame analytics job configuration.
{
  "source": {
    "index": "houses_sold_last_10_yrs"
  },
  "analysis": {
    "regression": {
      "dependent_variable": "price"
    }
  }
}
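
The same request can be prepared from Python with only the standard library. This is a minimal sketch rather than an official client: the helper name `build_explain_request` is hypothetical, the host is the placeholder used in the curl example, and the request is built but not sent, so it runs without a cluster.

```python
import json
import urllib.request


def build_explain_request(base_url, body, job_id=None, authorization=None):
    """Prepare (but do not send) a POST request to the explain endpoint."""
    path = "/_ml/data_frame/analytics/"
    if job_id is not None:
        # Explain an existing job by its identifier.
        path += f"{job_id}/"
    path += "_explain"

    headers = {"Content-Type": "application/json"}
    if authorization is not None:
        headers["Authorization"] = authorization

    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


body = {
    "source": {"index": "houses_sold_last_10_yrs"},
    "analysis": {"regression": {"dependent_variable": "price"}},
}
request = build_explain_request("http://api.example.com", body)
print(request.get_method(), request.full_url)
# POST http://api.example.com/_ml/data_frame/analytics/_explain
```

To send the request against a reachable cluster, pass it to `urllib.request.urlopen` and decode the JSON response.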
Response examples (200)
A successful response for explaining a data frame analytics job configuration.
{
  "field_selection": [
    {
      "name": "number_of_bedrooms",
      "mapping_types": [
        "integer"
      ],
      "is_included": true,
      "is_required": false,
      "feature_type": "numerical"
    },
    {
      "name": "postcode",
      "mapping_types": [
        "text"
      ],
      "is_included": false,
      "is_required": false,
      "reason": "[postcode.keyword] is preferred because it is aggregatable"
    },
    {
      "name": "postcode.keyword",
      "mapping_types": [
        "keyword"
      ],
      "is_included": true,
      "is_required": false,
      "feature_type": "categorical"
    },
    {
      "name": "price",
      "mapping_types": [
        "float"
      ],
      "is_included": true,
      "is_required": true,
      "feature_type": "numerical"
    }
  ],
  "memory_estimation": {
    "expected_memory_without_disk": "128MB",
    "expected_memory_with_disk": "32MB"
  }
}
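
A response of this kind is usually read to find out which fields were left out and why. The sketch below assumes each entry uses the name, is_included, and reason attributes documented under field_selection; the helper name `summarize_field_selection` and the trimmed sample data are hypothetical.

```python
def summarize_field_selection(response):
    """Split field_selection entries into included names and excluded reasons."""
    included = []
    excluded = {}
    for entry in response["field_selection"]:
        if entry["is_included"]:
            included.append(entry["name"])
        else:
            # "reason" is only present for fields that are not selected.
            excluded[entry["name"]] = entry.get("reason", "")
    return included, excluded


# Trimmed sample shaped like the documented response attributes.
response = {
    "field_selection": [
        {
            "name": "postcode",
            "mapping_types": ["text"],
            "is_included": False,
            "is_required": False,
            "reason": "[postcode.keyword] is preferred because it is aggregatable",
        },
        {
            "name": "postcode.keyword",
            "mapping_types": ["keyword"],
            "is_included": True,
            "is_required": False,
            "feature_type": "categorical",
        },
        {
            "name": "price",
            "mapping_types": ["float"],
            "is_included": True,
            "is_required": True,
            "feature_type": "numerical",
        },
    ],
    "memory_estimation": {
        "expected_memory_without_disk": "128MB",
        "expected_memory_with_disk": "32MB",
    },
}

included, excluded = summarize_field_selection(response)
print(included)  # ['postcode.keyword', 'price']
print(excluded)
```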

Migration