Create datafeeds API
Instantiates a datafeed.
Request
PUT _ml/datafeeds/<feed_id>
Prerequisites
- You must create an anomaly detection job before you create a datafeed.
- If Elasticsearch security features are enabled, you must have manage_ml or manage cluster privileges to use this API. See Security privileges.
Description
You can associate only one datafeed with each anomaly detection job.
You must use Kibana or this API to create a datafeed. Do not put a datafeed directly to the .ml-config index using the Elasticsearch index API.
If Elasticsearch security features are enabled, do not give users write privileges on the .ml-config index.
Path parameters
-
feed_id
- (Required, string) A character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
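The identifier rules above can be checked on the client before a request is sent. The following is a minimal sketch that encodes only the rules stated in this section; the names `DATAFEED_ID_PATTERN` and `is_valid_datafeed_id` are hypothetical, and the server may apply further constraints that are not listed here.

```python
import re

# Hypothetical helper: encodes only the rules stated in this section
# (lowercase a-z, 0-9, hyphens, underscores; alphanumeric first and last character).
DATAFEED_ID_PATTERN = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_datafeed_id(feed_id: str) -> bool:
    """Return True if feed_id satisfies the rules listed for the feed_id parameter."""
    return DATAFEED_ID_PATTERN.fullmatch(feed_id) is not None


print(is_valid_datafeed_id("datafeed-total-requests"))
print(is_valid_datafeed_id("-starts-with-hyphen"))
```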
Request body
-
aggregations
- (object) If set, the datafeed performs aggregation searches. For more information, see Datafeed resources.
-
chunking_config
- (object) Specifies how data searches are split into time chunks. See Chunking configuration objects.
-
delayed_data_check_config
- (object) Specifies whether the datafeed checks for missing data and the size of the window. See Delayed data check configuration objects.
-
frequency
- (Optional, time units) The interval at which scheduled queries are made while the datafeed runs in real time. The default value is either the bucket span for short bucket spans, or, for longer bucket spans, a sensible fraction of the bucket span. For example: 150s.
-
indices
- (Required, array) An array of index names. Wildcards are supported. For example: ["it_ops_metrics", "server*"].
-
job_id
- (Required, string) A character string that uniquely identifies the anomaly detection job.
-
query
- (object) The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.
-
query_delay
- (Optional, time units) The number of seconds behind real time that data is queried. For example, if data from 10:04 a.m. might not be searchable in Elasticsearch until 10:06 a.m., set this property to 120 seconds. The default value is 60s.
-
script_fields
- (object) Specifies scripts that evaluate custom expressions and return script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields. For more information, see Script Fields.
-
scroll_size
- (unsigned integer) The size parameter that is used in Elasticsearch searches. The default value is 1000.
For more information about these properties, see Datafeed resources.
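As an illustration of how the properties above fit together, the following sketch assembles a request body in Python. The function `build_datafeed_body` is hypothetical and is not part of any Elasticsearch client; it only enforces what this section states, namely that job_id and indices are required and that the remaining properties come from the list above. The query_delay value follows the 10:04 to 10:06 example, which gives 120 seconds.

```python
import json

# Optional request body properties named in this section.
OPTIONAL_PROPERTIES = {
    "aggregations",
    "chunking_config",
    "delayed_data_check_config",
    "frequency",
    "query",
    "query_delay",
    "script_fields",
    "scroll_size",
}


def build_datafeed_body(job_id, indices, **optional):
    """Hypothetical helper: assemble a create-datafeed request body."""
    if not job_id:
        raise ValueError("job_id is required")
    if not indices:
        raise ValueError("indices is required and must not be empty")
    unknown = set(optional) - OPTIONAL_PROPERTIES
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    body = {"job_id": job_id, "indices": list(indices)}
    body.update(optional)
    return body


body = build_datafeed_body(
    "total-requests",
    ["server-metrics"],
    query={"match_all": {"boost": 1}},
    query_delay="120s",  # data from 10:04 searchable by 10:06, so 120 seconds
    scroll_size=1000,
)
print(json.dumps(body, indent=2))
```

Rejecting unknown keys on the client is a design choice of this sketch; it catches misspelled property names before the request reaches the cluster.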
Security integration
When Elasticsearch security features are enabled, your datafeed remembers which roles the user who created it had at the time of creation and runs the query using those same roles.
Examples
The following example creates the datafeed-total-requests datafeed:
PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"]
}
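The same request can be prepared from a script. The sketch below uses only the Python standard library and builds the PUT request without sending it. The base URL `http://localhost:9200` and the helper name `make_create_datafeed_request` are assumptions for illustration; authentication, which is needed when security features are enabled, is not shown.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local, unsecured cluster


def make_create_datafeed_request(feed_id, body, base_url=ES_URL):
    """Hypothetical helper: build (but do not send) the PUT request."""
    url = f"{base_url}/_ml/datafeeds/{feed_id}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = make_create_datafeed_request(
    "datafeed-total-requests",
    {"job_id": "total-requests", "indices": ["server-metrics"]},
)
print(request.get_method(), request.full_url)

# To send the request against a running cluster:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```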
When the datafeed is created, you receive the following results:
{
  "datafeed_id": "datafeed-total-requests",
  "job_id": "total-requests",
  "query_delay": "83474ms",
  "indices": [
    "server-metrics"
  ],
  "query": {
    "match_all": {
      "boost": 1.0
    }
  },
  "scroll_size": 1000,
  "chunking_config": {
    "mode": "auto"
  }
}
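The response includes values for properties that the request left unset, such as query_delay, query, scroll_size, and chunking_config. The following sketch reads them back in Python. The helper `parse_millis` is hypothetical and handles only the millisecond form shown in this example, not the full set of Elasticsearch time units.

```python
import json
from datetime import timedelta

# The response body from the example above.
response_text = """
{
  "datafeed_id": "datafeed-total-requests",
  "job_id": "total-requests",
  "query_delay": "83474ms",
  "indices": ["server-metrics"],
  "query": {"match_all": {"boost": 1.0}},
  "scroll_size": 1000,
  "chunking_config": {"mode": "auto"}
}
"""
response = json.loads(response_text)


def parse_millis(value: str) -> timedelta:
    """Hypothetical helper: convert a value such as '83474ms' to a timedelta."""
    if not value.endswith("ms"):
        raise ValueError(f"only millisecond values are handled here: {value!r}")
    return timedelta(milliseconds=int(value[:-2]))


delay = parse_millis(response["query_delay"])
print(delay.total_seconds())
print(response["chunking_config"]["mode"])
```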