Tutorial: Create a data stream with a lifecycle
To create a data stream with a built-in lifecycle, follow these steps:
Create an index template
A data stream requires a matching index template. You can configure the data stream lifecycle by setting the lifecycle field in the index template, the same way you do for mappings and index settings. You can define an index template that sets a lifecycle as follows:
- Include the data_stream object to enable data streams.
- Define the lifecycle in the template section or include a composable template that defines the lifecycle.
- Use a priority higher than 200 to avoid collisions with built-in templates. See Avoid index pattern collisions.
You can use the create index template API.
response = client.indices.put_index_template(
  name: 'my-index-template',
  body: {
    index_patterns: ['my-data-stream*'],
    data_stream: {},
    priority: 500,
    template: {
      lifecycle: {
        data_retention: '7d'
      }
    },
    _meta: {
      description: 'Template with data stream lifecycle'
    }
  }
)
puts response
PUT _index_template/my-index-template
{
  "index_patterns": ["my-data-stream*"],
  "data_stream": { },
  "priority": 500,
  "template": {
    "lifecycle": {
      "data_retention": "7d"
    }
  },
  "_meta": {
    "description": "Template with data stream lifecycle"
  }
}
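The template requirements listed above can also be checked on the client side before the request is sent. The following is a minimal sketch under stated assumptions, not part of the Elasticsearch Ruby client: the helper names lifecycle_template_body and matches_any_pattern? and the constant BUILT_IN_PRIORITY_CEILING are hypothetical, and the pattern check assumes index patterns use only `*` as a wildcard.

```ruby
require 'json'

# Hypothetical constant: the tutorial advises a priority higher than 200.
BUILT_IN_PRIORITY_CEILING = 200

# Hypothetical helper: builds an index template body that enables data
# streams and sets a lifecycle with the given retention.
def lifecycle_template_body(patterns:, retention:, priority: 500)
  unless priority > BUILT_IN_PRIORITY_CEILING
    raise ArgumentError, "priority must be higher than #{BUILT_IN_PRIORITY_CEILING}"
  end

  {
    index_patterns: patterns,
    data_stream: {}, # the data_stream object enables data streams
    priority: priority,
    template: { lifecycle: { data_retention: retention } }
  }
end

# Rough client-side check that a stream name matches one of the patterns.
# Assumption: `*` is the only wildcard character in the patterns.
def matches_any_pattern?(name, patterns)
  patterns.any? do |pattern|
    escaped = pattern.split('*', -1).map { |part| Regexp.escape(part) }
    Regexp.new('\A' + escaped.join('.*') + '\z').match?(name)
  end
end

body = lifecycle_template_body(patterns: ['my-data-stream*'], retention: '7d')
puts JSON.pretty_generate(body)
puts matches_any_pattern?('my-data-stream', body[:index_patterns])
```

The resulting hash could be passed as the body argument of the put_index_template call shown above. The check is only a convenience; Elasticsearch remains the authority on whether a template is valid.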
Create a data stream
You can create a data stream in two ways:
- By manually creating the stream using the create data stream API. The stream’s name must still match one of your template’s index patterns.
response = client.indices.create_data_stream(
  name: 'my-data-stream'
)
puts response
PUT _data_stream/my-data-stream
- By indexing requests that target the stream’s name. This name must match one of your index template’s index patterns.
response = client.bulk(
  index: 'my-data-stream',
  body: [
    { create: {} },
    {
      "@timestamp": '2099-05-06T16:21:15.000Z',
      message: '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736'
    },
    { create: {} },
    {
      "@timestamp": '2099-05-06T16:25:42.000Z',
      message: '192.0.2.255 - - [06/May/2099:16:25:42 +0000] "GET /favicon.ico HTTP/1.0" 200 3638'
    }
  ]
)
puts response
PUT my-data-stream/_bulk
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }
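The bulk request above alternates an action line with a document line, one JSON object per line. As a rough illustration of that layout, the sketch below builds the same newline-delimited body from a list of documents using only the Ruby standard library. The helper name bulk_create_body is hypothetical and is not part of the Elasticsearch Ruby client, which accepts an array of hashes directly as shown earlier.

```ruby
require 'json'

# Hypothetical helper: precedes each document with a `create` action line
# and joins everything into a newline-delimited body.
def bulk_create_body(docs)
  lines = docs.flat_map do |doc|
    [JSON.generate({ create: {} }), JSON.generate(doc)]
  end
  lines.join("\n") + "\n" # the final line of a bulk body ends with a newline
end

docs = [
  {
    '@timestamp' => '2099-05-06T16:21:15.000Z',
    'message' => '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736'
  },
  {
    '@timestamp' => '2099-05-06T16:25:42.000Z',
    'message' => '192.0.2.255 - - [06/May/2099:16:25:42 +0000] "GET /favicon.ico HTTP/1.0" 200 3638'
  }
]

payload = bulk_create_body(docs)
puts payload
```

A body built this way would only be needed when sending the request without the client, for example over plain HTTP to my-data-stream/_bulk.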
Retrieve lifecycle information
You can use the get data stream lifecycle API to see the data stream lifecycle of your data stream, and the explain data stream lifecycle API to see the exact state of each backing index.
response = client.indices.get_data_lifecycle(
  name: 'my-data-stream'
)
puts response
GET _data_stream/my-data-stream/_lifecycle
The result will look like this:
{
  "data_streams": [
    {
      "name": "my-data-stream",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      }
    }
  ]
}
- name: The name of your data stream.
- lifecycle.enabled: Shows whether the data stream lifecycle is enabled for this data stream.
- lifecycle.data_retention: The retention period of the data indexed in this data stream. Data in this data stream is kept for at least 7 days. After that, Elasticsearch can delete it at its own discretion.
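To make the response fields concrete, here is a small sketch that reads a response shaped like the one above and converts the data_retention value into seconds. It is an illustration only: the helper retention_seconds and the UNIT_SECONDS table are hypothetical, the unit table covers only a few assumed suffixes (d, h, m, s), and the response is a hard-coded copy of the sample rather than a live API call.

```ruby
require 'json'

# Assumed subset of time-unit suffixes, expressed in seconds.
UNIT_SECONDS = { 'd' => 86_400, 'h' => 3_600, 'm' => 60, 's' => 1 }.freeze

# Hypothetical helper: converts a value such as "7d" into seconds.
def retention_seconds(value)
  match = /\A(\d+)([dhms])\z/.match(value.to_s)
  raise ArgumentError, "unsupported retention value: #{value.inspect}" unless match

  match[1].to_i * UNIT_SECONDS.fetch(match[2])
end

# Hard-coded copy of the sample response shown above.
response = JSON.parse(<<~BODY)
  { "data_streams": [ { "name": "my-data-stream",
      "lifecycle": { "enabled": true, "data_retention": "7d" } } ] }
BODY

response['data_streams'].each do |stream|
  lifecycle = stream['lifecycle']
  # Skip streams whose lifecycle is disabled or has no retention set.
  next unless lifecycle['enabled'] && lifecycle['data_retention']

  seconds = retention_seconds(lifecycle['data_retention'])
  puts "#{stream['name']}: data kept for at least #{seconds} seconds"
end
```

The computed value is a lower bound on how long data is kept, matching the "at least" wording above; it does not predict when Elasticsearch actually deletes anything.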
To see more information about how the data stream lifecycle is applied to individual backing indices, use the explain data stream lifecycle API:
response = client.indices.explain_data_lifecycle(
  index: '.ds-my-data-stream-*'
)
puts response
GET .ds-my-data-stream-*/_lifecycle/explain
The result will look like this: