Tutorial: Migrate ILM managed data stream to data stream lifecycle
In this tutorial we'll look at migrating an existing data stream from Index Lifecycle Management (ILM) to data stream lifecycle. The existing ILM managed backing indices will continue to be managed by ILM until they age out and get deleted by ILM; however, the new backing indices will be managed by data stream lifecycle. This way, a data stream is gradually migrated away from being managed by ILM to being managed by data stream lifecycle. As we'll see, ILM and data stream lifecycle can co-manage a data stream; however, an index can only be managed by one system at a time.
TL;DR
To migrate a data stream from ILM to data stream lifecycle we'll have to execute two steps:
1. Update the index template that's backing the data stream to set prefer_ilm to false, and to configure data stream lifecycle.
2. Configure the data stream lifecycle for the existing data stream using the lifecycle API.
For more details see the migrate to data stream lifecycle section.
Setup ILM managed data stream
Let's first create a data stream with two backing indices managed by ILM. We first create an ILM policy:
Python:
resp = client.ilm.put_lifecycle(
    name="pre-dsl-ilm-policy",
    policy={
        "phases": {
            "hot": {
                "actions": {"rollover": {"max_primary_shard_size": "50gb"}}
            },
            "delete": {"min_age": "7d", "actions": {"delete": {}}},
        }
    },
)
print(resp)

Ruby:
response = client.ilm.put_lifecycle(
  policy: 'pre-dsl-ilm-policy',
  body: {
    policy: {
      phases: {
        hot: { actions: { rollover: { max_primary_shard_size: '50gb' } } },
        delete: { min_age: '7d', actions: { delete: {} } }
      }
    }
  }
)
puts response

JavaScript:
const response = await client.ilm.putLifecycle({
  name: "pre-dsl-ilm-policy",
  policy: {
    phases: {
      hot: { actions: { rollover: { max_primary_shard_size: "50gb" } } },
      delete: { min_age: "7d", actions: { delete: {} } },
    },
  },
});
console.log(response);

Console:
PUT _ilm/policy/pre-dsl-ilm-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb" }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": { "delete": {} }
      }
    }
  }
}
And let's create an index template that'll back the data stream and configure ILM:
Python:
resp = client.indices.put_index_template(
    name="dsl-data-stream-template",
    index_patterns=["dsl-data-stream*"],
    data_stream={},
    priority=500,
    template={"settings": {"index.lifecycle.name": "pre-dsl-ilm-policy"}},
)
print(resp)

Ruby:
response = client.indices.put_index_template(
  name: 'dsl-data-stream-template',
  body: {
    index_patterns: ['dsl-data-stream*'],
    data_stream: {},
    priority: 500,
    template: {
      settings: { 'index.lifecycle.name' => 'pre-dsl-ilm-policy' }
    }
  }
)
puts response

JavaScript:
const response = await client.indices.putIndexTemplate({
  name: "dsl-data-stream-template",
  index_patterns: ["dsl-data-stream*"],
  data_stream: {},
  priority: 500,
  template: {
    settings: { "index.lifecycle.name": "pre-dsl-ilm-policy" },
  },
});
console.log(response);

Console:
PUT _index_template/dsl-data-stream-template
{
  "index_patterns": ["dsl-data-stream*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.lifecycle.name": "pre-dsl-ilm-policy"
    }
  }
}
We'll now index a document targeting dsl-data-stream to create the data stream, and we'll also manually roll over the data stream to have another generation index created:
Python:
resp = client.index(
    index="dsl-data-stream",
    document={
        "@timestamp": "2023-10-18T16:21:15.000Z",
        "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736",
    },
)
print(resp)

Ruby:
response = client.index(
  index: 'dsl-data-stream',
  body: {
    "@timestamp": '2023-10-18T16:21:15.000Z',
    message: '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736'
  }
)
puts response

JavaScript:
const response = await client.index({
  index: "dsl-data-stream",
  document: {
    "@timestamp": "2023-10-18T16:21:15.000Z",
    message:
      '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736',
  },
});
console.log(response);

Console:
POST dsl-data-stream/_doc
{
  "@timestamp": "2023-10-18T16:21:15.000Z",
  "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
}
Python:
resp = client.indices.rollover(alias="dsl-data-stream")
print(resp)

Ruby:
response = client.indices.rollover(alias: 'dsl-data-stream')
puts response

JavaScript:
const response = await client.indices.rollover({ alias: "dsl-data-stream" });
console.log(response);

Console:
POST dsl-data-stream/_rollover
We’ll use the GET _data_stream API to inspect the state of the data stream:
Python:
resp = client.indices.get_data_stream(name="dsl-data-stream")
print(resp)

Ruby:
response = client.indices.get_data_stream(name: 'dsl-data-stream')
puts response

JavaScript:
const response = await client.indices.getDataStream({ name: "dsl-data-stream" });
console.log(response);

Console:
GET _data_stream/dsl-data-stream
Inspecting the response we’ll see that both backing indices are managed by ILM and that the next generation index will also be managed by ILM:
{
  "data_streams": [
    {
      "name": "dsl-data-stream",
      "timestamp_field": { "name": "@timestamp" },
      "indices": [
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "status": "GREEN",
      "template": "dsl-data-stream-template",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "ilm_policy": "pre-dsl-ilm-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
In the response above:
- index_name: the name of the backing index.
- prefer_ilm (per index): indicates whether ILM takes precedence over data stream lifecycle in case both systems are configured for the index.
- ilm_policy (per index): the ILM policy configured for this index.
- managed_by: the system that manages this index (possible values are "Index Lifecycle Management", "Data stream lifecycle", or "Unmanaged").
- next_generation_managed_by: the system that will manage the next generation index (the new write index of this data stream, once the data stream is rolled over). The possible values are "Index Lifecycle Management", "Data stream lifecycle", or "Unmanaged".
- prefer_ilm (data stream level): the prefer_ilm value configured in the index template that's backing the data stream. This value will be configured for all the new backing indices; if it's not configured in the index template, the backing indices receive the default value of true.
- ilm_policy (data stream level): the ILM policy configured in the index template that's backing this data stream, which will be configured on all the new backing indices, as long as it exists in the index template.
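The precedence rules described in the annotations above can be condensed into a small local helper. The following is a hypothetical sketch (not part of any Elasticsearch client) that mirrors the managed_by value for a single backing index:

```python
def managed_by(index_info: dict, lifecycle_enabled: bool) -> str:
    """Mirror the managed_by value for one entry of the "indices"
    array in a GET _data_stream response.

    lifecycle_enabled: whether the data stream has an enabled
    data stream lifecycle configured.
    """
    has_ilm_policy = index_info.get("ilm_policy") is not None
    prefer_ilm = index_info.get("prefer_ilm", True)

    # ILM wins when it is preferred, or when it is the only system configured.
    if has_ilm_policy and (prefer_ilm or not lifecycle_enabled):
        return "Index Lifecycle Management"
    if lifecycle_enabled:
        return "Data stream lifecycle"
    return "Unmanaged"


# The backing indices above: ILM manages them because prefer_ilm is true
# and no data stream lifecycle is configured yet.
idx = {"prefer_ilm": True, "ilm_policy": "pre-dsl-ilm-policy"}
print(managed_by(idx, lifecycle_enabled=False))  # Index Lifecycle Management
```

Note that this sketch only captures the precedence behavior this tutorial demonstrates; the actual decision is made server-side by Elasticsearch.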
Migrate data stream to data stream lifecycle
To migrate the dsl-data-stream to data stream lifecycle we'll have to execute two steps:
1. Update the index template that's backing the data stream to set prefer_ilm to false, and to configure data stream lifecycle.
2. Configure the data stream lifecycle for the existing dsl-data-stream using the lifecycle API.
The data stream lifecycle configuration that's added to the index template, being a data stream configuration, will only apply to new data streams. Our data stream already exists, so even though we add a data stream lifecycle configuration to the index template, it will not be applied to dsl-data-stream.
Let’s update the index template:
Python:
resp = client.indices.put_index_template(
    name="dsl-data-stream-template",
    index_patterns=["dsl-data-stream*"],
    data_stream={},
    priority=500,
    template={
        "settings": {
            "index.lifecycle.name": "pre-dsl-ilm-policy",
            "index.lifecycle.prefer_ilm": False,
        },
        "lifecycle": {"data_retention": "7d"},
    },
)
print(resp)

Ruby:
response = client.indices.put_index_template(
  name: 'dsl-data-stream-template',
  body: {
    index_patterns: ['dsl-data-stream*'],
    data_stream: {},
    priority: 500,
    template: {
      settings: {
        'index.lifecycle.name' => 'pre-dsl-ilm-policy',
        'index.lifecycle.prefer_ilm' => false
      },
      lifecycle: { data_retention: '7d' }
    }
  }
)
puts response

JavaScript:
const response = await client.indices.putIndexTemplate({
  name: "dsl-data-stream-template",
  index_patterns: ["dsl-data-stream*"],
  data_stream: {},
  priority: 500,
  template: {
    settings: {
      "index.lifecycle.name": "pre-dsl-ilm-policy",
      "index.lifecycle.prefer_ilm": false,
    },
    lifecycle: { data_retention: "7d" },
  },
});
console.log(response);

Console:
PUT _index_template/dsl-data-stream-template
{
  "index_patterns": ["dsl-data-stream*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.lifecycle.name": "pre-dsl-ilm-policy",
      "index.lifecycle.prefer_ilm": false
    },
    "lifecycle": {
      "data_retention": "7d"
    }
  }
}
In the template above:
- The index.lifecycle.prefer_ilm: false setting will be configured on all the new backing indices.
- We're configuring the data stream lifecycle (the lifecycle section) so new data streams will be managed by data stream lifecycle.
We’ve now made sure that new data streams will be managed by data stream lifecycle.
Let’s update our existing dsl-data-stream
and configure data stream lifecycle:
Python:
resp = client.indices.put_data_lifecycle(name="dsl-data-stream", data_retention="7d")
print(resp)

Ruby:
response = client.indices.put_data_lifecycle(
  name: 'dsl-data-stream',
  body: { data_retention: '7d' }
)
puts response

JavaScript:
const response = await client.indices.putDataLifecycle({
  name: "dsl-data-stream",
  data_retention: "7d",
});
console.log(response);

Console:
PUT _data_stream/dsl-data-stream/_lifecycle
{
  "data_retention": "7d"
}
We can inspect the data stream to check that the next generation will indeed be managed by data stream lifecycle:
Python:
resp = client.indices.get_data_stream(name="dsl-data-stream")
print(resp)

Ruby:
response = client.indices.get_data_stream(name: 'dsl-data-stream')
puts response

JavaScript:
const response = await client.indices.getDataStream({ name: "dsl-data-stream" });
console.log(response);

Console:
GET _data_stream/dsl-data-stream
{
  "data_streams": [
    {
      "name": "dsl-data-stream",
      "timestamp_field": { "name": "@timestamp" },
      "indices": [
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "status": "GREEN",
      "template": "dsl-data-stream-template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      },
      "ilm_policy": "pre-dsl-ilm-policy",
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
In the response above:
- Both existing backing indices will continue to be managed by ILM (managed_by is "Index Lifecycle Management").
- The next generation index will be managed by data stream lifecycle (next_generation_managed_by is "Data stream lifecycle").
- The prefer_ilm value we configured in the index template is reflected at the data stream level (prefer_ilm is false) and will be applied to the next generation index.
We'll now roll over the data stream to see the new generation index being managed by data stream lifecycle:
Python:
resp = client.indices.rollover(alias="dsl-data-stream")
print(resp)

Ruby:
response = client.indices.rollover(alias: 'dsl-data-stream')
puts response

JavaScript:
const response = await client.indices.rollover({ alias: "dsl-data-stream" });
console.log(response);

Console:
POST dsl-data-stream/_rollover
Python:
resp = client.indices.get_data_stream(name="dsl-data-stream")
print(resp)

Ruby:
response = client.indices.get_data_stream(name: 'dsl-data-stream')
puts response

JavaScript:
const response = await client.indices.getDataStream({ name: "dsl-data-stream" });
console.log(response);

Console:
GET _data_stream/dsl-data-stream
{
  "data_streams": [
    {
      "name": "dsl-data-stream",
      "timestamp_field": { "name": "@timestamp" },
      "indices": [
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000003",
          "index_uuid": "PA_JquKGSiKcAKBA8abcd1",
          "prefer_ilm": false,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Data stream lifecycle"
        }
      ],
      "generation": 3,
      "status": "GREEN",
      "template": "dsl-data-stream-template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      },
      "ilm_policy": "pre-dsl-ilm-policy",
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
In the response above:
- The backing indices that existed before the rollover will continue to be managed by ILM.
- The new write index received the prefer_ilm: false value we configured in the index template.
- The new write index is managed by data stream lifecycle (managed_by is "Data stream lifecycle").
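One way to track a gradual migration like this is to tally the backing indices by their managing system from the GET _data_stream response. The following is a small local sketch (operating on the parsed JSON, not a client API):

```python
from collections import Counter


def migration_progress(data_stream: dict) -> Counter:
    """Count backing indices per managing system for one data stream
    entry of a GET _data_stream response."""
    return Counter(index["managed_by"] for index in data_stream["indices"])


# Trimmed to the relevant field, matching the response above.
stream = {
    "indices": [
        {"managed_by": "Index Lifecycle Management"},
        {"managed_by": "Index Lifecycle Management"},
        {"managed_by": "Data stream lifecycle"},
    ]
}
print(migration_progress(stream))
```

As the older ILM-managed indices age out and get deleted by ILM, the ILM count drops to zero and the migration is complete.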
Migrate data stream back to ILM
We can easily change this data stream back to being managed by ILM because we didn't remove the ILM policy when we updated the index template.
We can achieve this in two ways:
1. Delete the lifecycle from the data stream.
2. Disable data stream lifecycle by configuring the enabled flag to false.
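For option 1, removing the lifecycle is done through the delete lifecycle API. A minimal sketch of that request (verify against the delete lifecycle API reference for your Elasticsearch version):

```
DELETE _data_stream/dsl-data-stream/_lifecycle
```

With the lifecycle removed, new backing indices would fall back to the ILM policy that is still configured in the index template.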
Let’s implement option 2 and disable the data stream lifecycle:
Python:
resp = client.indices.put_data_lifecycle(
    name="dsl-data-stream",
    data_retention="7d",
    enabled=False,
)
print(resp)

Ruby:
response = client.indices.put_data_lifecycle(
  name: 'dsl-data-stream',
  body: { data_retention: '7d', enabled: false }
)
puts response

JavaScript:
const response = await client.indices.putDataLifecycle({
  name: "dsl-data-stream",
  data_retention: "7d",
  enabled: false,
});
console.log(response);
Setting the enabled flag to false disables data stream lifecycle management for dsl-data-stream while keeping the retention configuration in place.
Python:
resp = client.indices.get_data_stream(name="dsl-data-stream")
print(resp)

Ruby:
response = client.indices.get_data_stream(name: 'dsl-data-stream')
puts response

JavaScript:
const response = await client.indices.getDataStream({ name: "dsl-data-stream" });
console.log(response);

Console:
GET _data_stream/dsl-data-stream
{
  "data_streams": [
    {
      "name": "dsl-data-stream",
      "timestamp_field": { "name": "@timestamp" },
      "indices": [
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-dsl-data-stream-2023.10.19-000003",
          "index_uuid": "PA_JquKGSiKcAKBA8abcd1",
          "prefer_ilm": false,
          "ilm_policy": "pre-dsl-ilm-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "status": "GREEN",
      "template": "dsl-data-stream-template",
      "lifecycle": {
        "enabled": false,
        "data_retention": "7d"
      },
      "ilm_policy": "pre-dsl-ilm-policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
In the response above:
- The write index is now managed by ILM (managed_by is "Index Lifecycle Management").
- The lifecycle configured on the data stream is now disabled (enabled is false).
- The next write index will be managed by ILM (next_generation_managed_by is "Index Lifecycle Management").
Had we removed the ILM policy from the index template when we updated it, the write index of the data stream would now be Unmanaged, because the index wouldn't have an ILM policy configured to fall back on.