New

The executive guide to generative AI

Read more

Reindex data stream API

edit

These APIs are designed for indirect use by Kibana’s Upgrade Assistant. We strongly recommend you use the Upgrade Assistant to upgrade from 8.18 to 9.0.0-beta1. For upgrade instructions, refer to Upgrading to Elastic 9.0.0-beta1.

The reindex data stream API is used to upgrade the backing indices of a data stream to the most recent major version. It works by reindexing each backing index into a new index, then replacing the original backing index with its replacement and deleting the original backing index. The settings and mappings from the original backing indices are copied to the resulting backing indices.

This api runs in the background because reindexing all indices in a large data stream is expected to take a large amount of time and resources. The endpoint will return immediately and a persistent task will be created to run in the background. The current status of the task can be checked with the reindex status API. This status will be available for 24 hours after the task completes, whether it finished successfully or failed. However, only the last status is retained so re-running a reindex will overwrite the previous status for that data stream. A running or recently completed data stream reindex task can be cancelled using the reindex cancel API.

Request

edit

POST /_migration/reindex

Prerequisites

edit
  • If the Elasticsearch security features are enabled, you must have the manage index privilege for the data stream.

Request body

edit
source
index
(Required, string) The name of the data stream to upgrade.
mode
(Required, enum) Set to upgrade to upgrade the data stream in-place, using the same source and destination data stream. Each out-of-date backing index will be reindexed. Then the new backing index is swapped into the data stream and the old index is deleted. Currently, the only allowed value for this parameter is upgrade.

Settings

edit

You can use the following settings to control the behavior of the reindex data stream API:

migrate.max_concurrent_indices_reindexed_per_data_stream (Dynamic) The number of backing indices within a given data stream which will be reindexed concurrently. Defaults to 1.

migrate.data_stream_reindex_max_request_per_second (Dynamic) The average maximum number of documents within a given backing index to reindex per second. Defaults to 1000, though can be any decimal number greater than 0. To remove throttling, set to -1. This setting can be used to throttle the reindex process and manage resource usage. Consult the reindex throttle docs for more information.

Examples

edit

Assume we have a data stream my-data-stream with the following backing indices, all of which have index major version 7.x.

  • .ds-my-data-stream-2025.01.23-000001
  • .ds-my-data-stream-2025.01.23-000002
  • .ds-my-data-stream-2025.01.23-000003

Let’s also assume that .ds-my-data-stream-2025.01.23-000003 is the write index. If Elasticsearch is version 8.x and we wish to upgrade to major version 9.x, the version 7.x indices must be upgraded in preparation. We can use this API to reindex a data stream with version 7.x backing indices and make them version 8 backing indices.

Start by calling the API:

POST _migration/reindex
{
    "source": {
        "index": "my-data-stream"
    },
    "mode": "upgrade"
}

As this task runs in the background this API will return immediately. The task will do the following.

First, the data stream is rolled over. So that no documents are lost during the reindex, we add write blocks to the existing backing indices before reindexing them. Since a data stream’s write index cannot have a write block, the data stream is must be rolled over. This will produce a new write index, .ds-my-data-stream-2025.01.23-000004; which has an 8.x version and thus does not need to be upgraded.

Once the data stream has a write index with an 8.x version we can proceed with reindexing the old indices. For each of the version 7.x indices, we now do the following:

  • Add a write block to the source index to guarantee that no writes are lost.
  • Open the source index if it is closed.
  • Delete the destination index if one exists. This is done in case we are retrying after a failure, so that we start with a fresh index.
  • Create the destination index using the create from source API. This copies the settings and mappings from the old backing index to the new backing index.
  • Use the reindex API to copy the contents of the old backing index to the new backing index.
  • Close the destination index if the source index was originally closed.
  • Replace the old index in the data stream with the new index, using the modify data streams API.
  • Finally, the old backing index is deleted.

By default only one backing index will be processed at a time. This can be modified using the migrate_max_concurrent_indices_reindexed_per_data_stream-setting setting.

While the reindex data stream task is running, we can inspect the current status using the reindex status API:

GET /_migration/reindex/my-data-stream/_status

For the above example, the following would be a possible status:

{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 0,
  "in_progress": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000001",
      "total_doc_count": 10000000,
      "reindexed_doc_count": 999999
    }
  ],
  "pending": 2,
  "errors": []
}

This output means that the first backing index, .ds-my-data-stream-2025.01.23-000001, is currently being processed, and none of the backing indices have yet completed. Notice that total_indices_in_data_stream has a value of 4, because after the rollover, there are 4 indices in the data stream. But the new write index has an 8.x version, and thus doesn’t need to be reindexed, so total_indices_requiring_upgrade is only 3.

Cancelling and Restarting

edit

The reindex datastream settings provide a few ways to control the performance and resource usage of a reindex task. This example shows how we can stop a running reindex task, modify the settings, and restart the task.

Continuing with the above example, assume the reindexing task has not yet completed, and the reindex status API returns the following:

{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 1,
  "in_progress": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000002",
      "total_doc_count": 10000000,
      "reindexed_doc_count": 1000
    }
  ],
  "pending": 1,
  "errors": []
}

Let’s assume the task has been running for a long time. By default, we throttle how many requests the reindex operation can execute per second. This keeps the reindex process from consuming too many resources. But the default value of 1000 request per second will not be correct for all use cases. The migrate.data_stream_reindex_max_request_per_second setting can be used to increase or decrease the number of requests per second, or to remove the throttle entirely.

Changing this setting won’t have an effect on the backing index that is currently being reindexed. For example, changing the setting won’t have an effect on .ds-my-data-stream-2025.01.23-000002, but would have an effect on the next backing index.

But in the above status, .ds-my-data-stream-2025.01.23-000002 has values of 1000 and 10M for the reindexed_doc_count and total_doc_count, respectively. This means it has only reindexed 0.01% of the documents in the index. It might be a good time to cancel the run and optimize some settings without losing much work. So we call the cancel API:

POST /_migration/reindex/my-data-stream/_cancel

Now we can use the update cluster settings API to increase the throttle:

const response = await client.cluster.putSettings({
  persistent: {
    "migrate.data_stream_reindex_max_request_per_second": 10000,
  },
});
console.log(response);
PUT /_cluster/settings
{
  "persistent" : {
    "migrate.data_stream_reindex_max_request_per_second" : 10000
  }
}

The original reindex command can now be used to restart reindexing. Because the first backing index, .ds-my-data-stream-2025.01.23-000001, has already been reindexed and thus is already version 8.x, it will be skipped. The task will start by reindexing .ds-my-data-stream-2025.01.23-000002 again from the beginning.

Later, once all the backing indices have finished, the reindex status API will return something like the following:

{
  "start_time_millis": 1737676174349,
  "complete": true,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 2,
  "successes": 2,
  "in_progress": [],
  "pending": 0,
  "errors": []
}

Notice that the value of total_indices_requiring_upgrade is 2, unlike the previous status, which had a value of 3. This is because .ds-my-data-stream-2025.01.23-000001 was upgraded before the task cancellation. After the restart, the API sees that it does not need to be upgraded, thus does not include it in total_indices_requiring_upgrade or successes, despite the fact that it upgraded successfully.

The completed status will be accessible from the status API for 24 hours after completion of the task.

We can now check the data stream to verify that indices were upgraded:

const response = await client.indices.getDataStream({
  name: "my-data-stream",
  filter_path: "data_streams.indices.index_name",
});
console.log(response);
GET _data_stream/my-data-stream?filter_path=data_streams.indices.index_name

which returns:

{
  "data_streams": [
    {
      "indices": [
        {
          "index_name": ".migrated-ds-my-data-stream-2025.01.23-000003"
        },
        {
          "index_name": ".migrated-ds-my-data-stream-2025.01.23-000002"
        },
        {
          "index_name": ".migrated-ds-my-data-stream-2025.01.23-000001"
        },
        {
          "index_name": ".ds-my-data-stream-2025.01.23-000004"
        }
      ]
    }
  ]
}

Index .ds-my-data-stream-2025.01.23-000004 is the write index and didn’t need to be upgraded because it was created with version 8.x. The other three backing indices are now prefixed with .migrated because they have been upgraded.

We can now check the indices and verify that they have version 8.x:

const response = await client.indices.get({
  index: ".migrated-ds-my-data-stream-2025.01.23-000001",
  human: "true",
  filter_path: "*.settings.index.version.created_string",
});
console.log(response);
GET .migrated-ds-my-data-stream-2025.01.23-000001?human&filter_path=*.settings.index.version.created_string

which returns:

{
  ".migrated-ds-my-data-stream-2025.01.23-000001": {
    "settings": {
      "index": {
        "version": {
          "created_string": "8.18.0"
        }
      }
    }
  }
}

Handling Failures

edit

Since the reindex data stream API runs in the background, failure information can be obtained through the reindex status API. For example, if the backing index .ds-my-data-stream-2025.01.23-000002 was accidentally deleted by a user, we would see a status like the following:

{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 1,
  "in_progress": [],
  "pending": 1,
  "errors": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000002",
      "message": "index [.ds-my-data-stream-2025.01.23-000002] does not exist"
    }
  ]
}

Once the issue has been fixed, the failed reindex task can be re-run. First, the failed run’s status must be cleared using the reindex cancel API. Then the original reindex command can be called to pick up where it left off.