Reindex from a remote cluster
You can use reindex from remote to migrate indices from your old cluster to a new 7.5.2 cluster. This enables you to move to 7.5.2 from a pre-6.8 cluster without interrupting service.
Elasticsearch provides backwards compatibility support that enables indices from the previous major version to be upgraded to the current major version. Skipping a major version means that you must resolve any backward compatibility issues yourself.
Elasticsearch does not support forward compatibility across major versions. For example, you cannot reindex from a 7.x cluster into a 6.x cluster.
If you use machine learning features and you’re migrating indices from a 6.5 or earlier cluster, the job and datafeed configuration information is not stored in an index, so you must recreate your machine learning jobs in the new cluster. If you are migrating from a 6.6 or later cluster, it is a good idea to temporarily halt the tasks associated with your machine learning jobs and datafeeds. This prevents inconsistencies between different machine learning indices that are reindexed at slightly different times. To halt the tasks, either use the set upgrade mode API or stop all datafeeds and close all machine learning jobs.
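As a rough illustration of the two ways to halt machine learning tasks, the following Python sketch builds the corresponding REST calls without sending them. The helper names are hypothetical. The `_ml` path prefix is an assumption based on the 7.x API layout, and an older source cluster may expose these endpoints under a different prefix, so check the API reference for the version you run them against.

```python
def ml_pause_requests(use_upgrade_mode=True, prefix="_ml"):
    """Return (method, path) pairs that halt machine learning activity.

    `prefix` is an assumption (7.x layout); adjust it for your cluster version.
    """
    if use_upgrade_mode:
        # A single call temporarily halts all job and datafeed tasks.
        return [("POST", f"/{prefix}/set_upgrade_mode?enabled=true")]
    # Alternative: stop every datafeed first, then close every job.
    return [
        ("POST", f"/{prefix}/datafeeds/_all/_stop"),
        ("POST", f"/{prefix}/anomaly_detectors/_all/_close"),
    ]


def ml_resume_requests(prefix="_ml"):
    """Return the call that leaves upgrade mode once reindexing is done."""
    return [("POST", f"/{prefix}/set_upgrade_mode?enabled=false")]


if __name__ == "__main__":
    for method, path in ml_pause_requests() + ml_resume_requests():
        print(method, path)
```

If you stop datafeeds and close jobs individually instead of using upgrade mode, you have to reopen the jobs and restart the datafeeds yourself afterwards. The sketch does not cover that part.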
To migrate your indices:
- Set up a new 7.5.2 cluster and add the existing cluster to the reindex.remote.whitelist in elasticsearch.yml:

      reindex.remote.whitelist: oldhost:9200

  The new cluster doesn’t have to start fully scaled out. As you migrate indices and shift the load to the new cluster, you can add nodes to the new cluster and remove nodes from the old one.
- For each index that you need to migrate to the new cluster:
  - Create an index with the appropriate mappings and settings. Set the refresh_interval to -1 and set number_of_replicas to 0 for faster reindexing.
  - Use the reindex API to pull documents from the remote index into the new 7.5.2 index:

        POST _reindex
        {
          "source": {
            "remote": {
              "host": "http://oldhost:9200",
              "username": "user",
              "password": "pass"
            },
            "index": "source",
            "query": {
              "match": {
                "test": "data"
              }
            }
          },
          "dest": {
            "index": "dest"
          }
        }

    If you run the reindex job in the background by setting wait_for_completion to false, the reindex request returns a task_id that you can use to monitor the progress of the reindex job with the task API: GET _tasks/TASK_ID.
  - When the reindex job completes, set the refresh_interval and number_of_replicas to the desired values (the default settings are 1s and 1).
  - Once reindexing is complete and the status of the new index is green, you can delete the old index.
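The per-index steps above can be sketched as a sequence of REST requests. The following Python example only builds and prints the requests in console form and sends nothing to a cluster. All function names are hypothetical helpers introduced for illustration. The index names, host, and credentials are the placeholders from the example above, and the settings restored at the end are example values, not recommendations.

```python
import json


def create_index_request(index, mappings=None, settings=None):
    """Build the request that creates the destination index for bulk loading."""
    index_settings = dict(settings or {})
    # Disable refreshes and replicas while reindexing for faster ingestion.
    index_settings["refresh_interval"] = "-1"
    index_settings["number_of_replicas"] = 0
    body = {"settings": {"index": index_settings}}
    if mappings:
        body["mappings"] = mappings
    return ("PUT", f"/{index}", body)


def reindex_request(remote_host, source_index, dest_index,
                    username=None, password=None, query=None):
    """Build a background reindex-from-remote request."""
    remote = {"host": remote_host}
    if username is not None:
        remote["username"] = username
        remote["password"] = password
    source = {"remote": remote, "index": source_index}
    if query is not None:
        source["query"] = query
    body = {"source": source, "dest": {"index": dest_index}}
    # wait_for_completion=false makes the call return a task ID immediately.
    return ("POST", "/_reindex?wait_for_completion=false", body)


def task_status_request(task_id):
    """Build the task API request used to monitor the reindex job."""
    return ("GET", f"/_tasks/{task_id}", None)


def restore_settings_request(index, refresh_interval, number_of_replicas):
    """Build the request that sets the index back to its normal settings."""
    body = {"index": {"refresh_interval": refresh_interval,
                      "number_of_replicas": number_of_replicas}}
    return ("PUT", f"/{index}/_settings", body)


def as_console(request):
    """Render a (method, path, body) tuple in console request form."""
    method, path, body = request
    text = f"{method} {path}"
    if body is not None:
        text += "\n" + json.dumps(body, indent=2)
    return text


if __name__ == "__main__":
    steps = [
        create_index_request("dest"),
        reindex_request("http://oldhost:9200", "source", "dest",
                        username="user", password="pass",
                        query={"match": {"test": "data"}}),
        task_status_request("TASK_ID"),  # placeholder, use the returned ID
        restore_settings_request("dest", "1s", 1),  # example values only
    ]
    for step in steps:
        print(as_console(step))
        print()
```

Keeping request construction separate from sending makes it possible to review each request before running it against a production cluster. To execute the requests, pass each tuple to the HTTP client of your choice.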