Reindex API
Reindex Request
A ReindexRequest can be used to copy documents from one or more indexes into a destination index.
It requires an existing source index and a target index, which may or may not exist before the request. Reindex does not attempt to set up the destination index, and it does not copy the settings of the source index. Set up the destination index before running a _reindex action, including mappings, shard counts, replicas, etc.
The simplest form of a ReindexRequest looks like this:
ReindexRequest request = new ReindexRequest();
request.setSourceIndices("source1", "source2");
request.setDestIndex("dest");
The dest element can be configured like the index API to control optimistic concurrency. Leaving out versionType (as above) or setting it to internal will cause Elasticsearch to blindly dump documents into the target. Setting versionType to external will cause Elasticsearch to preserve the version from the source, create any documents that are missing, and update any documents that have an older version in the destination index than they do in the source index.
Setting opType to create will cause _reindex to only create missing documents in the target index. All existing documents will cause a version conflict. The default opType is index.
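The rules above can be read as a small decision function. The sketch below is a plain-Java model of that description, not Elasticsearch code: the class, the enum names and the `decide` method are hypothetical and exist only for illustration. `INDEX_INTERNAL` stands for the default (opType index, versionType internal), `INDEX_EXTERNAL` for versionType external, and `CREATE` for opType create.

```java
import java.util.OptionalLong;

// Illustrative model only; these types are NOT part of the Elasticsearch client.
class ReindexWriteModel {
    enum Mode { INDEX_INTERNAL, INDEX_EXTERNAL, CREATE }
    enum Outcome { CREATED, UPDATED, VERSION_CONFLICT }

    // destVersion is empty when the document does not exist in the destination index.
    static Outcome decide(Mode mode, long sourceVersion, OptionalLong destVersion) {
        if (!destVersion.isPresent()) {
            return Outcome.CREATED; // every mode creates missing documents
        }
        switch (mode) {
            case INDEX_INTERNAL:
                return Outcome.UPDATED; // overwrite without comparing versions
            case INDEX_EXTERNAL:
                // update only if the destination copy is older than the source copy
                return destVersion.getAsLong() < sourceVersion
                        ? Outcome.UPDATED
                        : Outcome.VERSION_CONFLICT;
            default: // CREATE: an existing document is always a conflict
                return Outcome.VERSION_CONFLICT;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(Mode.INDEX_EXTERNAL, 3L, OptionalLong.of(2L))); // UPDATED
        System.out.println(decide(Mode.CREATE, 3L, OptionalLong.of(2L)));         // VERSION_CONFLICT
    }
}
```

Whether a version conflict stops the whole run or is only counted is a separate setting on the request.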
By default, version conflicts abort the _reindex process, but you can count them instead by setting conflicts to proceed (setConflicts("proceed")).
You can limit the documents by adding a query (setSourceQuery).
It’s also possible to limit the number of processed documents by setting maxDocs.
By default, _reindex uses batches of 1000. You can change the batch size with sourceBatchSize.
Reindex can also use the ingest feature by specifying a pipeline.
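As a configuration sketch, the options described so far might be combined as follows. This assumes the 7.x `elasticsearch-rest-high-level-client` artifact is on the classpath. The class name, the field and value in the query, the numeric limits and the pipeline name `my_pipeline` are placeholders chosen for illustration.

```java
import org.elasticsearch.index.VersionType;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.reindex.ReindexRequest;

// Request-configuration sketch; requires the Elasticsearch client library to compile.
class ReindexOptionsSketch {
    static ReindexRequest build() {
        ReindexRequest request = new ReindexRequest();
        request.setSourceIndices("source1", "source2");
        request.setDestIndex("dest");

        // Preserve source versions in the destination.
        request.setDestVersionType(VersionType.EXTERNAL);
        // Alternative, instead of external versioning: only create missing documents.
        // request.setDestOpType("create");

        // Count version conflicts instead of aborting.
        request.setConflicts("proceed");

        // Copy only documents matching a query (placeholder field and value).
        request.setSourceQuery(new TermQueryBuilder("user", "kimchy"));

        // Stop after 10 documents and read the source in batches of 100.
        request.setMaxDocs(10);
        request.setSourceBatchSize(100);

        // Run each document through an ingest pipeline (placeholder name).
        request.setDestPipeline("my_pipeline");
        return request;
    }
}
```

The opType alternative is left commented out because create operations are meant to be used with the default internal versioning rather than together with external versioning.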
ReindexRequest also supports a script that modifies the document. It also allows you to change the document’s metadata. The following example illustrates that.
request.setScript(
    new Script(
        ScriptType.INLINE, "painless",
        "if (ctx._source.user == 'kimchy') {ctx._source.likes++;}",
        Collections.emptyMap()));
ReindexRequest supports reindexing from a remote Elasticsearch cluster. When using a remote cluster, the query should be specified inside the RemoteInfo object and not using setSourceQuery. If both the remote info and the source query are set, the request fails with a validation error. The reason for this is that the remote Elasticsearch may not understand queries built by the modern query builders. The remote cluster support works all the way back to Elasticsearch 0.90, and the query language has changed since then. When targeting older versions, it is safer to write the query by hand in JSON.
request.setRemoteInfo(
    new RemoteInfo(
        "http", remoteHost, remotePort, null,
        new BytesArray(new MatchAllQueryBuilder().toString()),
        user, password, Collections.emptyMap(),
        new TimeValue(100, TimeUnit.MILLISECONDS),
        new TimeValue(100, TimeUnit.SECONDS)
    )
);
ReindexRequest can also parallelize the work automatically by using sliced-scroll to slice on _id. Use setSlices to specify the number of slices to use.
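To make the idea of slicing on _id concrete, here is a plain-Java model of how a set of ids can be split into disjoint slices that are then processed independently. It is an illustration only: `String.hashCode()` is a stand-in hash, not the function Elasticsearch uses, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; not Elasticsearch code.
class SliceModel {
    // Assigns every id to exactly one of `slices` buckets by hashing the id.
    static List<List<String>> partition(List<String> ids, int slices) {
        if (slices < 1) {
            throw new IllegalArgumentException("slices must be >= 1");
        }
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < slices; i++) {
            out.add(new ArrayList<>());
        }
        for (String id : ids) {
            // floorMod keeps the bucket index non-negative for negative hash codes
            out.get(Math.floorMod(id.hashCode(), slices)).add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("doc-1", "doc-2", "doc-3", "doc-4", "doc-5");
        List<List<String>> parts = partition(ids, 2);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("slice " + i + ": " + parts.get(i));
        }
    }
}
```

Because each id lands in exactly one bucket, the slices can be copied in parallel without any document being processed twice.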
ReindexRequest uses the scroll parameter to control how long it keeps the "search context" alive.
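Optional arguments
In addition to the options above, you can optionally provide a timeout to wait for the reindex request to be performed (setTimeout) and ask for the index to be refreshed after the call (setRefresh).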
Synchronous execution
When executing a ReindexRequest in the following manner, the client waits for the BulkByScrollResponse to be returned before continuing with code execution:
BulkByScrollResponse bulkResponse = client.reindex(request, RequestOptions.DEFAULT);
Synchronous calls may throw an IOException if the high-level REST client fails to parse the REST response, if the request times out, or in similar cases where no response comes back from the server.
In cases where the server returns a 4xx or 5xx error code, the high-level client tries to parse the error details from the response body instead. It then throws a generic ElasticsearchException and adds the original ResponseException to it as a suppressed exception.
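A minimal sketch of handling both failure paths could look like the following. It assumes the 7.x high-level client library is on the classpath. The class and method names are hypothetical, and printing to standard error stands in for whatever logging the application uses.

```java
import java.io.IOException;

import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.reindex.BulkByScrollResponse;
import org.elasticsearch.index.reindex.ReindexRequest;

// Sketch only; requires the Elasticsearch client library to compile.
class SyncReindexSketch {
    // IOException (no usable response from the server) is left to the caller.
    static BulkByScrollResponse run(RestHighLevelClient client, ReindexRequest request)
            throws IOException {
        try {
            return client.reindex(request, RequestOptions.DEFAULT);
        } catch (ElasticsearchException e) {
            // The server answered with a 4xx or 5xx status.
            System.err.println("Reindex failed with status " + e.status());
            // The original ResponseException is attached as a suppressed exception.
            for (Throwable suppressed : e.getSuppressed()) {
                System.err.println("Suppressed: " + suppressed);
            }
            throw e;
        }
    }
}
```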
Asynchronous execution
A ReindexRequest can also be executed asynchronously so that the client can return directly. Users need to specify how the response or potential failures will be handled by passing the request and a listener to the asynchronous reindex method (reindexAsync).
The asynchronous method does not block and returns immediately. Once it is completed, the ActionListener is called back using the onResponse method if the execution completed successfully, or using the onFailure method if it failed. Failure scenarios and expected exceptions are the same as in the synchronous execution case.
A typical listener for reindex is an ActionListener typed to BulkByScrollResponse that implements both onResponse and onFailure.
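As a sketch, a listener and the asynchronous call might be wired together like this. It assumes the 7.x high-level client library is on the classpath. The class and method names are hypothetical, and the callback bodies only print a summary as a placeholder for real handling.

```java
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.reindex.BulkByScrollResponse;
import org.elasticsearch.index.reindex.ReindexRequest;

// Sketch only; requires the Elasticsearch client library to compile.
class AsyncReindexSketch {
    static void run(RestHighLevelClient client, ReindexRequest request) {
        ActionListener<BulkByScrollResponse> listener =
            new ActionListener<BulkByScrollResponse>() {
                @Override
                public void onResponse(BulkByScrollResponse bulkResponse) {
                    // Called when the execution completed successfully.
                    System.out.println("Created " + bulkResponse.getCreated()
                        + ", updated " + bulkResponse.getUpdated());
                }

                @Override
                public void onFailure(Exception e) {
                    // Called when the whole ReindexRequest failed.
                    System.err.println("Reindex failed: " + e);
                }
            };

        // Returns immediately; the listener is invoked later.
        client.reindexAsync(request, RequestOptions.DEFAULT, listener);
    }
}
```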
Reindex task submission
It is also possible to submit a ReindexRequest and not wait for its completion by using the Task API. This is equivalent to a REST request with the wait_for_completion flag set to false.
ReindexRequest reindexRequest = new ReindexRequest();
reindexRequest.setSourceIndices(sourceIndex);
reindexRequest.setDestIndex(destinationIndex);
reindexRequest.setRefresh(true);
TaskSubmissionResponse reindexSubmission = highLevelClient()
    .submitReindexTask(reindexRequest, RequestOptions.DEFAULT);
String taskId = reindexSubmission.getTask();
Reindex Response
The returned BulkByScrollResponse contains information about the executed operations and exposes each result as follows:
// Get total time taken
TimeValue timeTaken = bulkResponse.getTook();
// Check if the request timed out
boolean timedOut = bulkResponse.isTimedOut();
// Get total number of docs processed
long totalDocs = bulkResponse.getTotal();
// Number of docs that were updated
long updatedDocs = bulkResponse.getUpdated();
// Number of docs that were created
long createdDocs = bulkResponse.getCreated();
// Number of docs that were deleted
long deletedDocs = bulkResponse.getDeleted();
// Number of batches that were executed
long batches = bulkResponse.getBatches();
// Number of skipped docs
long noops = bulkResponse.getNoops();
// Number of version conflicts
long versionConflicts = bulkResponse.getVersionConflicts();
// Number of times request had to retry bulk index operations
long bulkRetries = bulkResponse.getBulkRetries();
// Number of times request had to retry search operations
long searchRetries = bulkResponse.getSearchRetries();
// The total time this request has throttled itself, not including the current throttle time if it is currently sleeping
TimeValue throttledMillis = bulkResponse.getStatus().getThrottled();
// Remaining delay of any current throttle sleep, or 0 if not sleeping
TimeValue throttledUntilMillis = bulkResponse.getStatus().getThrottledUntil();
// Failures during search phase
List<ScrollableHitSource.SearchFailure> searchFailures = bulkResponse.getSearchFailures();
// Failures during bulk index operation
List<BulkItemResponse.Failure> bulkFailures = bulkResponse.getBulkFailures();