Update By Query API
Update By Query Request
An UpdateByQueryRequest can be used to update documents in an index. It requires an existing index (or a set of indices) on which the update is to be performed.
The simplest form of an UpdateByQueryRequest specifies only the indices to update.
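As a minimal sketch of this simplest form, the request can be built from index names alone. The class name and the index names `source1` and `source2` are placeholders, and the snippet assumes the 7.x Java High Level REST Client is on the classpath.

```java
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class SimplestUpdateByQuery {
    // Builds a request that targets two placeholder indices.
    // "source1" and "source2" are illustrative names, not real indices.
    static UpdateByQueryRequest build() {
        return new UpdateByQueryRequest("source1", "source2");
    }
}
```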
By default, version conflicts abort the UpdateByQueryRequest process, but you can count them instead by setting conflicts to proceed.
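A short sketch of switching from aborting to counting, assuming the 7.x high-level client; the class name and index name are placeholders.

```java
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class ProceedOnConflicts {
    static UpdateByQueryRequest build() {
        // "source1" is a placeholder index name.
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // "proceed" counts version conflicts instead of aborting on them.
        request.setConflicts("proceed");
        return request;
    }
}
```

The number of conflicts that were counted is then available from the response through getVersionConflicts().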
You can limit the documents by adding a query.
It’s also possible to limit the number of processed documents by setting maxDocs.
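Both limits can be combined on one request. The following is a sketch under the assumption of a 7.x client recent enough to offer setMaxDocs; the field `user`, the value `kimchy`, the index name, and the limit of 10 are illustrative only.

```java
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class LimitedUpdateByQuery {
    static UpdateByQueryRequest build() {
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // Only documents whose "user" field matches "kimchy" are considered.
        request.setQuery(new TermQueryBuilder("user", "kimchy"));
        // Stop after at most 10 documents have been processed.
        request.setMaxDocs(10);
        return request;
    }
}
```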
By default, UpdateByQueryRequest uses batches of 1000. You can change the batch size with setBatchSize.
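For illustration, a sketch that lowers the batch size from the default of 1000 to 100. The value 100 and the names are arbitrary choices for the example, and the 7.x high-level client is assumed.

```java
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class SmallBatchUpdateByQuery {
    static UpdateByQueryRequest build() {
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // Each scroll batch now fetches 100 documents instead of 1000.
        request.setBatchSize(100);
        return request;
    }
}
```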
Update by query can also use the ingest feature by specifying a pipeline.
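A sketch of attaching an ingest pipeline, assuming the 7.x high-level client. The pipeline name `my_pipeline` is hypothetical and would have to exist on the cluster before the request runs.

```java
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class PipelineUpdateByQuery {
    static UpdateByQueryRequest build() {
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // "my_pipeline" is a placeholder for an existing ingest pipeline.
        request.setPipeline("my_pipeline");
        return request;
    }
}
```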
UpdateByQueryRequest also supports a script that modifies the document:
request.setScript(
    new Script(
        ScriptType.INLINE, "painless",
        "if (ctx._source.user == 'kimchy') {ctx._source.likes++;}",
        Collections.emptyMap()));
UpdateByQueryRequest can be parallelized using sliced-scroll with setSlices.
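A minimal sketch of requesting two slices, assuming the 7.x high-level client; the slice count and names are chosen only for the example.

```java
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class SlicedUpdateByQuery {
    static UpdateByQueryRequest build() {
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // Split the work into two slices that are processed in parallel.
        request.setSlices(2);
        return request;
    }
}
```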
UpdateByQueryRequest uses the scroll parameter to control how long it keeps the "search context" alive.
If you provide routing, then the routing is copied to the scroll query, limiting the process to the shards that match that routing value.
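Both settings can be sketched as follows. The ten-minute keep-alive and the routing value `=cat` are illustrative. The TimeValue import is an assumption about the client version: later 7.x releases ship the class in `org.elasticsearch.core`, while older ones ship it in `org.elasticsearch.common.unit`.

```java
import org.elasticsearch.core.TimeValue; // older 7.x: org.elasticsearch.common.unit.TimeValue
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class ScrollAndRoutingUpdateByQuery {
    static UpdateByQueryRequest build() {
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // Keep the search context alive for ten minutes between batches.
        request.setScroll(TimeValue.timeValueMinutes(10));
        // Restrict processing to the shards that match this routing value.
        request.setRouting("=cat");
        return request;
    }
}
```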
Optional arguments
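In addition to the options above, the following arguments can optionally be provided: a timeout to wait for the update by query request to be performed, whether to refresh the index after the call, and the indices options.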
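A sketch of three optional settings available on the request in the 7.x client: a timeout, a refresh flag, and indices options. The chosen values are examples, and the TimeValue package depends on the client version (older 7.x releases use `org.elasticsearch.common.unit`).

```java
import org.elasticsearch.action.support.IndicesOptions;
import org.elasticsearch.core.TimeValue; // older 7.x: org.elasticsearch.common.unit.TimeValue
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class OptionalArgumentsUpdateByQuery {
    static UpdateByQueryRequest build() {
        UpdateByQueryRequest request = new UpdateByQueryRequest("source1");
        // Wait at most two minutes for the request to be performed.
        request.setTimeout(TimeValue.timeValueMinutes(2));
        // Refresh the affected indices once the update has finished.
        request.setRefresh(true);
        // Tolerate unavailable indices and expand wildcards to open indices.
        request.setIndicesOptions(IndicesOptions.LENIENT_EXPAND_OPEN);
        return request;
    }
}
```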
Synchronous execution
When executing an UpdateByQueryRequest in the following manner, the client waits for the BulkByScrollResponse to be returned before continuing with code execution:
BulkByScrollResponse bulkResponse = client.updateByQuery(request, RequestOptions.DEFAULT);
Synchronous calls may throw an IOException if the high-level REST client fails to parse the REST response, if the request times out, or in similar cases where no response comes back from the server.
If the server returns a 4xx or 5xx error code, the high-level client tries to parse the error details from the response body instead. It then throws a generic ElasticsearchException and adds the original ResponseException to it as a suppressed exception.
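Suppressed exceptions are a standard Java mechanism, so the original ResponseException can be read back with Throwable.getSuppressed(). The helper below is a hypothetical illustration that uses only the JDK; `SuppressedDetails` and `describe` are names invented for this sketch and are not part of the client.

```java
import java.util.ArrayList;
import java.util.List;

final class SuppressedDetails {
    // Returns one line for the exception itself, followed by one line for
    // each exception attached to it through Throwable.addSuppressed.
    static List<String> describe(Throwable failure) {
        List<String> lines = new ArrayList<>();
        lines.add(failure.getClass().getSimpleName() + ": " + failure.getMessage());
        for (Throwable suppressed : failure.getSuppressed()) {
            lines.add("suppressed " + suppressed.getClass().getSimpleName()
                    + ": " + suppressed.getMessage());
        }
        return lines;
    }
}
```

In a catch block for ElasticsearchException, passing the caught exception to such a helper would list the generic error first and the original ResponseException after it.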
Asynchronous execution
Executing an UpdateByQueryRequest can also be done in an asynchronous fashion so that the client can return directly. Users need to specify how the response or potential failures will be handled by passing the request and a listener to the asynchronous update-by-query method, updateByQueryAsync.
The asynchronous method does not block and returns immediately. Once the request completes, the ActionListener is called back using the onResponse method if the execution completed successfully, or using the onFailure method if it failed. Failure scenarios and expected exceptions are the same as in the synchronous execution case.
A typical listener for update-by-query is an ActionListener<BulkByScrollResponse> that implements both methods.
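The sketch below shows one possible listener together with the asynchronous call, assuming the 7.x high-level client. The class `UpdateByQueryCallbacks`, its method names, and the use of an AtomicReference to record the outcome are choices made for this example; a real listener would typically log or act on the result instead.

```java
import java.util.concurrent.atomic.AtomicReference;

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.reindex.BulkByScrollResponse;
import org.elasticsearch.index.reindex.UpdateByQueryRequest;

final class UpdateByQueryCallbacks {
    // Creates a listener that records a short summary of the outcome.
    static ActionListener<BulkByScrollResponse> recordingListener(AtomicReference<String> outcome) {
        return new ActionListener<BulkByScrollResponse>() {
            @Override
            public void onResponse(BulkByScrollResponse response) {
                // Called when the execution completed successfully.
                outcome.set("updated " + response.getUpdated() + " documents");
            }

            @Override
            public void onFailure(Exception e) {
                // Called when the whole request failed.
                outcome.set("failed: " + e.getMessage());
            }
        };
    }

    // Submits the request without blocking; the listener is notified later.
    static void submit(RestHighLevelClient client,
                       UpdateByQueryRequest request,
                       ActionListener<BulkByScrollResponse> listener) {
        client.updateByQueryAsync(request, RequestOptions.DEFAULT, listener);
    }
}
```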
Update By Query Response
The returned BulkByScrollResponse contains information about the executed operations, which can be read as follows:
// Get total time taken
TimeValue timeTaken = bulkResponse.getTook();
// Check if the request timed out
boolean timedOut = bulkResponse.isTimedOut();
// Get total number of docs processed
long totalDocs = bulkResponse.getTotal();
// Number of docs that were updated
long updatedDocs = bulkResponse.getUpdated();
// Number of docs that were deleted
long deletedDocs = bulkResponse.getDeleted();
// Number of batches that were executed
long batches = bulkResponse.getBatches();
// Number of skipped docs
long noops = bulkResponse.getNoops();
// Number of version conflicts
long versionConflicts = bulkResponse.getVersionConflicts();
// Number of times request had to retry bulk index operations
long bulkRetries = bulkResponse.getBulkRetries();
// Number of times request had to retry search operations
long searchRetries = bulkResponse.getSearchRetries();
// The total time this request has throttled itself, not including the
// current throttle time if it is currently sleeping
TimeValue throttledMillis = bulkResponse.getStatus().getThrottled();
// Remaining delay of any current throttle sleep, or 0 if not sleeping
TimeValue throttledUntilMillis = bulkResponse.getStatus().getThrottledUntil();
// Failures during search phase
List<ScrollableHitSource.SearchFailure> searchFailures = bulkResponse.getSearchFailures();
// Failures during bulk index operation
List<BulkItemResponse.Failure> bulkFailures = bulkResponse.getBulkFailures();