- Java Transport Client (deprecated): other versions:
- Preface
- Maven Repository
- Deploying in JBoss EAP6 module
- Client
- Index API
- Get API
- Delete API
- Update API
- Bulk API
- Search API
- Count API
- Delete By Query API
- Facets
- Aggregations
- Percolate API
- Query DSL - Queries
- Match Query
- MultiMatch Query
- Boolean Query
- Boosting Query
- IDs Query
- Constant Score Query
- Disjunction Max Query
- Fuzzy Like This (Field) Query (flt and flt_field)
- FuzzyQuery
- Has Child / Has Parent
- MatchAll Query
- More Like This Query (mlt)
- Prefix Query
- QueryString Query
- Range Query
- Span Queries (first, near, not, or, term)
- Term Query
- Terms Query
- Top Children Query
- Wildcard Query
- Nested Query
- Indices Query
- GeoShape Query
- Query DSL - Filters
- And Filter
- Bool Filter
- Exists Filter
- Ids Filter
- Limit Filter
- Type Filter
- Geo Bounding Box Filter
- GeoDistance Filter
- Geo Distance Range Filter
- Geo Polygon Filter
- Geo Shape Filter
- Has Child / Has Parent Filters
- Match All Filter
- Missing Filter
- Not Filter
- Or Filter
- Prefix Filter
- Query Filter
- Range Filter
- Script Filter
- Term Filter
- Terms Filter
- Nested Filter
- Caching
Bulk API
editBulk API
editThe bulk API allows one to index and delete several documents in a single request. Here is a sample usage:
import static org.elasticsearch.common.xcontent.XContentFactory.*; BulkRequestBuilder bulkRequest = client.prepareBulk(); // either use client#prepare, or use Requests# to directly build index/delete requests bulkRequest.add(client.prepareIndex("twitter", "tweet", "1") .setSource(jsonBuilder() .startObject() .field("user", "kimchy") .field("postDate", new Date()) .field("message", "trying out Elasticsearch") .endObject() ) ); bulkRequest.add(client.prepareIndex("twitter", "tweet", "2") .setSource(jsonBuilder() .startObject() .field("user", "kimchy") .field("postDate", new Date()) .field("message", "another post") .endObject() ) ); BulkResponse bulkResponse = bulkRequest.execute().actionGet(); if (bulkResponse.hasFailures()) { // process failures by iterating through each bulk response item }
Using Bulk Processor
editThe BulkProcessor
class offers a simple interface to flush bulk operations automatically based on the number or size
of requests, or after a given period.
To use it, first create a BulkProcessor
instance:
import org.elasticsearch.action.bulk.BulkProcessor; BulkProcessor bulkProcessor = BulkProcessor.builder( client, new BulkProcessor.Listener() { @Override public void beforeBulk(long executionId, BulkRequest request) { ... } @Override public void afterBulk(long executionId, BulkRequest request, BulkResponse response) { ... } @Override public void afterBulk(long executionId, BulkRequest request, Throwable failure) { ... } }) .setBulkActions(10000) .setBulkSize(new ByteSizeValue(1, ByteSizeUnit.GB)) .setFlushInterval(TimeValue.timeValueSeconds(5)) .setConcurrentRequests(1) .build();
Add your elasticsearch client |
|
This method is called just before bulk is executed. You can for example see the numberOfActions with
|
|
This method is called after bulk execution. You can for example check if there was some failing requests
with |
|
This method is called when the bulk failed and raised a |
|
We want to execute the bulk every 10 000 requests |
|
We want to flush the bulk every 1gb |
|
We want to flush the bulk every 5 seconds whatever the number of requests |
|
Set the number of concurrent requests. A value of 0 means that only a single request will be allowed to be executed. A value of 1 means 1 concurrent request is allowed to be executed while accumulating new bulk requests. |
Then you can simply add your requests to the BulkProcessor
:
bulkProcessor.add(new IndexRequest("twitter", "tweet", "1").source(/* your doc here */)); bulkProcessor.add(new DeleteRequest("twitter", "tweet", "2"));
By default, BulkProcessor
:
-
sets bulkActions to
1000
-
sets bulkSize to
5mb
- does not set flushInterval
- sets concurrentRequests to 1
When all documents are loaded to the BulkProcessor
it can be closed by using awaitClose
or close
methods:
bulkProcessor.awaitClose(10, TimeUnit.MINUTES);
or
bulkProcessor.close();
Both methods flush any remaining documents and disable all other scheduled flushes if they were scheduled by setting
flushInterval
. If concurrent requests were enabled the awaitClose
method waits for up to the specified timeout for
all bulk requests to complete then returns true
, if the specified waiting time elapses before all bulk requests complete,
false
is returned. The close
method doesn’t wait for any remaining bulk requests to complete and exists immediately.
On this page