Search API
Search Request
The SearchRequest is used for any operation that has to do with searching documents, aggregations, and suggestions, and it also offers ways of requesting highlighting on the resulting documents.
In its most basic form, we can add a query to the request:
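// Create the SearchRequest. Without arguments it runs against all indices.
SearchRequest searchRequest = new SearchRequest();
// Most search parameters are added to the SearchSourceBuilder.
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Add a match_all query to the SearchSourceBuilder.
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
// Add the SearchSourceBuilder to the SearchRequest.
searchRequest.source(searchSourceBuilder);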
Optional arguments
Let’s first look at some of the optional arguments of a SearchRequest:
There are a couple of other interesting optional parameters:
Using the SearchSourceBuilder
Most options controlling the search behavior can be set on the SearchSourceBuilder, which contains more or less the equivalent of the options in the search request body of the REST API.
Here are a few examples of some common options:
// Create a SearchSourceBuilder with default options.
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Set the query. Can be any type of QueryBuilder.
sourceBuilder.query(QueryBuilders.termQuery("user", "kimchy"));
// Set the from option, the index of the first hit to return. Defaults to 0.
sourceBuilder.from(0);
// Set the size option, the number of search hits to return. Defaults to 10.
sourceBuilder.size(5);
// Set an optional timeout that controls how long the search is allowed to take.
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
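The from and size options together give offset-based paging. The following is a minimal sketch of that arithmetic only; the class Paging, the helper fromForPage and the zero-based page number are assumptions made for illustration and are not part of the client.

```java
// Hypothetical helper, not part of the Elasticsearch client: it only computes
// the value one might pass to SearchSourceBuilder.from(int).
final class Paging {
    private Paging() {}

    /** Offset of the first hit on a zero-based page of the given size. */
    static int fromForPage(int page, int size) {
        if (page < 0 || size < 0) {
            throw new IllegalArgumentException("page and size must be non-negative");
        }
        // multiplyExact fails loudly instead of silently overflowing.
        return Math.multiplyExact(page, size);
    }

    public static void main(String[] args) {
        int size = 5;
        for (int page = 0; page < 3; page++) {
            System.out.println("page " + page + " -> from=" + fromForPage(page, size) + ", size=" + size);
        }
    }
}
```

With such a helper, the third page of five hits would be requested by passing fromForPage(2, 5) to from and 5 to size.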
After this, the SearchSourceBuilder only needs to be added to the SearchRequest:
SearchRequest searchRequest = new SearchRequest();
searchRequest.source(sourceBuilder);
Building queries
Search queries are created using QueryBuilder objects. A QueryBuilder exists for every search query type supported by Elasticsearch’s Query DSL.
A QueryBuilder can be created using its constructor:
// Create a full text Match Query that matches the text "kimchy" over the field "user".
MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("user", "kimchy");
Once created, the QueryBuilder object provides methods to configure the options of the search query it creates:
// Enable fuzzy matching on the match query.
matchQueryBuilder.fuzziness(Fuzziness.AUTO);
// Set the prefix length option on the match query.
matchQueryBuilder.prefixLength(3);
// Set the max expansion options to control the fuzzy process of the query.
matchQueryBuilder.maxExpansions(10);
QueryBuilder objects can also be created using the QueryBuilders utility class. This class provides helper methods that can be used to create QueryBuilder objects using a fluent programming style:
QueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("user", "kimchy")
        .fuzziness(Fuzziness.AUTO)
        .prefixLength(3)
        .maxExpansions(10);
Whatever the method used to create it, the QueryBuilder object must be added to the SearchSourceBuilder as follows:
searchSourceBuilder.query(matchQueryBuilder);
The Building Queries page gives a list of all available search queries with their corresponding QueryBuilder objects and QueryBuilders helper methods.
Specifying Sorting
The SearchSourceBuilder allows adding one or more SortBuilder instances. There are four special implementations: FieldSortBuilder, ScoreSortBuilder, GeoDistanceSortBuilder and ScriptSortBuilder.
Source filtering
By default, search requests return the contents of the document _source, but as in the REST API you can override this behavior. For example, you can turn off _source retrieval completely:
sourceBuilder.fetchSource(false);
The method also accepts arrays of one or more wildcard patterns to control which fields get included or excluded in a more fine-grained way:
String[] includeFields = new String[] {"title", "user", "innerObject.*"};
String[] excludeFields = new String[] {"_type"};
sourceBuilder.fetchSource(includeFields, excludeFields);
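To build intuition for how include and exclude patterns combine, here is a deliberately simplified model that works on flattened, dotted field paths. It is an assumption-laden sketch, not Elasticsearch's implementation: the class SourceFilterSketch and its methods are hypothetical, and the real server-side filtering may differ in details such as how whole objects are matched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Simplified model for illustration; this is NOT Elasticsearch's implementation.
final class SourceFilterSketch {
    private SourceFilterSketch() {}

    /** Turns a pattern in which '*' matches any run of characters into a regex. */
    static Pattern toRegex(String wildcard) {
        String[] literals = wildcard.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }

    /** True if the path matches at least one of the wildcard patterns. */
    static boolean matchesAny(String path, String[] patterns) {
        for (String pattern : patterns) {
            if (toRegex(pattern).matcher(path).matches()) {
                return true;
            }
        }
        return false;
    }

    /** Keeps paths that match an include pattern (or all, if none given) and no exclude pattern. */
    static List<String> filter(List<String> paths, String[] includes, String[] excludes) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            boolean included = includes.length == 0 || matchesAny(path, includes);
            if (included && !matchesAny(path, excludes)) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("title", "user", "message", "innerObject.city", "_type");
        String[] includeFields = {"title", "user", "innerObject.*"};
        String[] excludeFields = {"_type"};
        System.out.println(filter(paths, includeFields, excludeFields));
    }
}
```

In this model an exclude always wins over an include, and an empty include list means that everything is a candidate.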
Requesting Highlighting
Highlighting search results can be achieved by setting a HighlightBuilder on the SearchSourceBuilder. Different highlighting behaviour can be defined for each field by adding one or more HighlightBuilder.Field instances to a HighlightBuilder.
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Create a new HighlightBuilder.
HighlightBuilder highlightBuilder = new HighlightBuilder();
// Create a field highlighter for the title field.
HighlightBuilder.Field highlightTitle = new HighlightBuilder.Field("title");
// Set the field highlighter type.
highlightTitle.highlighterType("unified");
// Add the field highlighter to the highlight builder.
highlightBuilder.field(highlightTitle);
HighlightBuilder.Field highlightUser = new HighlightBuilder.Field("user");
highlightBuilder.field(highlightUser);
searchSourceBuilder.highlighter(highlightBuilder);
There are many options, which are explained in detail in the REST API documentation. The REST API parameters (e.g. pre_tags) are usually changed by setters with a similar name (e.g. #preTags(String ...)).
Highlighted text fragments can later be retrieved from the SearchResponse.
Requesting Aggregations
Aggregations can be added to the search by first creating the appropriate AggregationBuilder and then setting it on the SearchSourceBuilder. In the following example we create a terms aggregation on company names with a sub-aggregation on the average age of employees in the company:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
TermsAggregationBuilder aggregation = AggregationBuilders.terms("by_company")
        .field("company.keyword");
aggregation.subAggregation(AggregationBuilders.avg("average_age")
        .field("age"));
searchSourceBuilder.aggregation(aggregation);
The Building Aggregations page gives a list of all available aggregations with their corresponding AggregationBuilder objects and AggregationBuilders helper methods.
We will later see how to access aggregations in the SearchResponse.
Requesting Suggestions
To add Suggestions to the search request, use one of the SuggestionBuilder implementations that are easily accessible from the SuggestBuilders factory class. Suggestion builders need to be added to the top level SuggestBuilder, which itself can be set on the SearchSourceBuilder.
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Create a new term suggestion builder for the user field and the text "kmichy".
SuggestionBuilder termSuggestionBuilder = SuggestBuilders.termSuggestion("user").text("kmichy");
SuggestBuilder suggestBuilder = new SuggestBuilder();
// Add the suggestion builder and name it suggest_user.
suggestBuilder.addSuggestion("suggest_user", termSuggestionBuilder);
searchSourceBuilder.suggest(suggestBuilder);
We will later see how to retrieve suggestions from the SearchResponse.
Profiling Queries and Aggregations
The Profile API can be used to profile the execution of queries and aggregations for a specific search request. In order to use it, the profile flag must be set to true on the SearchSourceBuilder:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.profile(true);
Once the SearchRequest is executed, the corresponding SearchResponse will contain the profiling results.
Synchronous Execution
When executing a SearchRequest in the following manner, the client waits for the SearchResponse to be returned before continuing with code execution:
SearchResponse searchResponse = client.search(searchRequest);
Asynchronous Execution
Executing a SearchRequest can also be done in an asynchronous fashion so that the client can return directly. Users need to specify how the response or potential failures will be handled by passing the request and a listener to the asynchronous search method:
The asynchronous method does not block and returns immediately. Once it is completed, the ActionListener is called back using the onResponse method if the execution completed successfully, or using the onFailure method if it failed.
A typical listener for SearchResponse looks like:
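As a rough illustration of the callback shape described above, the following sketch uses only the JDK. ResponseListener is a stand-in interface that mirrors the two method names mentioned here (onResponse and onFailure); it is not the client's ActionListener class, and runAsync is a hypothetical helper standing in for an asynchronous call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Stand-in for a response listener; only the two callbacks named above. */
interface ResponseListener<T> {
    void onResponse(T response);

    void onFailure(Exception e);
}

final class AsyncSketch {
    private AsyncSketch() {}

    /** Runs the task on another thread and reports the outcome to the listener. */
    static <T> CompletableFuture<Void> runAsync(Supplier<T> task, ResponseListener<T> listener) {
        return CompletableFuture.supplyAsync(task).<Void>handle((result, error) -> {
            if (error == null) {
                listener.onResponse(result);
            } else {
                // supplyAsync wraps the original exception; unwrap it when possible.
                Throwable cause = error.getCause() != null ? error.getCause() : error;
                listener.onFailure(cause instanceof Exception
                        ? (Exception) cause
                        : new RuntimeException(cause));
            }
            return null;
        });
    }

    public static void main(String[] args) {
        ResponseListener<String> listener = new ResponseListener<String>() {
            @Override
            public void onResponse(String response) {
                System.out.println("completed: " + response);
            }

            @Override
            public void onFailure(Exception e) {
                System.out.println("failed: " + e.getMessage());
            }
        };
        // runAsync returns immediately; join() is only here so the demo waits for the callback.
        runAsync(() -> "3 hits", listener).join();
        runAsync(() -> { throw new IllegalStateException("shard unavailable"); }, listener).join();
    }
}
```

The point of the pattern is that the calling thread never waits: success and failure are both delivered to the listener, so each callback should be prepared to run on a different thread.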
SearchResponse
The SearchResponse that is returned by executing the search provides details about the search execution itself as well as access to the documents returned. First, there is useful information about the request execution itself, like the HTTP status code, execution time, or whether the request terminated early or timed out:
RestStatus status = searchResponse.status();
TimeValue took = searchResponse.getTook();
Boolean terminatedEarly = searchResponse.isTerminatedEarly();
boolean timedOut = searchResponse.isTimedOut();
Second, the response also provides information about the execution on the shard level by offering statistics about the total number of shards that were affected by the search, and the successful vs. unsuccessful shards. Possible failures can also be handled by iterating over an array of ShardSearchFailures, as in the following example:
int totalShards = searchResponse.getTotalShards();
int successfulShards = searchResponse.getSuccessfulShards();
int failedShards = searchResponse.getFailedShards();
for (ShardSearchFailure failure : searchResponse.getShardFailures()) {
    // failures should be handled here
}
Retrieving SearchHits
To get access to the returned documents, we need to first get the SearchHits contained in the response:
SearchHits hits = searchResponse.getHits();
The SearchHits provides global information about all hits, like the total number of hits or the maximum score:
long totalHits = hits.getTotalHits();
float maxScore = hits.getMaxScore();
Nested inside the SearchHits are the individual search results that can be iterated over:
SearchHit[] searchHits = hits.getHits();
for (SearchHit hit : searchHits) {
    // do something with the SearchHit
}
The SearchHit provides access to basic information like the index, type, document id and score of each search hit:
String index = hit.getIndex();
String type = hit.getType();
String id = hit.getId();
float score = hit.getScore();
Furthermore, it lets you get back the document source, either as a simple JSON-String or as a map of key/value pairs. In this map, regular fields are keyed by the field name and contain the field value. Multi-valued fields are returned as lists of objects, nested objects as another key/value map. These cases need to be cast accordingly:
String sourceAsString = hit.getSourceAsString();
Map<String, Object> sourceAsMap = hit.getSourceAsMap();
String documentTitle = (String) sourceAsMap.get("title");
List<Object> users = (List<Object>) sourceAsMap.get("user");
Map<String, Object> innerObject = (Map<String, Object>) sourceAsMap.get("innerObject");
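Direct casts like these fail with a ClassCastException when a field has an unexpected shape, for example a single value where a list was expected. One possible way to guard against that is sketched below using only a plain Map<String, Object>; the class SourceMaps and its helper methods are hypothetical and not part of the client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for reading a source map defensively.
final class SourceMaps {
    private SourceMaps() {}

    /** Returns the value under key if it is a String, otherwise null. */
    static String stringField(Map<String, Object> source, String key) {
        Object value = source.get(key);
        return value instanceof String ? (String) value : null;
    }

    /** Returns a multi-valued field as a list; a single value becomes a one-element list. */
    static List<Object> listField(Map<String, Object> source, String key) {
        Object value = source.get(key);
        if (value == null) {
            return new ArrayList<>();
        }
        if (value instanceof List<?>) {
            return new ArrayList<>((List<?>) value);
        }
        List<Object> single = new ArrayList<>();
        single.add(value);
        return single;
    }

    /** Returns a nested object as a map, or an empty map if the field is absent or not an object. */
    static Map<String, Object> objectField(Map<String, Object> source, String key) {
        Map<String, Object> result = new LinkedHashMap<>();
        Object value = source.get(key);
        if (value instanceof Map<?, ?>) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                result.put(String.valueOf(entry.getKey()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("city", "Amsterdam");
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("title", "Search API");
        source.put("user", Arrays.asList("kimchy", "luca"));
        source.put("innerObject", inner);

        System.out.println(stringField(source, "title"));
        System.out.println(listField(source, "user"));
        System.out.println(objectField(source, "innerObject"));
    }
}
```

Whether to return null, an empty collection or throw on a type mismatch is a design choice; the sketch favors empty collections so that callers can iterate without null checks.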
Retrieving Highlighting
If requested, highlighted text fragments can be retrieved from each SearchHit in the result. The hit object offers access to a map of field names to HighlightField instances, each of which contains one or many highlighted text fragments:
SearchHits hits = searchResponse.getHits();
for (SearchHit hit : hits.getHits()) {
    Map<String, HighlightField> highlightFields = hit.getHighlightFields();
    HighlightField highlight = highlightFields.get("title");
    Text[] fragments = highlight.fragments();
    String fragmentString = fragments[0].string();
}
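The lookup by field name is an ordinary map access, so it returns null when a hit carries no highlight entry for that field, and indexing the first fragment assumes at least one exists. A minimal null-safe variant is sketched below; it uses a plain Map<String, List<String>> as a stand-in for the map of HighlightField instances, and the class HighlightLookup is hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Stand-in sketch: a map of field name -> fragments replaces the client's
// highlight map; only the null and empty handling is the point here.
final class HighlightLookup {
    private HighlightLookup() {}

    /** First fragment for the field, or empty if the field has no highlight entry. */
    static Optional<String> firstFragment(Map<String, List<String>> highlights, String field) {
        List<String> fragments = highlights.get(field);
        if (fragments == null || fragments.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(fragments.get(0));
    }

    public static void main(String[] args) {
        Map<String, List<String>> highlights = new HashMap<>();
        highlights.put("title", Arrays.asList("<em>Search</em> API"));
        highlights.put("user", Collections.<String>emptyList());

        System.out.println(firstFragment(highlights, "title").orElse("(no highlight)"));
        System.out.println(firstFragment(highlights, "user").orElse("(no highlight)"));
        System.out.println(firstFragment(highlights, "message").orElse("(no highlight)"));
    }
}
```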
Retrieving Aggregations
Aggregations can be retrieved from the SearchResponse by first getting the root of the aggregation tree, the Aggregations object, and then getting the aggregation by name.
Aggregations aggregations = searchResponse.getAggregations();
// Get the by_company terms aggregation.
Terms byCompanyAggregation = aggregations.get("by_company");
// Get the bucket that is keyed with "Elastic".
Bucket elasticBucket = byCompanyAggregation.getBucketByKey("Elastic");
// Get the average_age sub-aggregation from that bucket.
Avg averageAge = elasticBucket.getAggregations().get("average_age");
double avg = averageAge.getValue();
Note that if you access aggregations by name, you need to specify the aggregation interface according to the type of aggregation you requested, otherwise a ClassCastException will be thrown:
For example, retrieving "by_company" as anything other than Terms will throw an exception, because "by_company" is a terms aggregation.
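The mechanism behind this can be shown with plain Java generics. The sketch below uses stand-in types (Agg, TermsAgg, RangeAgg, AggLookup) that are not the Elasticsearch classes, and it assumes the by-name lookup is a generic method whose return type is chosen by the caller, which is what the behavior described above suggests.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in types for illustration only; they are not the Elasticsearch classes.
interface Agg {
    String name();
}

final class TermsAgg implements Agg {
    private final String name;

    TermsAgg(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

final class RangeAgg implements Agg {
    private final String name;

    RangeAgg(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

final class AggLookup {
    private final Map<String, Agg> byName = new HashMap<>();

    void add(Agg agg) {
        byName.put(agg.name(), agg);
    }

    /** The caller's target type chooses A; the cast here is unchecked and never fails by itself. */
    @SuppressWarnings("unchecked")
    <A extends Agg> A get(String name) {
        return (A) byName.get(name);
    }

    public static void main(String[] args) {
        AggLookup lookup = new AggLookup();
        lookup.add(new TermsAgg("by_company"));

        TermsAgg terms = lookup.get("by_company"); // matches the stored type
        System.out.println("got " + terms.name());

        try {
            RangeAgg range = lookup.get("by_company"); // wrong type: fails at this assignment
            System.out.println("unexpected: " + range.name());
        } catch (ClassCastException e) {
            System.out.println("ClassCastException as expected");
        }
    }
}
```

Because of type erasure the compiler cannot check the requested type, so the mismatch only surfaces at run time, at the assignment in the calling code.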
It is also possible to access all aggregations as a map that is keyed by the aggregation name. In this case, the cast to the proper aggregation interface needs to happen explicitly:
Map<String, Aggregation> aggregationMap = aggregations.getAsMap();
Terms companyAggregation = (Terms) aggregationMap.get("by_company");
There are also getters that return all top level aggregations as a list:
List<Aggregation> aggregationList = aggregations.asList();
And last but not least you can iterate over all aggregations and then e.g. decide how to further process them based on their type:
for (Aggregation agg : aggregations) {
    String type = agg.getType();
    if (type.equals(TermsAggregationBuilder.NAME)) {
        Bucket elasticBucket = ((Terms) agg).getBucketByKey("Elastic");
        long numberOfDocs = elasticBucket.getDocCount();
    }
}
Retrieving Suggestions
To get back the suggestions from a SearchResponse, use the Suggest object as an entry point and then retrieve the nested suggestion objects:
Retrieving Profiling Results
Profiling results are retrieved from a SearchResponse using the getProfileResults() method. This method returns a Map containing a ProfileShardResult object for every shard involved in the SearchRequest execution. Each ProfileShardResult is stored in the Map under a key that uniquely identifies the shard the profile result corresponds to.
Here is sample code that shows how to iterate over all the profiling results of every shard:
// Retrieve the Map of ProfileShardResult objects from the SearchResponse.
Map<String, ProfileShardResult> profilingResults = searchResponse.getProfileResults();
// Profiling results can be retrieved by the shard’s key if the key is known; otherwise it might be simpler to iterate over all of them.
for (Map.Entry<String, ProfileShardResult> profilingResult : profilingResults.entrySet()) {
    // Retrieve the key that identifies which shard the ProfileShardResult belongs to.
    String key = profilingResult.getKey();
    // Retrieve the ProfileShardResult for the given shard.
    ProfileShardResult profileShardResult = profilingResult.getValue();
}
The ProfileShardResult object itself contains one or more query profile results, one for each query executed against the underlying Lucene index:
List<QueryProfileShardResult> queryProfileShardResults = profileShardResult.getQueryProfileResults();
for (QueryProfileShardResult queryProfileResult : queryProfileShardResults) {
}
Each QueryProfileShardResult gives access to the detailed query tree execution, returned as a list of ProfileResult objects:
// Iterate over the profile results.
for (ProfileResult profileResult : queryProfileResult.getQueryResults()) {
    // Retrieve the name of the Lucene query.
    String queryName = profileResult.getQueryName();
    // Retrieve the time in millis spent executing the Lucene query.
    long queryTimeInMillis = profileResult.getTime();
    // Retrieve the profile results for the sub-queries (if any).
    List<ProfileResult> profiledChildren = profileResult.getProfiledChildren();
}
The REST API documentation contains more information about Profiling Queries, with a description of the query profiling information.
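Because each profile result can carry child results for its sub-queries, the whole query tree can be walked recursively. The sketch below shows one way to do that depth-first. ProfileNode is a stand-in with a name, a time value and children; it is not the client's ProfileResult class, and ProfileTreeWalker is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Stand-in for a profiled node: a name, a time value and child nodes. */
final class ProfileNode {
    final String name;
    final long time;
    final List<ProfileNode> children;

    ProfileNode(String name, long time, ProfileNode... children) {
        this.name = name;
        this.time = time;
        this.children = Collections.unmodifiableList(Arrays.asList(children));
    }
}

final class ProfileTreeWalker {
    private ProfileTreeWalker() {}

    /** Depth-first walk producing one line per node, indented by tree depth. */
    static List<String> describe(ProfileNode root) {
        List<String> lines = new ArrayList<>();
        describe(root, 0, lines);
        return lines;
    }

    private static void describe(ProfileNode node, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(node.name).append(" (").append(node.time).append(")");
        out.add(line.toString());
        for (ProfileNode child : node.children) {
            describe(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        ProfileNode root = new ProfileNode("BooleanQuery", 30,
                new ProfileNode("TermQuery", 10),
                new ProfileNode("TermQuery", 20));
        for (String line : describe(root)) {
            System.out.println(line);
        }
    }
}
```

With the real classes, the same recursion would follow the list returned by getProfiledChildren() on each result.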
The QueryProfileShardResult also gives access to the profiling information for the Lucene collectors:
// Retrieve the profiling result of the Lucene collector.
CollectorResult collectorResult = queryProfileResult.getCollectorResult();
// Retrieve the name of the Lucene collector.
String collectorName = collectorResult.getName();
// Retrieve the time in millis spent executing the Lucene collector.
Long collectorTimeInMillis = collectorResult.getTime();
// Retrieve the profile results for the sub-collectors (if any).
List<CollectorResult> profiledChildren = collectorResult.getProfiledChildren();
The Rest API documentation contains more information about profiling information for Lucene collectors.
In a very similar manner to the query tree execution, the ProfileShardResult object gives access to the detailed aggregations tree execution:
// Retrieve the AggregationProfileShardResult.
AggregationProfileShardResult aggsProfileResults = profileShardResult.getAggregationProfileResults();
// Iterate over the aggregation profile results.
for (ProfileResult profileResult : aggsProfileResults.getProfileResults()) {
    // Retrieve the type of the aggregation (corresponds to the Java class used to execute the aggregation).
    String aggName = profileResult.getQueryName();
    // Retrieve the time in millis spent executing the aggregation.
    long aggTimeInMillis = profileResult.getTime();
    // Retrieve the profile results for the sub-aggregations (if any).
    List<ProfileResult> profiledChildren = profileResult.getProfiledChildren();
}
The REST API documentation contains more information about Profiling Aggregations.