Ranking Evaluation API
The rankEval method evaluates the quality of ranked search results
over a set of search requests. Given a set of manually rated
documents for each search request, ranking evaluation performs a
multi search request and calculates information retrieval metrics
such as mean reciprocal rank, precision, or discounted cumulative
gain on the returned results.
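To make the metrics named above concrete, here is a minimal plain-Java sketch of precision at K, reciprocal rank, and discounted cumulative gain for a single ranked result list. It does not use the Elasticsearch classes: the class RankingMetricsSketch, its method names, the relevantThreshold parameter, and the choice of an exponential gain with a log2 discount for DCG are assumptions made for illustration, and unrated hits are represented here by a rating of 0.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of three ranking metrics; NOT the Elasticsearch implementation. */
final class RankingMetricsSketch {

    /** Fraction of the top-k hits whose rating is at least relevantThreshold. */
    static double precisionAtK(List<Integer> ratings, int k, int relevantThreshold) {
        int retrieved = Math.min(k, ratings.size());
        if (retrieved == 0) {
            return 0.0;
        }
        int relevant = 0;
        for (int i = 0; i < retrieved; i++) {
            if (ratings.get(i) >= relevantThreshold) {
                relevant++;
            }
        }
        return (double) relevant / retrieved;
    }

    /** 1 / rank of the first relevant hit (ranks start at 1), or 0 if none is relevant. */
    static double reciprocalRank(List<Integer> ratings, int relevantThreshold) {
        for (int i = 0; i < ratings.size(); i++) {
            if (ratings.get(i) >= relevantThreshold) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    /** DCG with gain 2^rating - 1 and discount log2(rank + 1) (assumed formulation). */
    static double dcg(List<Integer> ratings) {
        double sum = 0.0;
        for (int i = 0; i < ratings.size(); i++) {
            double gain = Math.pow(2, ratings.get(i)) - 1;
            double discount = Math.log(i + 2) / Math.log(2); // log2(rank + 1), rank = i + 1
            sum += gain / discount;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Ratings of three returned hits in rank order; only the second hit is relevant.
        List<Integer> ratings = Arrays.asList(0, 1, 0);
        System.out.println("precision@3     = " + precisionAtK(ratings, 3, 1));
        System.out.println("reciprocal rank = " + reciprocalRank(ratings, 1));
        System.out.println("DCG             = " + dcg(ratings));
    }
}
```

Mean reciprocal rank would then be the average of reciprocalRank over all search requests in the set.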
Ranking Evaluation Request
To build a RankEvalRequest, you first need to create an
evaluation specification (RankEvalSpec). This specification requires
you to define the evaluation metric to be calculated, as well as a
list of rated documents per search request. The ranking evaluation
request is then created from the specification and a list of target
indices:
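// Define the metric used in the evaluation
EvaluationMetric metric = new PrecisionAtK();
// Add rated documents, specified by index name, id and rating
List<RatedDocument> ratedDocs = new ArrayList<>();
ratedDocs.add(new RatedDocument("posts", "1", 1));
// Create the search query to evaluate
SearchSourceBuilder searchQuery = new SearchSourceBuilder();
searchQuery.query(QueryBuilders.matchQuery("user", "kimchy"));
// Combine the parts above into a RatedRequest identified by "kimchy_query"
RatedRequest ratedRequest = new RatedRequest("kimchy_query", ratedDocs, searchQuery);
List<RatedRequest> ratedRequests = Arrays.asList(ratedRequest);
// Create the ranking evaluation specification
RankEvalSpec specification = new RankEvalSpec(ratedRequests, metric);
// Create the ranking evaluation request for the "posts" index
RankEvalRequest request = new RankEvalRequest(specification, new String[] { "posts" });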
Synchronous Execution
The rankEval method executes a RankEvalRequest synchronously:
RankEvalResponse response = client.rankEval(request, RequestOptions.DEFAULT);
Asynchronous Execution
The rankEvalAsync method executes a RankEvalRequest asynchronously,
calling the provided ActionListener when the response is ready.
The asynchronous method does not block and returns immediately. Once the
request completes, the ActionListener is called back using the onResponse
method if the execution completed successfully, or using the onFailure
method if it failed.
A typical listener for RankEvalResponse implements both the onResponse and onFailure callbacks.
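The callback flow can be sketched without the Elasticsearch client on the classpath. In the sketch below, ListenerSketch is a stand-in that mirrors only the onResponse and onFailure methods named above, and executeAsync is a hypothetical helper playing the role of rankEvalAsync; neither is part of the client library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;

/** Stand-in for the two-callback contract described above; NOT the Elasticsearch type. */
interface ListenerSketch<T> {
    void onResponse(T response);

    void onFailure(Exception e);
}

final class AsyncCallbackSketch {

    /**
     * Hypothetical helper: runs the work on another thread and returns immediately.
     * Exactly one of the two callbacks is invoked once the work has finished.
     */
    static <T> CompletableFuture<Void> executeAsync(Callable<T> work, ListenerSketch<T> listener) {
        return CompletableFuture.runAsync(() -> {
            T result;
            try {
                result = work.call();
            } catch (Exception e) {
                listener.onFailure(e); // the execution failed
                return;
            }
            listener.onResponse(result); // the execution completed successfully
        });
    }

    public static void main(String[] args) {
        ListenerSketch<Double> listener = new ListenerSketch<Double>() {
            @Override
            public void onResponse(Double score) {
                System.out.println("metric score: " + score);
            }

            @Override
            public void onFailure(Exception e) {
                System.err.println("evaluation failed: " + e.getMessage());
            }
        };
        // join() is used only so that the demo does not exit before the callback runs.
        executeAsync(() -> 1.0 / 3.0, listener).join();
    }
}
```

With the real client, the same two callbacks would be implemented on an ActionListener<RankEvalResponse> that is passed to rankEvalAsync together with the request.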
RankEvalResponse
The RankEvalResponse returned by executing the request contains the
overall evaluation score, the score of each individual search request
in the set of queries, and, for each partial result, detailed
information about the search hits and the metric calculation.
double evaluationResult = response.getMetricScore();
assertEquals(1.0 / 3.0, evaluationResult, 0.0);
Map<String, EvalQueryQuality> partialResults = response.getPartialResults();
EvalQueryQuality evalQuality = partialResults.get("kimchy_query");
assertEquals("kimchy_query", evalQuality.getId());
double qualityLevel = evalQuality.metricScore();
assertEquals(1.0 / 3.0, qualityLevel, 0.0);
List<RatedSearchHit> hitsAndRatings = evalQuality.getHitsAndRatings();
RatedSearchHit ratedSearchHit = hitsAndRatings.get(2);
assertEquals("3", ratedSearchHit.getSearchHit().getId());
assertFalse(ratedSearchHit.getRating().isPresent());
MetricDetail metricDetails = evalQuality.getMetricDetails();
String metricName = metricDetails.getMetricName();
assertEquals(PrecisionAtK.NAME, metricName);
PrecisionAtK.Detail detail = (PrecisionAtK.Detail) metricDetails;
assertEquals(1, detail.getRelevantRetrieved());
assertEquals(3, detail.getRetrieved());
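response.getMetricScore() returns the overall evaluation result.
response.getPartialResults() returns the partial results, keyed by their query id.
metricScore() returns the metric score for each partial result.
Rated search hits contain a fully fledged SearchHit, available through getSearchHit().
Rated search hits also contain an optional rating (getRating()), which is not present when the document was not rated in the request.
Metric details are named after the metric used in the request.
After casting to the metric used in the request, the metric details offer insight into parts of the metric calculation.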
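The numbers asserted above fit together as follows: one relevant document among three retrieved gives a precision of 1/3 for "kimchy_query", and with a single query the overall score equals that value. The plain-Java sketch below reproduces this arithmetic. The class ScoreCombinationSketch and its methods are hypothetical, and treating the overall score as the arithmetic mean of the per-query scores is an assumption rather than something this page states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java sketch of how the asserted scores relate; NOT the Elasticsearch code. */
final class ScoreCombinationSketch {

    /** Precision computed from the two counters exposed by the metric detail. */
    static double precision(int relevantRetrieved, int retrieved) {
        return retrieved == 0 ? 0.0 : (double) relevantRetrieved / retrieved;
    }

    /** ASSUMPTION: the overall score is the arithmetic mean of the per-query scores. */
    static double overallScore(Map<String, Double> partialScores) {
        return partialScores.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        // Per-query scores keyed by query id, mirroring the shape of the partial results.
        Map<String, Double> partialScores = new LinkedHashMap<>();
        partialScores.put("kimchy_query", precision(1, 3));
        System.out.println("overall score = " + overallScore(partialScores));
    }
}
```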