Search guide
Searching is how you read data and generate results from the documents stored within your Engines.
The search endpoint will be invoked each time a search is performed by a user.
Unlike the other endpoints, which customize Engines, view analytics, tune relevance, or index documents, search is for, well... Searching!
You can use your Public Search Key or your Private API Key to query the search endpoint. The Public Search Key is for performing search using client-side JavaScript, within mobile applications, or in any other context where you are at risk of exposing your API Key.
The Public Search Key begins with: search-
The responses generated by the search endpoint should contain only data from your Engines that you want your users to see. That is why search has its own special, public key.
Alternatively, read the search API reference to start coding right away.
What do I search?
Search is all about finding documents.
A document is an individual JSON object.
When you add data into your Engines, you are taking database or backend API objects and indexing them.
But what does it mean, to index?
To better understand this, we will look no further than a JSON object:
{
  "name": "SuperObject",
  "authors": "Examplio McDemonstratio",
  "downloads": "98765",
  "info": "A stunning look at potential."
}
That is one object. Your datastore will likely contain many.
During indexing, your set of objects, of documents, is evaluated to develop a schema.
The keys are translated into Fields.
The values are, by default, given the type of text.
Prior to your data being indexed, the schema is created:
Existing Field | Type |
---|---|
name | text |
authors | text |
downloads | text |
info | text |
Each document that you index can now be considered part of a set of data.
This changes the objects in a small way:
{
  "id": "1234",
  "name": "SuperObject",
  "authors": "Examplio McDemonstratio",
  "downloads": "98765",
  "info": "A stunning look at potential."
}
Each one now contains an id. This id is how your documents are known to your Engines.
By belonging to a defined - though still flexible - schema, the Engine can get deep into its analysis of the documents.
The end result is the ability to search vast sets of objects, of documents, with great precision.
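To make the idea of schema creation concrete, here is a minimal sketch in Python. It is not how an Engine is implemented; it only illustrates the rule described above, where every key becomes a field and every field defaults to the text type. The function name infer_schema is hypothetical.

```python
from typing import Any


def infer_schema(documents: list[dict[str, Any]]) -> dict[str, str]:
    """Map every key seen across the documents to the default type, text."""
    schema: dict[str, str] = {}
    for document in documents:
        for key in document:
            # A field keeps the type it was first given; new fields default to text.
            schema.setdefault(key, "text")
    return schema


documents = [
    {
        "name": "SuperObject",
        "authors": "Examplio McDemonstratio",
        "downloads": "98765",
        "info": "A stunning look at potential.",
    }
]
schema = infer_schema(documents)
```

With the sample object from this guide, schema holds the four fields name, authors, downloads, and info, each typed as text.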
How do I search?
If you are the visitor, then you arrive at an application, seek out the search bar, search box, or magic box, enter some text, then hit enter. If search is fast and relevant, then you click on a result and your experience continues: you consume the content, purchase the product, or accomplish whatever it was you set out to do.
If you are the developer, then you develop applications that access a robust set of APIs. You can access the APIs via an official programming language client, or you can weave in whatever sort of programmatic brilliance that you can dream up.
The visitor has it easy.
They are not aware of the deep logic that is happening below the interface with which they interact.
The act of searching, in its most simple form, is a request against an Engine wherein the document they seek is indexed.
Example - Performing a simple search.
curl -X POST 'https://[instance id].ent-search.[region].[provider].cloud.es.io:443/api/as/v1/engines/ruby-gems/search' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer search-qcqrj73hmom796c98r22zeao' \
  -d '{ "query": "search" }'
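The same request can be sketched in Python using only the standard library. This is one possible way to build it, not an official client. The base URL https://example.invalid is a placeholder standing in for your own deployment URL, and build_search_request is a hypothetical helper name. The request is constructed but not sent.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, engine: str, search_key: str, query: str
) -> urllib.request.Request:
    """Build a POST request against the search endpoint of one Engine."""
    url = f"{base_url}/api/as/v1/engines/{engine}/search"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {search_key}",
        },
        method="POST",
    )


# Placeholder base URL; replace it with the URL of your own deployment.
request = build_search_request(
    "https://example.invalid",
    "ruby-gems",
    "search-qcqrj73hmom796c98r22zeao",
    "search",
)
# To send it: with urllib.request.urlopen(request) as response: json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before any network call is made.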
There is a big difference between returning something and returning the right thing.
The name of the game is relevance.
When we make a request, the Engine will send a response object in return.
The response object contains a nested _meta object with a score key.
The value of the score key is the relevance.
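As a rough sketch, the score can be read out of a parsed response like this. The code assumes the response holds a results list in which each result wraps field values under raw and carries a _meta.score; the helper name scores_by_name is hypothetical.

```python
def scores_by_name(response: dict) -> list[tuple[str, float]]:
    """Return (name, score) pairs, highest relevance score first."""
    pairs = [
        (result["name"]["raw"], result["_meta"]["score"])
        for result in response["results"]
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# A trimmed-down response containing a single result.
sample_response = {
    "results": [
        {
            "name": {"raw": "Searcheror Supreme"},
            "_meta": {"score": 4.986149},
        }
    ]
}
ranked = scores_by_name(sample_response)
```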
Consider that we host an application for finding useful RubyGems.
The best search Gem of all - somehow, even better than App Search - is Searcheror Supreme.
It is popular.
If we had basic search, a visitor querying for "search" would see Searcheror Supreme as one of the last results.
That can lead to a poor experience - we want them to find the best Gem for their query.
The result of the "search" query would be:
{
  "name": { "raw": "Searcheror Supreme" },
  "id": { "raw": "1334" },
  "authors": { "raw": "Doctor Odd" },
  "downloads": { "raw": "421321431" },
  "info": { "raw": "A mind-bending experience of reclamation from the Ether. Not compatible with Dormammu Exiler." },
  "_meta": { "score": 4.986149 }
}
The relevance score under the _meta key is so low, a mere 5! Why is that?
By default, the Engine would take a uniform look across all fields for the keyword sent along with the query.
The word "search" appears only once, as a part of the name Searcheror.
Despite being the most popular and relevant Gem, it is buried under results that use the word "search" more often.
This is not ideal! But this is how most search functions work.
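The effect can be illustrated with a toy scoring function. This is not App Search's actual relevance algorithm; it is a deliberately naive sketch that counts keyword occurrences uniformly across every field. The second document, "Search Search", is a hypothetical competitor invented for this illustration.

```python
def uniform_keyword_score(document: dict[str, str], keyword: str) -> int:
    """Count case-insensitive occurrences of the keyword across all fields."""
    needle = keyword.lower()
    return sum(value.lower().count(needle) for value in document.values())


searcheror = {
    "name": "Searcheror Supreme",
    "authors": "Doctor Odd",
    "downloads": "421321431",
    "info": (
        "A mind-bending experience of reclamation from the Ether. "
        "Not compatible with Dormammu Exiler."
    ),
}
# Hypothetical, far less popular Gem that simply repeats the keyword.
competitor = {
    "name": "Search Search",
    "info": "Search helpers for search forms.",
}

searcheror_score = uniform_keyword_score(searcheror, "search")
competitor_score = uniform_keyword_score(competitor, "search")
```

Because every field carries the same weight and popularity is ignored, the competitor outscores Searcheror Supreme purely on repetition.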
To go deeper, you must write sophisticated algorithms.
You would need to solve challenging data structure problems.
But you already have a million things to build.
Managing these deep search complexities is the problem solved by App Search.
There are many different endpoints to help you craft a fast, imaginative and useful search experience.
Explore the documentation to learn more.
What about result meta data?
The meta field is an object containing information about the results.
See Response body within the search API reference for details.
Why search?
The Internet offers information.
A product, a thought — whatever it may be, search can get people to the things they want, quicker.
The faster they get there, the more likely they are to have an enjoyable experience and help you accomplish your business goals.
Where next?
You are now familiar with the basics of search. Next, you can apply powerful tools to provide a relevant and valuable search experience. If you want to start polishing up how results appear, improving relevance and meeting business goals in the process, consider reading about Curations and Relevance Tuning. If you want to see how your users search, dive into Analytics. For the nitty-gritty details on the Search API, we have the Search API Reference.
Or, if you’re looking for the solution to a specific search task, read on…
Search tasks
The following sections provide solutions to specific search tasks.
Paginate search results
The search API paginates results by default. However, with each request you can specify the number of results per page and the current page.
Refer to page.size and page.current within Search API, Request body for accepted values.
Additionally, avoid paging beyond 10,000 results. These requests return zero results.
The following example returns results 9001 through 10,000:
# request body
{
  …
  "page": {
    "current": 10,
    "size": 1000
  },
  …
}
# response body
{
  "meta": {
    …
    "page": {
      "current": 10,
      "total_pages": 10,
      "total_results": 10000,
      "size": 1000
    },
    …
  },
  "results": [ … ]
}
In contrast, requesting results 10,001 through 11,000 returns zero results:
# request body
{
  …
  "page": {
    "current": 11,
    "size": 1000
  },
  …
}
# response body
{
  "meta": {
    …
    "page": {
      "current": 11,
      "total_pages": 0,
      "total_results": 0,
      "size": 1000
    },
    …
  },
  "results": []
}
Therefore, when displaying results to users, ensure your search requests limit paging to 10,000 results. In other words, the product of page.size and page.current must be less than or equal to 10000.
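A minimal sketch of that check in application code might look like the following. The constant and function names are hypothetical; only the 10,000 limit comes from this guide.

```python
MAX_PAGEABLE_RESULTS = 10_000


def is_within_paging_limit(current: int, size: int) -> bool:
    """True when page.current * page.size stays within the 10,000 result limit."""
    return current * size <= MAX_PAGEABLE_RESULTS


def last_allowed_page(size: int) -> int:
    """Highest page.current that can still return results for a given page.size."""
    return MAX_PAGEABLE_RESULTS // size


page_ten_ok = is_within_paging_limit(current=10, size=1000)
page_eleven_ok = is_within_paging_limit(current=11, size=1000)
```

Here page_ten_ok is True and page_eleven_ok is False, matching the two examples above. A UI can use last_allowed_page to cap its pagination controls.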
Alternatively, if you are paging beyond 10,000 to collect the results of a search in memory or on disk, design multiple searches for the same results instead.
You can submit multiple searches in a single API request.
Filter each search by category, date, or some other field in your schema that you know will return fewer than 10,000 results.
Then, in your application code, concatenate the multiple result sets into a single set of results.
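The approach can be sketched as follows. The code assumes a multi search request body of the form {"queries": [...]}, where each entry takes the same parameters as a single search, and a response that is a list of objects each holding a results list. Check both assumptions against the multi search API reference. The function names and the category values are hypothetical.

```python
def build_filtered_queries(
    query: str, field: str, values: list[str], page_size: int = 1000
) -> dict:
    """Build one search per filter value, bundled into a single request body."""
    return {
        "queries": [
            {
                "query": query,
                "filters": {field: value},
                "page": {"size": page_size},
            }
            for value in values
        ]
    }


def concatenate_results(responses: list[dict]) -> list[dict]:
    """Merge the result sets of several searches into a single list."""
    merged: list[dict] = []
    for response in responses:
        merged.extend(response["results"])
    return merged


body = build_filtered_queries("search", "category", ["books", "music"])

# Hypothetical responses, one per query, trimmed to the results key.
responses = [
    {"results": [{"id": {"raw": "1"}}]},
    {"results": [{"id": {"raw": "2"}}, {"id": {"raw": "3"}}]},
]
all_results = concatenate_results(responses)
```

This sketch requests only the first page of each filtered search. Each search still needs its own paging, within the 10,000 result limit, to collect every match.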
Display the total number of search results
If you are confident a search will never match more than 10,000 documents, use the value of meta.page.total_results to display the total number of results within your UI.
See Search API, Response body.
# response body
{
  "meta": {
    …
    "page": {
      "current": 10,
      "total_pages": 10,
      "total_results": 10000,
      "size": 1000
    },
    …
  },
  …
}
Beyond 10,000 matches, the value of meta.page.total_results is fixed at 10000.
If you can tolerate inexact values beyond 10,000, write a special case for this value in your application code.
For example, if meta.page.total_results equals 10000, display "10,000+ results" within your UI.
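One way to write that special case is sketched below. The function name format_total_results is hypothetical, and the wording of the label is up to your UI.

```python
RESULT_COUNT_CAP = 10_000


def format_total_results(total_results: int) -> str:
    """Render meta.page.total_results, treating the capped value as inexact."""
    if total_results >= RESULT_COUNT_CAP:
        return "10,000+ results"
    return f"{total_results:,} results"


capped_label = format_total_results(10000)
exact_label = format_total_results(9999)
```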
If you require exact counts over 10,000, use one of the solutions described in Count the documents within an engine.
Count the documents within an engine
Since Enterprise Search 7.11.0, you can use the engines API to count the documents within an engine.
Responses from the API include the results.document_count field, which reports the number of documents in the engine.
See Retrieve an Engine for examples.
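Reading that field from a parsed response could look like this. The sketch assumes the response holds a results list whose entries include a name alongside document_count; the name key, the helper name, and the sample values are assumptions made for illustration.

```python
from typing import Optional


def document_count(engines_response: dict, engine_name: str) -> Optional[int]:
    """Return document_count for the named engine, or None if it is not listed."""
    for engine in engines_response["results"]:
        if engine["name"] == engine_name:
            return engine["document_count"]
    return None


# Hypothetical response, trimmed to the fields this sketch relies on.
sample_engines_response = {
    "results": [
        {"name": "ruby-gems", "document_count": 11000},
    ]
}
count = document_count(sample_engines_response, "ruby-gems")
```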
Prior to version 7.11.0, use one of the following strategies to count the documents within an engine:
If you are confident your engine contains 10,000 or fewer documents, you can rely on meta.page.total_results for an exact count of documents.
See Search API, Response body.
# request body
{
  "query": ""
}
# response body
{
  "meta": {
    …
    "page": {
      "current": 1,
      "total_pages": 10,
      "total_results": 100,
      "size": 10
    },
    …
  },
  …
}
If the engine may contain more than 10,000 documents, count the documents using a facet. Choose a facet that will match every document in the engine. The facet count is therefore the document count.
Refer to the following examples.
The documents in an engine represent the products of a retail catalog.
The quantity field for each document cannot exceed 200.
To match all documents, search for documents whose quantity falls within the range 0 to 200:
# request body
{
  "query": "",
  "facets": {
    "quantity": [
      {
        "type": "range",
        "ranges": [
          { "from": 0, "to": 200 }
        ]
      }
    ]
  }
}
# response body
{
  …
  "facets": {
    "quantity": [
      {
        "type": "range",
        "data": [
          { "from": 0, "to": 200, "count": 11000 }
        ]
      }
    ]
  }
}
Each document in your engine has a created_at timestamp.
To match all documents, search for an exceptionally wide range of values within this field.
# request body
{
  "query": "",
  "facets": {
    "created_at": [
      {
        "type": "range",
        "ranges": [
          { "from": "1900-01-01T00:00:00+00:00", "to": "2100-01-01T00:00:00+00:00" }
        ]
      }
    ]
  }
}
# response body
{
  …
  "facets": {
    "created_at": [
      {
        "type": "range",
        "data": [
          { "from": "1900-01-01T00:00:00.000Z", "to": "2100-01-01T00:00:00.000Z", "count": 11000 }
        ]
      }
    ]
  }
}
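Both examples follow the same pattern, which can be sketched in application code as follows. The request and response shapes mirror the bodies shown above; the function names are hypothetical.

```python
def range_facet_request(field: str, lower, upper) -> dict:
    """Build a search body with a single range facet spanning every document."""
    return {
        "query": "",
        "facets": {
            field: [
                {"type": "range", "ranges": [{"from": lower, "to": upper}]}
            ]
        },
    }


def facet_count(response: dict, field: str) -> int:
    """Read the count of the first range of the first facet on the field."""
    return response["facets"][field][0]["data"][0]["count"]


request_body = range_facet_request("quantity", 0, 200)

# Response trimmed to the facets key, as in the quantity example.
sample_response = {
    "facets": {
        "quantity": [
            {"type": "range", "data": [{"from": 0, "to": 200, "count": 11000}]}
        ]
    }
}
total_documents = facet_count(sample_response, "quantity")
```

The count is only the document count when the range truly covers every document, so choose bounds that no value in the field can fall outside of.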
If you plan for this use case up front, you can index a field for this specific purpose. Ensure every document has an identical field with an identical value.
For example, index every document with a field named indexed with the value true.
Then search for matching documents to receive an exact count.
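One way to get that count is a value facet on the dedicated field, sketched below. This is an assumption about how you might structure the search, not an example taken from the reference: the value facet request shape, the response shape, and the helper names are all illustrative and should be checked against the Search API reference.

```python
def value_facet_request(field: str) -> dict:
    """Build a search body that facets on every value of the given field."""
    return {"query": "", "facets": {field: [{"type": "value"}]}}


def count_for_value(response: dict, field: str, value: str) -> int:
    """Sum the facet counts reported for one value of the field."""
    buckets = response["facets"][field][0]["data"]
    return sum(bucket["count"] for bucket in buckets if bucket["value"] == value)


request_body = value_facet_request("indexed")

# Hypothetical response, trimmed to the facets key.
sample_response = {
    "facets": {
        "indexed": [
            {"type": "value", "data": [{"value": "true", "count": 11000}]}
        ]
    }
}
indexed_documents = count_for_value(sample_response, "indexed", "true")
```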
Submit multiple searches in a single API request
Use the multi search API. Construct each search as you would for the search API.
You will receive a separate set of results for each search, within a single response.