Search API Reference
The Custom search experiences guide provides a conceptual walkthrough of the steps involved in issuing search requests on behalf of users via OAuth.
In this API reference
- Search API Overview
- Querying
- Paginating
- Sorting results
- Restricting search fields
- Restricting result fields
- Filtering a query with a value
- Filtering a query with a numerical range
- Filtering a query with geolocation data
- Combining filters
- Configuring value facet options
- Configuring range facet options
- Configuring geolocation facet options
- Configuring a value boost
- Configuring a functional boost
- Configuring a proximity boost
- Configuring a recency boost
- Configuring automatic query refinements
Search API Overview
POST http://localhost:3002/api/ws/v1/search
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
query | optional | The search query
page | optional | Provides optional keys of size and current. Specifies the number of results per page and which page the API should return.
sort | optional | Sort results ASC or DESC for a field
search_fields | optional | Fields used for full-text matching
result_fields | optional | Fields returned in the JSON response
filters | optional | Query modifiers used to refine a query
facets | optional | Faceting configuration
automatic_query_refinement | optional | Whether query refinements are applied
source_type | optional | Kind of data sources to retrieve data from
timeout | optional | Timeout to use when retrieving data from remote sources
Search Query
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
query | optional | A string or number used to find related documents
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali"
}'
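Every request in this reference has the same shape: a POST to the search endpoint, a bearer token in the Authorization header, and a JSON body. As a rough Python equivalent of the curl call above, the sketch below only builds the request object. `build_search_request` is a hypothetical helper name, and the token is assumed to come from your own OAuth flow.

```python
import json
import urllib.request

# Endpoint as shown in this reference; adjust the host for your deployment.
SEARCH_URL = "http://localhost:3002/api/ws/v1/search"


def build_search_request(access_token, body):
    """Build (but do not send) a POST request mirroring the curl examples."""
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request("example-token", {"query": "denali"})
```

Passing the result to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a server.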
Pagination
If source_type is set to anything but standard, page.current is always set to 1 and page.size to 100.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
page | optional | Provides optional keys of size and current. Specifies the number of results per page and which page the API should return.
page.size | optional | Specifies the number of results per page. Must be greater than or equal to
page.current | optional | Specifies which page of results to retrieve for the query. Must be greater than or equal to
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali",
  "page": {
    "size": 1,
    "current": 1
  }
}'
Limit on results per query
As described in Pagination, Workplace Search limits page.size to 100 and page.current to 100.
Therefore, Workplace Search effectively limits each search query to ten thousand (10000) results.
Work around this limitation by requesting fewer documents. Divide a large set of documents into smaller sets by filtering, perhaps by content source, type, or timestamps (e.g. updated, created). Choose filters that create sets smaller than 10000.
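To make the arithmetic concrete, the following Python sketch generates the page bodies needed to walk a result set and stops at the stated limits (100 results per page, 100 pages). `page_bodies` is a hypothetical helper; the lower bound of 1 for page.size is an assumption, since the minimum is not legible in the table above.

```python
MAX_PAGE_SIZE = 100     # page.size limit stated above
MAX_PAGE_CURRENT = 100  # page.current limit stated above


def page_bodies(query, wanted, size=MAX_PAGE_SIZE):
    """Yield request bodies covering up to `wanted` results within the limits."""
    # Assumption: the smallest accepted page.size is 1.
    if not 1 <= size <= MAX_PAGE_SIZE:
        raise ValueError("page.size must be between 1 and 100")
    reachable = min(wanted, size * MAX_PAGE_CURRENT)
    pages = -(-reachable // size)  # ceiling division
    for current in range(1, pages + 1):
        yield {"query": query, "page": {"size": size, "current": current}}
```

Asking for more than 10000 results still yields only 100 bodies, which is the point at which a narrower filter is needed.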
Sorting
If source_type is set to anything but standard, the sorting parameter is ignored and results are sorted by score.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
sort | optional | Sort results ASC or DESC for one or multiple fields
field | Name of the field used for sorting
direction | ASC or DESC direction for sorting
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali",
  "sort": [
    { "square_km": "desc" },
    { "date_established": "asc" }
  ]
}'
Search Fields
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
search_fields | optional | Fields used for full-text matching
field | Name of the field used for full-text matching
weight | integer | The relative importance of fields in a query. A higher value represents greater importance for algorithmic scoring.
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali",
  "search_fields": {
    "title": { "weight": 10 },
    "description": {}
  }
}'
Result Fields
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
result_fields | optional | Fields returned in the JSON response
field | Children of the
raw | result type | Value of the field as originally indexed
snippet | result type | Value of the field with highlighting markup added to visually distinguish where the match occurred
size | integer | Character length of the returned value
fallback | boolean | For
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "america",
  "result_fields": {
    "title": {
      "raw": {},
      "snippet": {}
    },
    "description": {
      "raw": { "size": 50 },
      "snippet": { "fallback": true, "size": 50 }
    },
    "states": {
      "snippet": { "size": 50 }
    }
  }
}'
{
  "meta": { ... },
  "results": [
    {
      "title": { "raw": "American Samoa", "snippet": "<em>American</em> Samoa" },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-03-27T20:10:33+00:00",
        "content_source_id": "5e7e5d911897c6dbb7e3e72a",
        "id": "park_american-samoa",
        "score": 6.359234
      },
      "source": { "raw": "custom" },
      "states": { "snippet": "<em>American</em> Samoa" },
      "description": {
        "raw": "The southernmost National Park is on three Samoan",
        "snippet": "The southernmost National Park is on three Samoan"
      },
      "last_updated": { "raw": "2020-03-27T20:10:33+00:00" },
      "content_source_id": { "raw": "5e7e5d911897c6dbb7e3e72a" },
      "id": { "raw": "park_american-samoa" }
    },
    {
      "title": { "raw": "Denali", "snippet": null },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-03-27T20:10:33+00:00",
        "content_source_id": "5e7e5d911897c6dbb7e3e72a",
        "id": "park_denali",
        "score": 6.357545
      },
      "source": { "raw": "custom" },
      "states": { "snippet": null },
      "description": {
        "raw": "Centered on Denali, the tallest mountain in North",
        "snippet": " <em>America</em>, Denali is serviced by a single road"
      },
      "last_updated": { "raw": "2020-03-27T20:10:33+00:00" },
      "content_source_id": { "raw": "5e7e5d911897c6dbb7e3e72a" },
      "id": { "raw": "park_denali" }
    },
    ...
  ]
}
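In the sample response, a snippet can be null when the field has no match and no fallback was requested (see the Denali title). A small Python sketch of one way to choose display text on the client: prefer the snippet, otherwise use the raw value. `display_text` and `strip_highlight` are hypothetical helpers, and the `<em>` tag is taken from the sample response only.

```python
import re


def display_text(result, field):
    """Return the snippet for a field if present, else its raw value, else None."""
    entry = result.get(field) or {}
    snippet = entry.get("snippet")
    if snippet is not None:
        return snippet
    return entry.get("raw")


def strip_highlight(text):
    """Remove the <em> highlight markup seen in the sample response."""
    return re.sub(r"</?em>", "", text)


denali = {"title": {"raw": "Denali", "snippet": None}}
samoa = {"title": {"raw": "American Samoa", "snippet": "<em>American</em> Samoa"}}
```

With these inputs, `display_text(denali, "title")` falls back to the raw title, while the Samoa result keeps its highlighted snippet.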
Filters
Filters are disabled if source_type is set to anything but standard. In that case, the only possible filter is a value filter on content_source_id.
Value Filters
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
filters | optional | Query modifiers used to refine a query
field | Name of field upon which to apply your filter
field value | The value upon which to filter. The value must be an exact match, even casing:
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "states": ["California", "Washington"]
  }
}'
Range Filters
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
filters | optional | Query modifiers used to refine a query
field | Name of field upon which to apply your filter
from | optional | Inclusive lower bound of the range. Is required if
to | optional | Exclusive upper bound of the range. Is required if
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "visitors": { "from": 100 }
  }
}'
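The two bounds are asymmetric: the lower bound is inclusive and the upper bound is exclusive. A minimal Python sketch of the same check on the client side, useful for sanity-checking which values a range would admit. `in_range` is a hypothetical helper, and the rule that at least one bound must be present is an assumption based on the incomplete "Is required if" notes above.

```python
def in_range(value, lower=None, upper=None):
    """Mirror the range bounds: lower is inclusive, upper is exclusive."""
    if lower is None and upper is None:
        # Assumption: a range needs at least one of the two bounds.
        raise ValueError("provide at least one bound")
    if lower is not None and value < lower:
        return False
    if upper is not None and value >= upper:
        return False
    return True
```

For example, a value of exactly 100 passes a filter with a lower bound of 100, while a value of exactly 10000 does not pass a filter with an upper bound of 10000.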
Geo Filters
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
filters | optional | Query modifiers used to refine a query
field | Name of field upon which to apply your filter
center | required | The mode of the distribution, specified as a latitude-longitude pair. See Geolocation fields.
unit | required | The base unit of measurement:
distance | optional | A number representing the distance unit. Is required if
from | optional | Inclusive lower bound of the range. Is required if
to | optional | Exclusive upper bound of the range. Is required if
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "location": {
      "unit": "km",
      "center": "47.6062,122.3321",
      "distance": 1000
    }
  }
}'
Combining Filters
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
filters | optional | Query modifiers used to refine a query
all | array | All of the filters must match. This functions as an
any | array | At least one of the filters must match. This functions as an
none | array | All of the filters must not match. This functions as a
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "all": [
      { "states": ["California", "Washington"] },
      { "visitors": { "from": 100 } },
      { "date_established": { "to": "1900-01-01" } }
    ],
    "any": [
      { "location": { "unit": "km", "center": "47.6062,122.3321", "distance": 1000 } },
      { "location": { "unit": "km", "center": "37.7749,122.4194", "from": 100, "to": 10000 } }
    ],
    "none": { "world_heritage_site": "true" }
  }
}'
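Nested filter objects are easy to get wrong by hand. The Python sketch below assembles the all / any / none structure from lists of individual filter clauses and leaves out empty groups. `combine_filters` is a hypothetical helper; it always emits arrays, which is one of the shapes shown in the example above.

```python
def combine_filters(all_of=(), any_of=(), none_of=()):
    """Build a filters object with all / any / none groups, skipping empty ones."""
    combined = {}
    for key, clauses in (("all", all_of), ("any", any_of), ("none", none_of)):
        if clauses:
            combined[key] = list(clauses)
    return combined


body = {
    "filters": combine_filters(
        all_of=[{"states": ["California", "Washington"]}, {"visitors": {"from": 100}}],
        none_of=[{"world_heritage_site": "true"}],
    )
}
```

The resulting `body` can be serialized with `json.dumps` and sent as the request payload.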
Facets
Facets allow you to add certain aggregated data to search results. They are disabled if source_type is set to anything but standard.
Value Facets
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
facets | optional | Faceting configuration
field | Name of field upon which to apply your facet
type | required | Type of facet, in this case
name | optional | Name given to facet
size | optional | Between 1 and 250, defaults to 10
sort | optional | JSON object where the key is either
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "facets": {
    "states": {
      "type": "value",
      "name": "top-five-states",
      "sort": { "count": "desc" },
      "size": 5
    }
  }
}'
{
  "meta": { ... },
  "results": [ ... ],
  "facets": {
    "states": [
      {
        "type": "value",
        "data": [
          { "value": "Alaska", "count": 5 },
          { "value": "Utah", "count": 2 },
          { "value": "Colorado", "count": 2 },
          { "value": "California", "count": 2 },
          { "value": "Washington", "count": 1 }
        ],
        "name": "top-five-states"
      }
    ]
  }
}
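The facet data comes back as a list of value/count buckets under the field name. A short Python sketch of flattening that into a plain mapping, assuming the response shape shown above; `value_facet_counts` is a hypothetical helper.

```python
def value_facet_counts(response, field):
    """Map each facet value to its count for one faceted field."""
    counts = {}
    for facet in response.get("facets", {}).get(field, []):
        if facet.get("type") != "value":
            continue  # skip range facets configured on the same field
        for bucket in facet.get("data", []):
            counts[bucket["value"]] = bucket["count"]
    return counts


sample = {
    "facets": {
        "states": [
            {
                "type": "value",
                "data": [
                    {"value": "Alaska", "count": 5},
                    {"value": "Utah", "count": 2},
                    {"value": "Washington", "count": 1},
                ],
                "name": "top-five-states",
            }
        ]
    }
}
```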
Range Facets
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
facets | optional | Faceting configuration
field | Name of field upon which to apply your facet
type | required | Type of facet, in this case
name | optional | Name given to facet
ranges | optional | An array of range objects
from | optional | Inclusive lower bound of the range. Is required if
to | optional | Exclusive upper bound of the range. Is required if
name | optional | Name given to range
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "facets": {
    "acres": [
      {
        "type": "range",
        "name": "min-and-max-range",
        "ranges": [
          { "from": 1, "to": 10000 },
          { "from": 10000 }
        ]
      }
    ],
    "date_established": {
      "type": "range",
      "name": "half-century",
      "ranges": [
        { "from": "1900-01-01T12:00:00+00:00", "to": "1950-01-01T00:00:00+00:00" }
      ]
    }
  }
}'
{
  "meta": { ... },
  "results": [ ... ],
  "facets": {
    "acres": [
      {
        "type": "range",
        "name": "min-and-max-range",
        "data": [
          { "to": 10000.0, "from": 1.0, "count": 2 },
          { "from": 10000.0, "count": 57 }
        ]
      }
    ],
    "date_established": [
      {
        "type": "range",
        "name": "half-century",
        "data": [
          { "to": "1950-01-01T00:00:00.000Z", "from": "1900-01-01T12:00:00.000Z", "count": 24 }
        ]
      }
    ]
  }
}
Geolocation Facets
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
facets | optional | Faceting configuration
field | Name of field upon which to apply your facet
type | required | Type of facet, in this case
name | optional | Name given to facet
ranges | optional | An array of range objects
from | optional | Inclusive lower bound of the range. Is required if
to | optional | Exclusive upper bound of the range. Is required if
center | required | The mode of the distribution, specified as a latitude-longitude pair. See Geolocation fields.
unit | required | The base unit of measurement:
name | optional | Name given to range
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "facets": {
    "location": {
      "type": "range",
      "name": "geo-range-from-san-francisco",
      "center": "37.386483, -122.083842",
      "unit": "m",
      "ranges": [
        { "from": 0, "to": 100000, "name": "Nearby" },
        { "from": 100000, "to": 300000, "name": "A longer drive." },
        { "from": 300000, "name": "Perhaps fly?" }
      ]
    }
  }
}'
{
  "meta": { ... },
  "results": [ ... ],
  "facets": {
    "location": [
      {
        "type": "range",
        "data": [
          { "key": "Nearby", "from": 0, "to": 100000, "count": 0 },
          { "key": "A longer drive.", "from": 100000, "to": 300000, "count": 0 },
          { "key": "Perhaps fly?", "from": 300000, "count": 20 }
        ],
        "name": "geo-range-from-san-francisco"
      }
    ]
  }
}
Boosts
Boosting allows you to control the relevance of a document based on criteria for the value of a field (or fields) within a document.
Different boosts are applied to different field types.
- Value boosts: text, number, date
- Functional boosts: number
- Proximity boosts: number, location
- Recency boosts: date
Boosts are disabled if source_type is set to anything but standard.
The general format for a single boost looks like so:
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "boosts": {
    "(field)": {
      (boost parameters)
    },
    "(field_2)": ...,
    ...
  }
}'
The general format for multiple boosts looks like so:
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "boosts": {
    "(field)": [
      { (boost parameter 1) },
      { (boost parameter 2) },
      ...
    ]
  }
}'
Value boosts
A value boost will boost the score of a document based on a direct value match. Available on text, number, and date fields. A document’s overall score will only be boosted once.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
boosts | optional | Boosting configuration
type | required | Type of boost. For a value boost, this should be
value | required | The value to exact match on, or use an array to match on multiple values.
operation | optional | The arithmetic operation used to combine the original document score with your boost value. Can be
factor | optional | Factor to alter the impact of a boost on the score of a document. Must be between
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "",
  "sort": { "_score": "desc" },
  "boosts": {
    "states": {
      "type": "value",
      "value": "California",
      "operation": "multiply",
      "factor": 2
    }
  },
  "result_fields": {
    "title": { "raw": {} },
    "visitors": { "raw": {} },
    "states": { "raw": {} }
  },
  "page": { "size": 100 }
}'
{
  "meta": { ... },
  "results": [
    {
      "title": { "raw": "Channel Islands" },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-10-22T17:20:58+00:00",
        "content_source_id": "5f91bf7888f9297b02a3d80c",
        "id": "park_channel-islands",
        "score": 2.0
      },
      "source": { "raw": "custom" },
      "states": { "raw": [ "California" ] },
      "last_updated": { "raw": "2020-10-22T17:20:58+00:00" },
      "visitors": { "raw": 364807.0 },
      "content_source_id": { "raw": "5f91bf7888f9297b02a3d80c" },
      "id": { "raw": "park_channel-islands" }
    },
    {
      "title": { "raw": "Death Valley" },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-10-22T17:20:58+00:00",
        "content_source_id": "5f91bf7888f9297b02a3d80c",
        "id": "park_death-valley",
        "score": 2.0
      },
      "source": { "raw": "custom" },
      "states": { "raw": [ "California", "Nevada" ] },
      "last_updated": { "raw": "2020-10-22T17:20:58+00:00" },
      "visitors": { "raw": 1296283.0 },
      "content_source_id": { "raw": "5f91bf7888f9297b02a3d80c" },
      "id": { "raw": "park_death-valley" }
    },
    ...
  ]
}
Functional boosts
A functional boost will apply a function to the overall document score based on the value of the numeric field. Only available on number fields.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
boosts | optional | Boosting configuration
type | required | Type of boost. For a functional boost, this should be
function | required | Type of function to calculate the boost value. Can be
operation | optional | The arithmetic operation used to combine the original document score with your boost value. Can be
factor | optional | Factor to alter the impact of a boost on the score of a document. Must be between
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "",
  "filters": {
    "states": ["California"]
  },
  "sort": { "_score": "desc" },
  "boosts": {
    "visitors": {
      "type": "functional",
      "function": "linear",
      "operation": "multiply",
      "factor": 2
    }
  },
  "result_fields": {
    "title": { "raw": {} },
    "visitors": { "raw": {} }
  }
}'
{
  "meta": { ... },
  "results": [
    {
      "title": { "raw": "Yosemite" },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-10-22T17:20:58+00:00",
        "content_source_id": "5f91bf7888f9297b02a3d80c",
        "id": "park_yosemite",
        "score": 1.0057736E7
      },
      "source": { "raw": "custom" },
      "last_updated": { "raw": "2020-10-22T17:20:58+00:00" },
      "visitors": { "raw": 5028868.0 },
      "content_source_id": { "raw": "5f91bf7888f9297b02a3d80c" },
      "id": { "raw": "park_yosemite" }
    },
    {
      "title": { "raw": "Joshua Tree" },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-10-22T17:20:58+00:00",
        "content_source_id": "5f91bf7888f9297b02a3d80c",
        "id": "park_joshua-tree",
        "score": 5010572.0
      },
      "source": { "raw": "custom" },
      "last_updated": { "raw": "2020-10-22T17:20:58+00:00" },
      "visitors": { "raw": 2505286.0 },
      "content_source_id": { "raw": "5f91bf7888f9297b02a3d80c" },
      "id": { "raw": "park_joshua-tree" }
    },
    ...
  ]
}
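The scores in this sample response are consistent with a simple reading of the boost: with an empty query each document appears to start from a base score of 1.0, and a linear function combined with multiply and a factor of 2 yields twice the visitors value. The Python sketch below only reproduces that arithmetic. The formula and the base score are inferences from the sample numbers, not documented behavior, and `linear_multiply_score` is a hypothetical name.

```python
def linear_multiply_score(field_value, factor, base_score=1.0):
    """Assumed formula inferred from the sample response: base * factor * value."""
    return base_score * factor * field_value


yosemite = linear_multiply_score(5028868.0, 2)     # visitors for Yosemite
joshua_tree = linear_multiply_score(2505286.0, 2)  # visitors for Joshua Tree
```

Here `yosemite` evaluates to 10057736.0 (shown as 1.0057736E7 in the response) and `joshua_tree` to 5010572.0, matching the two sample scores.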
Proximity boosts
Boost on the difference between a document value and a given value from the center parameter. Available on number and geolocation fields.
For date fields see recency boosts.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
boosts | optional | Boosting configuration
type | required | Type of boost. For a proximity boost, this should be
function | required | Type of function to calculate the boost value. Can be
center | required | The mode of the distribution. Should be a number or a set of geolocation coordinates, like
factor | optional | Factor to alter the impact of a boost on the score of a document. Must be between
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "boosts": {
    "location": {
      "type": "proximity",
      "function": "linear",
      "center": "25.32, -80.93",
      "factor": 8
    }
  },
  "query": "old growth"
}'
Recency boosts
A proximity boost, but with a timeframe as the center instead of a coordinate. Recency boosts are syntactically the same as proximity boosts; however, they operate exclusively on date/time fields. In addition, the value for these boosts can use the keyword "now" to specify the center, or origin, of the boost function as the current date/time.
Only applies to date fields.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
boosts | optional | Boosting configuration
type | required | Type of boost. For a recency boost, this should be
function | required | Type of function to calculate the boost value. Can be
center | required | Provide a time-frame. Consider using
factor | optional | Factor to alter the impact of a boost on the score of a document. Must be between
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "boosts": {
    "date_established": {
      "type": "proximity",
      "function": "linear",
      "center": "1974-01-13T05:15:12.65Z",
      "factor": 8
    }
  },
  "query": "old growth"
}'
Automatic query refinements
Automatic query refinements are on by default when source_type is set to standard. Otherwise they are disabled.
To disable automatic query refinements, set the boolean automatic_query_refinement top-level field to false:
{ … "automatic_query_refinement": {true|false, default: true} … }
automatic_query_refinement | optional | Can be
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "documents updated last week",
  "automatic_query_refinement": true
}'
{
  "meta": {
    …
    "query_refinements": {
      "submitted_query": "",
      "decorated_query_html": "<span class=\"tag tag-filter active highlighted\">documents</span> updated <span class=\"tag tag-filter active highlighted\">last week</span>",
      "refinements": [
        {
          "term": "documents",
          "position": [0,8],
          "trigger_type": "filter",
          "trigger_filter_type": "static",
          "filter": {
            "mime_type": ["application/iwork-keynote-sffkey", "application/iwork-numbers-sffnumbers",… (full list abbreviated here for clarity) … ]
          }
        },
        {
          "term": "updated last week",
          "position": [10,27],
          "trigger_type": "filter",
          "trigger_filter_type": "static",
          "filter": {
            "last_updated": { "from": "2020-04-10" }
          }
        }
      ]
    }
    …
  },
  …
}
A breakdown of the response fields might help you parse and act on what you receive:
submitted_query | The actual query that is submitted to the underlying Elasticsearch instance. May be transformed based on filter settings.
decorated_query_html | The query with the triggered terms and phrases highlighted. You might use this to display the query via HTML with stylistic decoration.
refinements | Metadata about each filter or query refinement that was created. Includes the term or phrase, the start and end character position in the original query, the filter type, and the actual filter that was built. You can re-use the returned filter fields in new search queries as they appear.
Can be: (1)
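Because the returned filter objects can be reused as they appear, one way to pin a refinement is to copy them into the filters of a follow-up request. The Python sketch below is an illustration only: the helper names are hypothetical, grouping the filters under "all" and switching refinements off in the follow-up request are assumptions rather than documented recommendations.

```python
def filters_from_refinements(meta):
    """Collect the filter objects found under meta.query_refinements.refinements."""
    refinements = meta.get("query_refinements", {}).get("refinements", [])
    return [r["filter"] for r in refinements if "filter" in r]


def follow_up_body(query, meta):
    """Build a new request body that applies the returned filters explicitly."""
    # Assumption: refinements are turned off so the filters are not derived twice.
    body = {"query": query, "automatic_query_refinement": False}
    filters = filters_from_refinements(meta)
    if filters:
        body["filters"] = {"all": filters}  # assumption: every filter must match
    return body


sample_meta = {
    "query_refinements": {
        "submitted_query": "",
        "refinements": [
            {"term": "updated last week", "filter": {"last_updated": {"from": "2020-04-10"}}}
        ],
    }
}
```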
Source type
This is used to specify the type of content sources to search. Allowed values are standard, remote or all. Defaults to standard.
The content of a standard source is stored within Workplace Search and is queryable without any external API calls to the original source endpoint.
A remote source is one that relies on the source’s search endpoint API to search content at the time when the query is issued. All remote sources are private sources, and private sources for remote content must be explicitly enabled to be made available for the credentialed user to query.
See the Content Sources Overview for a listing of content sources and which ones are considered remote.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
source_type | optional | Type of content sources to search
If the source_type is set to remote or all, options such as filters, boosts, facets and automatic query refinements are disabled, as not all remote search endpoints can support these features.
Example:
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "quick brown fox",
  "source_type": "all"
}'
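Since several options are silently disabled or ignored for non-standard sources, it can help to flag them before sending a request. The Python sketch below is a client-side check only: `ignored_options` is a hypothetical helper, and its rules are a reading of the notes in this reference (sort is ignored; boosts and facets are disabled; filters are disabled except a filter on content_source_id), not behavior verified against the service.

```python
ALLOWED_SOURCE_TYPES = {"standard", "remote", "all"}


def ignored_options(body):
    """List top-level options that would have no effect for this source_type."""
    source_type = body.get("source_type", "standard")  # documented default
    if source_type not in ALLOWED_SOURCE_TYPES:
        raise ValueError(f"unsupported source_type: {source_type!r}")
    if source_type == "standard":
        return []
    ignored = [key for key in ("sort", "boosts", "facets") if key in body]
    # Assumption: only a filter on content_source_id survives for remote sources.
    if any(field != "content_source_id" for field in body.get("filters", {})):
        ignored.append("filters")
    return ignored
```

A request with source_type set to all and a facets block would report `["facets"]`, which is a hint to drop that block or switch back to standard sources.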
Timeout
Timeout in milliseconds to use when searching remote or all sources. Default is 10000. If the timeout is reached, or the search completes within the timeout, any results that were gathered are returned.
access_token | required | Must be included in HTTP authorization headers. Acquired via an OAuth authorization flow.
timeout | optional | Timeout for remote sources
Example:
curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "quick brown fox",
  "source_type": "remote",
  "timeout": 5000
}'