Custom sources indexing API reference
This is a technical API reference. Refer to the Custom API sources guide for a conceptual walkthrough.
Custom source API endpoints and operations
The Custom Source API supports traditional RESTful operations:
Creating or updating documents:
POST /api/ws/v1/sources/[ID]/documents/bulk_create
Deleting documents by ID:
POST /api/ws/v1/sources/[ID]/documents/bulk_destroy
Deleting documents by query:
DELETE /api/ws/v1/sources/[ID]/documents
Authenticating requests with the custom source API
Workplace Search APIs support multiple methods of authentication.
For simplicity, the examples on this page use admin auth tokens.
The auth_token is your private API key. The id identifies the Custom Source whose documents will be indexed, updated, or deleted.
curl -X POST http://localhost:3002/api/ws/v1/sources/[ID]/documents/bulk_create \
  -H "Authorization: Bearer [AUTH_TOKEN]" \
  -H "Content-Type: application/json" \
  -d ' ... '
Create a new Custom Source or navigate to the Details area of an existing Custom Source from the Workplace Search administrative dashboard to locate the access token and content source ID.
The auth_token is shared amongst all custom sources. The id value is a unique identifier for each Custom Source.
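As a rough illustration, the authenticated request from the curl example could be assembled in Python using only the standard library. The helper name build_request and the placeholder ID and token are assumptions made for this sketch and are not part of the API; the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3002"  # host used in the examples on this page


def build_request(source_id, auth_token, action, payload, method="POST"):
    """Build (but do not send) an authenticated Custom Source API request.

    build_request is a hypothetical helper for this sketch only.
    """
    url = f"{BASE_URL}/api/ws/v1/sources/{source_id}/documents"
    if action:
        url = f"{url}/{action}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method=method,
    )


# Placeholder values: substitute the ID and token shown in the Details area.
req = build_request("my-source-id", "my-auth-token", "bulk_create", [])
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a server.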
Schema management and configuration
Every Custom Source has its own unique schema, allowing you to create document repositories that truly represent the nature of the information you want your team to access via Workplace Search. Read the Custom API sources guide for a walkthrough of the process.
The following guidelines may help you create a maintainable and scalable schema:
- A Custom Source schema can be configured with up to 64 fields.
- Always index new fields as the same type as existing documents.
  - e.g. An existing date field should not receive geolocation data.
- Arrays are supported, but nested field objects are not supported.
- Fields cannot be deleted once they have been created.
- Reserved fields cannot be created:
  - external_id
  - source
  - content_source_id
  - updated_at
  - last_updated
  - highlight
  - any, all, none, or, and, not
  - engine_id
  - _allow_permissions and _deny_permissions
- A field name can only contain lowercase letters, numbers, and underscores.
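The naming guidelines above can be checked on the client before any document is sent. The following is a minimal sketch, assuming the rules apply exactly as listed; the function name validate_field_names is hypothetical, and the server remains the authority on what it accepts.

```python
import re

MAX_FIELDS = 64  # schema limit from the guidelines above
RESERVED_FIELDS = {
    "external_id", "source", "content_source_id", "updated_at",
    "last_updated", "highlight", "any", "all", "none", "or", "and",
    "not", "engine_id", "_allow_permissions", "_deny_permissions",
}
# Lowercase letters, numbers, and underscores only.
FIELD_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def validate_field_names(field_names):
    """Return a list of problems for proposed schema fields (empty if none)."""
    problems = []
    if len(field_names) > MAX_FIELDS:
        problems.append(f"too many fields: {len(field_names)} > {MAX_FIELDS}")
    for name in field_names:
        if name in RESERVED_FIELDS:
            problems.append(f"reserved field name: {name}")
        elif not FIELD_NAME_PATTERN.fullmatch(name):
            problems.append(f"invalid characters in field name: {name}")
    return problems


problems = validate_field_names(["title", "Created-At", "highlight"])
print(problems)
```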
Schema data types
Custom Source fields can be one of four different types: text, number, date, and geolocation.
text fields
Text fields are at the heart of search. They are analyzed fields and are used for full-text matching in information retrieval. Any group of characters or text that you want to search over should be text.
Example: A description of an object, the name of a product, the content of a review.
text is the default type for all new fields.
number fields
number fields represent a single-precision, floating-point value (32 bits): 3.14 or 42.
Number fields enable fine-grained sorting, filtering, faceting, and boosting.
(If you need to represent a larger number, consider a text field as a workaround.)
Example: A price, a review score, the number of visitors, or a size.
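To see why larger numbers may need the text workaround, the sketch below round-trips values through a 32-bit float with Python's struct module. It only illustrates single-precision behavior in general; the helper names are hypothetical, and how the service itself stores or rounds values is not shown here.

```python
import struct


def as_float32(value):
    """Round-trip a number through a 32-bit float to see what survives."""
    return struct.unpack("f", struct.pack("f", value))[0]


def integer_survives_float32(n):
    """True if the integer n is exactly representable as a 32-bit float."""
    try:
        return as_float32(n) == n
    except OverflowError:
        # The value is outside the range of a 32-bit float altogether.
        return False


print(integer_survives_float32(42))          # True
print(integer_survives_float32(16_777_217))  # False: 2**24 + 1 is not representable
print(as_float32(16_777_217))                # 16777216.0
```

Identifiers such as order numbers or timestamps in milliseconds commonly exceed 2**24, which is where the precision loss starts for integers.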
date fields
Dates must be in ISO 8601 format, for example "2013-02-27T18:09:19Z" or "2013-02-27T17:09:19+01:00".
Example: A product release or publish date, birth date, an air date.
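One way to produce strings in this format is Python's datetime.isoformat, sketched below. Treating naive datetimes as UTC is an assumption of this example, not something the API prescribes, and to_iso8601 is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601(dt):
    """Format a datetime as an ISO 8601 string with an explicit UTC offset."""
    if dt.tzinfo is None:
        # Assumption for this sketch: naive datetimes are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat(timespec="seconds")


utc_time = datetime(2013, 2, 27, 18, 9, 19, tzinfo=timezone.utc)
offset_time = utc_time.astimezone(timezone(timedelta(hours=1)))

print(to_iso8601(utc_time))     # 2013-02-27T18:09:19+00:00
print(to_iso8601(offset_time))  # 2013-02-27T19:09:19+01:00
```

Both strings describe the same instant; the "+00:00" form is the same style used by the created_at values in the indexing example on this page.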
geolocation fields
Geolocation fields are latitude-longitude pairs, representing locations.
Examples: A store where a product is located; the location of a venue.
Specify a geolocation using any of the following formats:
"location": "41.12,-71.34" | Geo-point expressed as a string with the format: "lat,lon"
"location": "drm3btev3e86" | Geo-point expressed as a geohash
"location": [ -71.34, 41.12 ] | Geo-point expressed as an array with the format: [ lon, lat ]
"location": "POINT (-71.34 41.12)" | Geo-point expressed as a well-known text POINT with the format: "POINT (lon lat)"
For more details, see Geo-point field type in the Elasticsearch documentation. Be aware, however, that Enterprise Search supports fewer formats than Elasticsearch: only the formats shown above are supported.
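The coordinate order differs between these formats: latitude comes first in the string form, while longitude comes first in the array and POINT forms, following the Elasticsearch geo-point conventions. The helpers below are a small sketch of that ordering (the function names are hypothetical); the geohash form is omitted because it requires an encoding step.

```python
def check_coordinates(lat, lon):
    """Raise ValueError if the pair is outside valid latitude/longitude ranges."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")


def geo_as_string(lat, lon):
    """String form: latitude first."""
    check_coordinates(lat, lon)
    return f"{lat},{lon}"


def geo_as_array(lat, lon):
    """Array form: longitude first."""
    check_coordinates(lat, lon)
    return [lon, lat]


def geo_as_wkt(lat, lon):
    """Well-known text POINT form: longitude first."""
    check_coordinates(lat, lon)
    return f"POINT ({lon} {lat})"


print(geo_as_string(41.12, -71.34))  # 41.12,-71.34
print(geo_as_array(41.12, -71.34))   # [-71.34, 41.12]
print(geo_as_wkt(41.12, -71.34))     # POINT (-71.34 41.12)
```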
Indexing and updating documents
Index new objects into a Custom Source or update existing documents.
Request limits: Maximum 100 documents per request
POST /api/ws/v1/sources/[ID]/documents/bulk_create
id | required | Unique ID for a Custom Source, provided upon creation of a Custom Source.
auth_token | required | Must be included in HTTP authorization headers.
id | optional | ID unique to a document, used to identify, modify, or delete the record at a later time. If you do not provide an id, a BSON id will be created for you.
last_updated | optional | The date and time the document was last updated. If not provided, it defaults to the current date and time, but only when the document is created.
_allow_permissions | optional | Optional for document-level permissions. When a value is set within a document, only users with a matching permission will be able to view it.
_deny_permissions | optional | Optional for document-level permissions. When a value is set within a document, users with the matching permission will be unable to view it. Read the Document permissions for custom sources guide to learn more.
curl -X POST http://localhost:3002/api/ws/v1/sources/[ID]/documents/bulk_create \
  -H "Authorization: Bearer [AUTH_TOKEN]" \
  -H "Content-Type: application/json" \
  -d '[
  {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": [],
    "id": 1234,
    "title": "The Meaning of Time",
    "body": "Not much. It is a made up thing.",
    "url": "https://example.com/meaning/of/time",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  },
  {
    "_allow_permissions": [],
    "_deny_permissions": ["permission2"],
    "id": 1235,
    "title": "The Meaning of Sleep",
    "body": "Rest, recharge, and connect to the Ether.",
    "url": "https://example.com/meaning/of/sleep",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  },
  {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": ["permission2"],
    "id": 1236,
    "title": "The Meaning of Life",
    "body": "Be excellent to each other.",
    "url": "https://example.com/meaning/of/life",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  }
]'
{
  "results": [
    { "id": "1234", "errors": [] },
    { "id": "1235", "errors": [] },
    { "id": "1236", "errors": [] }
  ]
}
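Because each request accepts at most 100 documents, larger collections need to be split on the client. The sketch below shows one way to batch documents and to pick out per-document errors from a response shaped like the one above. The helper names are hypothetical, and sending each batch is left to whatever HTTP client you use.

```python
MAX_DOCS_PER_REQUEST = 100  # request limit stated above


def batch_documents(documents, batch_size=MAX_DOCS_PER_REQUEST):
    """Yield successive slices of at most batch_size documents."""
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


def results_with_errors(response_body):
    """Return the per-document results whose "errors" list is non-empty."""
    return [r for r in response_body.get("results", []) if r.get("errors")]


documents = [{"id": i, "title": f"Document {i}"} for i in range(250)]
batch_sizes = [len(batch) for batch in batch_documents(documents)]
print(batch_sizes)  # [100, 100, 50]
```

Each batch would be serialized as a JSON array and posted to the bulk_create endpoint; checking results_with_errors on every response shows which documents need attention.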
Deleting documents
Deleting documents by ID
Remove documents by ID from a Custom Source.
POST /api/ws/v1/sources/[ID]/documents/bulk_destroy
id | required | Unique ID for a Custom Source, provided upon creation of a Custom Source.
auth_token | required | Must be included in HTTP authorization headers.
(request body) | required | An array of IDs associated with the documents to delete.
curl -X POST http://localhost:3002/api/ws/v1/sources/[ID]/documents/bulk_destroy \
  -H "Authorization: Bearer [AUTH_TOKEN]" \
  -H "Content-Type: application/json" \
  -d '[ [DOCUMENT_ID_1], [DOCUMENT_ID_2] ]'
{
  "results": [
    { "id": 1234, "success": true },
    { "id": 1235, "success": true }
  ]
}
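A client will usually want to confirm that every requested document was removed. The following sketch builds the request body as a JSON array of IDs and lists any IDs the response does not report as successful. It assumes the response shape shown above, and the helper names are hypothetical.

```python
import json


def bulk_destroy_body(document_ids):
    """Serialize document IDs as the JSON array the endpoint expects."""
    return json.dumps(list(document_ids))


def undeleted_ids(response_body):
    """Return IDs whose result entry does not report success."""
    return [
        result["id"]
        for result in response_body.get("results", [])
        if not result.get("success")
    ]


body = bulk_destroy_body([1234, 1235])
print(body)  # [1234, 1235]

response = {"results": [{"id": 1234, "success": True},
                        {"id": 1235, "success": False}]}
print(undeleted_ids(response))  # [1235]
```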
Deleting documents by query
Remove documents by query from a Custom Source.
DELETE /api/ws/v1/sources/[ID]/documents
id | required | Unique ID for a Custom Source, provided upon creation of a Custom Source.
auth_token | required | Must be included in HTTP authorization headers.
filters | optional | Query modifiers used to refine a query. A request without
last_updated | field | Name of field upon which to apply your filter. Only
from | optional | Inclusive lower bound of the range. Is required if
 | optional | Exclusive upper bound of the range. Is required if
curl -X DELETE http://localhost:3002/api/ws/v1/sources/[ID]/documents \
  -H "Authorization: Bearer [AUTH_TOKEN]" \
  -H "Content-Type: application/json" \
  -d '{
  "filters": {
    "last_updated": {
      "from": "2020-06-01T12:00:00+00:00"
    }
  }
}'
{ "total" : 234, "deleted" : 234, "failures" : [] }
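The filter body in the example above can be generated from a datetime rather than typed by hand. This is a minimal sketch covering only the "from" bound used in the example; the helper name last_updated_filter is hypothetical.

```python
import json
from datetime import datetime, timezone


def last_updated_filter(since):
    """Build a delete-by-query body matching documents updated at or after `since`."""
    return {
        "filters": {
            "last_updated": {
                # "from" is the inclusive lower bound of the range.
                "from": since.isoformat(timespec="seconds"),
            }
        }
    }


payload = last_updated_filter(datetime(2020, 6, 1, 12, 0, 0, tzinfo=timezone.utc))
print(json.dumps(payload))
```

Comparing the "total" and "deleted" counts in the response, and inspecting "failures", indicates whether the whole matching set was removed.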
Understanding document IDs
Each document within a content source must have a unique id. If you do not provide an id, a BSON id will be created for you. Two documents in two separate content sources may have the same id.
You can update an existing document by issuing a POST request that includes its id. If the id does not exist, a new document is created. It is up to you to maintain the integrity of the id for each document within each Custom API Source.
We recommend that you avoid SHAs or any identifier derived from the content of a document. Any modification of the original data will alter the value, making it difficult to identify the document in the search index. This can lead to record duplication.
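The difference between a content-derived identifier and a stable one can be seen in a few lines. In this sketch the record layout, the "wiki" prefix, and both helper names are hypothetical; the point is only that an id taken from the originating system's own key survives edits, while a hash of the content does not.

```python
import hashlib


def content_hash_id(document):
    """Derive an id from the document body (the approach advised against)."""
    return hashlib.sha1(document["body"].encode("utf-8")).hexdigest()


def stable_id(system, record_key):
    """Derive an id from the source system's own primary key."""
    return f"{system}-{record_key}"


original = {"key": 1236, "body": "Be excellent to each other."}
edited = {"key": 1236, "body": "Be excellent to each other. Always."}

# The hash changes with the edit, so the edited record would be indexed
# as a new document next to the old one.
print(content_hash_id(original) == content_hash_id(edited))  # False

# The stable id is unchanged, so the edit updates the existing document.
print(stable_id("wiki", original["key"]) == stable_id("wiki", edited["key"]))  # True
```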
Synchronizing document-level permissions for custom sources
Custom sources allow you to define, at the document level, which users may or may not see a document in their search results. Two reserved fields (_allow_permissions and _deny_permissions) accept array-type values. Using proper user mapping, you can generate sophisticated document access controls.
Deny permissions take precedence.
Read more in the Document permissions for custom sources guide.
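As a mental model only, the interaction between the two fields can be sketched as a small function. This is not the server's implementation: the function name can_view is hypothetical, and treating a document with an empty _allow_permissions list as unrestricted is an assumption of this sketch that should be checked against the Document permissions guide.

```python
def can_view(document, user_permissions):
    """Rough model of document-level access: deny permissions take precedence."""
    user = set(user_permissions)
    deny = set(document.get("_deny_permissions", []))
    allow = set(document.get("_allow_permissions", []))
    if user & deny:
        # Any matching deny permission hides the document.
        return False
    if allow:
        # When allow permissions are set, the user needs a matching one.
        return bool(user & allow)
    # Assumption: no allow permissions set means no allow restriction.
    return True


document = {
    "id": 1236,
    "_allow_permissions": ["permission1"],
    "_deny_permissions": ["permission2"],
}

print(can_view(document, ["permission1"]))                 # True
print(can_view(document, ["permission1", "permission2"]))  # False
print(can_view(document, []))                              # False
```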