Documents API
Looking for a guided introduction to documents?
See Indexing Documents.
Did you know you can use the web crawler to index web content?
See Web crawler.
Create (index), update, delete, and display documents.
Authentication
For authentication, the Documents endpoint requires:
- The name of your Engine: [ENGINE]
- A Private API Key: [PRIVATE_API_KEY]
curl -X GET '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/[ENGINE]/documents' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer [PRIVATE_API_KEY]'
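As a rough illustration, the same URL and headers can be assembled in Python. The helper name `documents_request` and its parameters are hypothetical, introduced only for this sketch; it builds the request pieces and does not send anything.

```python
from typing import Dict, Tuple


def documents_request(base_url: str, engine: str, private_api_key: str) -> Tuple[str, Dict[str, str]]:
    """Return the (url, headers) pair for the Documents endpoint.

    Hypothetical helper: it mirrors the curl example above and sends nothing.
    """
    url = f"{base_url.rstrip('/')}/api/as/v1/engines/{engine}/documents"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {private_api_key}",
    }
    return url, headers
```

The returned pair can be handed to whichever HTTP client you already use.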
Create or Update Documents
Documents are JSON objects with up to 64 key-value pairs. The key is the field name and the value is the content.
Requests, when successful, return a 200 response code. This indicates that the batch has been received, and indexing will begin.
Documents are indexed asynchronously. There will be a processing delay before they are available to your Engine.
Key points to remember when creating documents:
- Documents are sent via an array and are independently accepted and indexed, or rejected.
- A 200 response and an empty errors array denote a successful index.
- If no id is provided, a unique id will be generated.
- A document is created each time content is received without an id - beware duplicates!
- A document will be updated - not created - if its id already exists within the Engine.
- If the Engine has not seen the field before, then it will create a new field of type text.
- There is a 100 document per request limit; each document must be less than 100 KB.
- An indexing request may not exceed 10 MB.
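The limits above suggest splitting large uploads into batches before sending them. The following is a minimal sketch under stated assumptions: `batch_documents` is a hypothetical helper, document size is estimated from `json.dumps` output (the server may measure it differently), and "10 MB" is read conservatively as 10,000,000 bytes.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_DOCS_PER_REQUEST = 100      # from the "100 document per request" limit
MAX_DOC_BYTES = 102400          # per-document limit quoted in the API's error message
MAX_REQUEST_BYTES = 10_000_000  # assumption: "10 MB" read as 10,000,000 bytes


def batch_documents(docs: Iterable[Dict[str, Any]]) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of documents that respect the per-request limits."""
    batch: List[Dict[str, Any]] = []
    batch_bytes = 2  # the enclosing "[" and "]"
    for doc in docs:
        size = len(json.dumps(doc).encode("utf-8"))
        if size >= MAX_DOC_BYTES:
            raise ValueError(f"document {doc.get('id')!r} is {size} bytes, over the per-document limit")
        # Start a new batch when adding this document would break a limit.
        # The +2 covers the ", " separator json.dumps places between items.
        if batch and (
            len(batch) >= MAX_DOCS_PER_REQUEST
            or batch_bytes + size + 2 > MAX_REQUEST_BYTES
        ):
            yield batch
            batch, batch_bytes = [], 2
        batch.append(doc)
        batch_bytes += size + 2
    if batch:
        yield batch
```

Each yielded list can then be serialized and sent as the body of one POST request.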
Field name rules
Field names:
- Must contain a lowercase letter and may only contain lowercase letters, numbers, and underscores.
- Must not contain whitespace or have a leading underscore.
- Must not contain more than 64 characters.
- Must not be any of the following reserved words:
  - id
  - engine_id
  - search_index_id
  - highlight
  - any
  - all
  - none
  - or
  - and
  - not
For schema design principles, consult the API Overview.
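The field name rules above can be checked on the client before a request is sent. This is a sketch of one way to express them; `is_valid_field_name` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
import re

# Reserved words listed in the field name rules.
RESERVED_FIELD_NAMES = {
    "id", "engine_id", "search_index_id", "highlight",
    "any", "all", "none", "or", "and", "not",
}

_ALLOWED_CHARS = re.compile(r"[a-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if `name` satisfies every rule in the list above."""
    return (
        len(name) <= 64
        and _ALLOWED_CHARS.fullmatch(name) is not None  # also rejects whitespace and uppercase
        and not name.startswith("_")
        and any("a" <= ch <= "z" for ch in name)        # at least one lowercase letter
        and name not in RESERVED_FIELD_NAMES
    )
```

For example, `is_valid_field_name("square_km")` is accepted, while `"_title"`, `"Title"`, and `"any"` are rejected.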
POST <ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/{ENGINE_NAME}/documents
Example - A POST request adding one document to the national-parks-demo Engine.
curl -X POST '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx' \
  -d '[
    {
      "description": "Death Valley is the hottest, lowest, and driest place in the United States. Daytime temperatures have topped 130 °F (54 °C) and it is home to Badwater Basin, the lowest elevation in North America. The park contains canyons, badlands, sand dunes, and mountain ranges, while more than 1000 species of plants grow in this geologic graben. Additional points of interest include salt flats, historic mines, and springs.",
      "nps_link": "https://www.nps.gov/deva/index.htm",
      "states": [ "California", "Nevada" ],
      "title": "Death Valley",
      "visitors": "1296283",
      "world_heritage_site": "false",
      "location": "36.24,-116.82",
      "acres": "3373063.14",
      "square_km": "13650.3",
      "date_established": "1994-10-31T06:00:00Z",
      "id": "park_death-valley"
    }
  ]'
Example Response
[
  {
    "id": "park_death-valley",
    "errors": []
  }
]
Partial Update (PATCH)
Update specific document fields by id and field.
The id is required and new fields cannot be created using PATCH!
PATCH <ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/{ENGINE_NAME}/documents
Example - A PATCH request sending three partial updates: two reference documents by id, and one is erroneous because it omits the id.
curl -X PATCH '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx' \
  -d '[
    { "id": "park_yosemite" },
    { "title": "Everglades" },
    { "id": "park_wind-cave", "date_established": "1903-01-09T06:00:00Z" }
  ]'
Example Response
[
  { "id": "park_yosemite", "errors": [] },
  { "id": "", "errors": [ "Missing required key 'id'" ] },
  { "id": "park_wind-cave", "errors": [] }
]
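Because each document is accepted or rejected independently, it can help to separate the two groups after parsing the response. The sketch below uses a hypothetical helper, `split_results`, and keys rejections by position because a rejected item may carry an empty or null id. It assumes the response lists items in the same order as the request, which the example suggests but this page does not state.

```python
from typing import Any, Dict, List, Tuple


def split_results(response: List[Dict[str, Any]]) -> Tuple[List[str], Dict[int, List[str]]]:
    """Split a parsed POST or PATCH response into accepted ids and rejected positions.

    Returns (accepted_ids, {position_in_response: error_messages}).
    """
    accepted = [item["id"] for item in response if not item["errors"]]
    rejected = {
        index: item["errors"]
        for index, item in enumerate(response)
        if item["errors"]
    }
    return accepted, rejected


# The example response shown above, as parsed JSON.
example = [
    {"id": "park_yosemite", "errors": []},
    {"id": "", "errors": ["Missing required key 'id'"]},
    {"id": "park_wind-cave", "errors": []},
]
accepted, rejected = split_results(example)
```

With the example response, `accepted` holds the two park ids and `rejected` maps position 1 to its error message.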
Delete Documents
Delete documents by id.
Returns an array of JSON objects indicating the deleted status for each document.
DELETE <ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/{ENGINE_NAME}/documents
Example - Using DELETE to remove a pair of real documents and one example erroneous document.
curl -X DELETE '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx' \
  -d '["park_zion", "park_yosemite", "does_not_exist"]'
Example Response
[
  { "id": "park_zion", "deleted": true },
  { "id": "park_yosemite", "deleted": true },
  { "id": "does_not_exist", "deleted": false }
]
Get Documents
Retrieves one or more documents by id.
All field values are returned in string format.
To return documents in their raw format, use the Search API.
A null response will appear in the event that a requested document could not be found.
GET <ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/{ENGINE_NAME}/documents
JSON Object
A paginated array of JSON objects representing documents.
Example - Looking for two specific ids using JSON objects within the request body: "park_zion" and "does_not_exist".
curl -X GET '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx' \
  -d '["park_zion", "does_not_exist"]'
Example Response
[
  {
    "description": "Located at the junction of the Colorado Plateau, Great Basin, and Mojave Desert, this park contains sandstone features such as mesas, rock towers, and canyons, including the Virgin River Narrows. The various sandstone formations and the forks of the Virgin River create a wilderness divided into four ecosystems: desert, riparian, woodland, and coniferous forest.",
    "nps_link": "https://www.nps.gov/zion/index.htm",
    "states": [ "Utah" ],
    "title": "Zion",
    "visitors": "4295127",
    "world_heritage_site": "false",
    "location": "37.3,-113.05",
    "acres": "147237.02",
    "square_km": "595.8",
    "date_established": "1919-11-19T06:00:00Z",
    "id": "park_zion"
  },
  null
]
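One way to find out which requested ids came back as null is to pair the request with the response by position. This is a sketch only: `match_documents` is a hypothetical helper, and it assumes the response array lines up one-to-one with the requested ids, as the example above suggests but this page does not state outright.

```python
from typing import Any, Dict, List, Optional, Tuple


def match_documents(
    ids: List[str],
    response: List[Optional[Dict[str, Any]]],
) -> Tuple[Dict[str, Dict[str, Any]], List[str]]:
    """Pair requested ids with the parsed response; return (found, missing_ids).

    Assumption: response[i] corresponds to ids[i].
    """
    if len(ids) != len(response):
        raise ValueError("response length does not match the number of requested ids")
    found: Dict[str, Dict[str, Any]] = {}
    missing: List[str] = []
    for doc_id, doc in zip(ids, response):
        if doc is None:
            missing.append(doc_id)  # null placeholder: document not found
        else:
            found[doc_id] = doc
    return found, missing
```

For the request above, `missing` would contain `"does_not_exist"` and `found` would hold the Zion document.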
Query Parameters
A parameterized query.
Example - Looking for two specific ids using query parameters: "park_zion" and "does_not_exist".
curl -X GET '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents?ids%5B%5D=park_zion&ids%5B%5D=does_not_exist' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx'
Example Response
[
  {
    "description": "Located at the junction of the Colorado Plateau, Great Basin, and Mojave Desert, this park contains sandstone features such as mesas, rock towers, and canyons, including the Virgin River Narrows. The various sandstone formations and the forks of the Virgin River create a wilderness divided into four ecosystems: desert, riparian, woodland, and coniferous forest.",
    "nps_link": "https://www.nps.gov/zion/index.htm",
    "states": [ "Utah" ],
    "title": "Zion",
    "visitors": "4295127",
    "world_heritage_site": "false",
    "location": "37.3,-113.05",
    "acres": "147237.02",
    "square_km": "595.8",
    "date_established": "1919-11-19T06:00:00Z",
    "id": "park_zion"
  },
  null
]
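The encoded query string in the request above (`ids%5B%5D=...`) is the percent-encoded form of repeated `ids[]` parameters. As a small sketch, Python's standard `urllib.parse.urlencode` produces the same string; the wrapper name `ids_query_string` is hypothetical.

```python
from typing import List
from urllib.parse import urlencode


def ids_query_string(ids: List[str]) -> str:
    """Build a URL-encoded query string that repeats the ids[] parameter."""
    # urlencode percent-encodes "[" and "]" as %5B and %5D.
    return urlencode([("ids[]", doc_id) for doc_id in ids])


query = ids_query_string(["park_zion", "does_not_exist"])
# query == "ids%5B%5D=park_zion&ids%5B%5D=does_not_exist"
```

Append the result to the endpoint URL after a `?`.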
List Documents
Lists up to 10,000 documents.
All field values are returned in string format.
To return documents in their raw format, use the Search API.
GET <ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/{ENGINE_NAME}/documents/list
- page (optional) - JSON object containing current and size, where current is the current page number and size is the page size. The maximum for size is 100; a larger requested size will be truncated to 100. The default is the first page of documents with a page size of 100.
You have two options as to how you might send in your parameters:
JSON Object
A paginated array of JSON objects representing documents.
Example - Using JSON objects within the request body to see the second page of results, with a page size of 15 results per page.
curl -X GET '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents/list' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx' \
  -d '{ "page": { "current": 2, "size": 15 } }'
Example Response
{
  "meta": {
    "page": {
      "current": 2,
      "total_pages": 4,
      "total_results": 59,
      "size": 15
    }
  },
  "results": [
    {
      "description": "Death Valley is the hottest, lowest, and driest place in the United States. Daytime temperatures have topped 130 °F (54 °C) and it is home to Badwater Basin, the lowest elevation in North America. The park contains canyons, badlands, sand dunes, and mountain ranges, while more than 1000 species of plants grow in this geologic graben. Additional points of interest include salt flats, historic mines, and springs.",
      "nps_link": "https://www.nps.gov/deva/index.htm",
      "states": [ "California", "Nevada" ],
      "title": "Death Valley",
      "visitors": "1296283",
      "world_heritage_site": "false",
      "location": "36.24,-116.82",
      "acres": "3373063.14",
      "square_km": "13650.3",
      "date_established": "1994-10-31T06:00:00Z",
      "id": "park_death-valley"
    },
    {
      "description": "Centered on Denali, the tallest mountain in North America, Denali is serviced by a single road leading to Wonder Lake. Denali and other peaks of the Alaska Range are covered with long glaciers and boreal forest. Wildlife includes grizzly bears, Dall sheep, caribou, and gray wolves.",
      "nps_link": "https://www.nps.gov/dena/index.htm",
      "states": [ "Alaska" ],
      "title": "Denali",
      "visitors": "587412",
      "world_heritage_site": "false",
      "location": "63.33,-150.5",
      "acres": "4740911.16",
      "square_km": "19185.8",
      "date_established": "1917-02-26T06:00:00Z",
      "id": "park_denali"
    },
    # ... Truncated!
  ]
}
Query Parameters
A parameterized query.
Example - Using Query Parameters to see the second page of results, with a page size of 15 results per page.
curl -X GET '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/documents/list?page%5Bsize%5D=15&page%5Bcurrent%5D=2' \
  -H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxx'
Example Response
{
  "meta": {
    "page": {
      "current": 2,
      "total_pages": 4,
      "total_results": 59,
      "size": 15
    }
  },
  "results": [
    {
      "description": "Death Valley is the hottest, lowest, and driest place in the United States. Daytime temperatures have topped 130 °F (54 °C) and it is home to Badwater Basin, the lowest elevation in North America. The park contains canyons, badlands, sand dunes, and mountain ranges, while more than 1000 species of plants grow in this geologic graben. Additional points of interest include salt flats, historic mines, and springs.",
      "nps_link": "https://www.nps.gov/deva/index.htm",
      "states": [ "California", "Nevada" ],
      "title": "Death Valley",
      "visitors": "1296283",
      "world_heritage_site": "false",
      "location": "36.24,-116.82",
      "acres": "3373063.14",
      "square_km": "13650.3",
      "date_established": "1994-10-31T06:00:00Z",
      "id": "park_death-valley"
    },
    {
      "description": "Centered on Denali, the tallest mountain in North America, Denali is serviced by a single road leading to Wonder Lake. Denali and other peaks of the Alaska Range are covered with long glaciers and boreal forest. Wildlife includes grizzly bears, Dall sheep, caribou, and gray wolves.",
      "nps_link": "https://www.nps.gov/dena/index.htm",
      "states": [ "Alaska" ],
      "title": "Denali",
      "visitors": "587412",
      "world_heritage_site": "false",
      "location": "63.33,-150.5",
      "acres": "4740911.16",
      "square_km": "19185.8",
      "date_established": "1917-02-26T06:00:00Z",
      "id": "park_denali"
    },
    # ... Truncated!
  ]
}
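To read every listed document, a client can follow `meta.page.total_pages` from one page to the next. The sketch below is one possible shape for that loop, not a definitive implementation: `iter_documents` and its `fetch_page` argument are hypothetical, with `fetch_page(current, size)` assumed to perform the request and return the parsed response body shown above.

```python
from typing import Any, Callable, Dict, Iterator

MAX_PAGE_SIZE = 100  # the documented maximum for page.size


def iter_documents(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    size: int = MAX_PAGE_SIZE,
) -> Iterator[Dict[str, Any]]:
    """Yield documents page by page until the last reported page is reached.

    `fetch_page` is a caller-supplied function (hypothetical) that requests
    page `current` with page size `size` and returns the parsed JSON body.
    """
    size = min(size, MAX_PAGE_SIZE)
    current = 1
    while True:
        response = fetch_page(current, size)
        yield from response["results"]
        if current >= response["meta"]["page"]["total_pages"]:
            break
        current += 1
```

With 59 results and a page size of 15, as in the example response, the loop would request pages 1 through 4.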
Errors
After making the request, a 200 will be returned assuming the request is not malformed.
To determine the status of the request, the errors array will describe any issues:
// The response code will be 200!
[
  {
    "id": null,
    "errors": [ "Exceeds maximum allowed document size of 102400 bytes" ]
  }
]
Errors by Request Type
Each request type has a different indication that something is amiss.
POST and PATCH
A nested errors array is included as part of each document object...
[ { "id": "123456", "errors": [] } ]
The array may include one or more of the following errors:
Error Message | Solution
"Request exceeds maximum allowed limit of 100 documents" | Only 100 documents may be added per request. If you need to add a larger batch, consider writing a script that sends documents in sets of 100.
"Exceeds maximum allowed document size of 102400 bytes" | Each document must be less than 102400 bytes, or 102.4 kilobytes. The larger the request, the more time it will take. Consider breaking …
"JSON must be an array or hash" | Unstructured JSON data is not permitted. Ensure you have encapsulated your object in an array or a hash. To pass an array within query parameters, use multiple parameters with the same name followed by []. However, you must URL-encode the query string. For example, ids[]=park_zion&ids[]=does_not_exist is encoded as ids%5B%5D=park_zion&ids%5B%5D=does_not_exist.
"Invalid field type: id must not be blank" | Any field may be blank except for the id.
"Invalid field type: id must be less than 800 characters" | The id must be less than 800 characters.
"Name can only contain lowercase letters, numbers, and underscores" | Field names may only contain lowercase letters, numbers, and underscores. See Field name rules.
"Invalid field value: Value [VALUE] cannot be parsed as a float" | When a field type is set to number, the value must be a single-precision, floating-point value (32 bits). (If you need to represent a larger number, consider a …)
"Invalid field value: Value [VALUE] cannot be parsed as a date (RFC 3339)" | When a field type is set to date, the value must be a string formatted according to RFC 3339. You may omit the time, but the date at minimum must be provided: YYYY-MM-DD.
"Invalid field value: Value [VALUE] cannot be parsed as a location" | When a field type is set to geolocation, the value string must be formatted as latitude and longitude, as in "36.24,-116.82".
"Can only update existing fields and …" | A PATCH request can only update existing fields; new fields cannot be created using PATCH.
"Missing required key id" | The document id is required.
DELETE
The deleted field indicates request status: true for success and false for failure.
{ "id": "does_not_exist", "deleted": false }
GET
A null value will be returned within the response array when a document could not be found:
[
  {
    "description": "Located at the junction of the Colorado Plateau, Great Basin, and Mojave Desert, this park contains sandstone features such as mesas, rock towers, and canyons, including the Virgin River Narrows. The various sandstone formations and the forks of the Virgin River create a wilderness divided into four ecosystems: desert, riparian, woodland, and coniferous forest.",
    "nps_link": "https://www.nps.gov/zion/index.htm",
    "states": [ "Utah" ],
    "title": "Zion",
    "visitors": "4295127",
    "world_heritage_site": "false",
    "location": "37.3,-113.05",
    "acres": "147237.02",
    "square_km": "595.8",
    "date_established": "1919-11-19T06:00:00Z",
    "id": "park_zion"
  },
  null
]
What’s Next?
You should have a solid grip on indexing basics: adding data to your Engine, indexing, and then building a schema. You likely already know how to create, adjust, or destroy your Engines, but if not, that is a great next step. Otherwise, Search is where you should venture. Alternatively, if you want to adjust your Schema on account of new Field Types, that would be valuable information.