Connecting custom sources

Workplace Search provides a variety of popular content sources like GitHub, Google Drive, Salesforce, and Dropbox. It also offers a powerful and straightforward way to connect data sources for which a first-party integration does not exist, whether it is a modern homegrown data platform, a legacy system, or anything in between. Enter custom sources and their extensive set of indexing and synchronization APIs.

This guide walks you through the high-level setup of custom sources, from data ingestion to tuning and display configuration.

For a more concise, technical treatment, see the Custom sources indexing API reference.

Understanding custom source basics

Let’s start with a few key points:

  1. You can create any number of custom sources.
  2. Custom sources are created as individual containers, for which a unique schema and relevance attributes can be defined.
  3. Custom sources support any type of data, so long as it adheres to a schema of your choosing and some basic stylistic parameters.
  4. When configured correctly, custom sources inherit many of the features of first-party source integrations: automatic filtering at query time, source prioritization, customizable display of results, and more.

Creating a custom source

A Custom Source is a first-class content source. It can be access controlled like any other content source.


Step 1. From the Workplace Search administrative dashboard’s Sources area, click Add a Shared Content Source, and select Custom API Source.


Step 2. Name the new source you want to create. The name will appear as-is in the various search experiences and management interfaces. Click Continue.


Step 3. That’s it. The source has now been created and assigned a source identifier (CONTENT_SOURCE_ID).

Figure 8. Creating a custom source

With this new source created, we can now prepare it for incoming data. The following examples use a bearer token for API access, but other options are available. See API Authentication Reference.

Managing a custom source’s schema

Every Custom Source has its own unique schema, allowing you to create document repositories that truly represent the nature of the information you want your team to access via Workplace Search.

There are two ways to approach schema creation with Workplace Search. You may either define a schema manually from the Schema configuration interface for the Custom Source, or let Workplace Search generate a schema automatically as you index new content into the source. Any new field added to a document is picked up as an addition to the source’s schema, and every field added this way is created as a text field. The type can be changed later, as long as the existing values meet the requirements of the new field type (e.g. a date in the correct format).
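The automatic behavior described above can be modeled in a few lines. The following is a minimal sketch of the idea, not Workplace Search's implementation: the function name `infer_schema` and the plain-dictionary schema are hypothetical, and leaving the two reserved permission fields out of the schema is an assumption made for illustration.

```python
# Illustrative model only; not Workplace Search's actual implementation.
# Assumption: the reserved permission fields are not part of the schema.
RESERVED_FIELDS = {"_allow_permissions", "_deny_permissions"}


def infer_schema(documents, existing_schema=None):
    """Return a schema in which every previously unseen field is typed as text."""
    schema = dict(existing_schema or {})
    for document in documents:
        for field_name in document:
            if field_name in RESERVED_FIELDS:
                continue
            # Fields already in the schema keep their type; new ones default to "text".
            schema.setdefault(field_name, "text")
    return schema


documents = [
    {"id": 1234, "title": "The Meaning of Time", "created_at": "2019-06-01T12:00:00+00:00"},
    {"id": 1235, "title": "The Meaning of Sleep", "type": "list", "_allow_permissions": []},
]
schema = infer_schema(documents, existing_schema={"created_at": "date"})
print(schema)
```

In this model a field that was predefined manually (here `created_at` as a date) keeps its type, while fields first seen during indexing start out as text.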

To manually configure the schema for a source, or to update an existing schema field:


Step 1. From the Workplace Search administrative dashboard’s Sources area, locate the Custom Source you would like to configure, and click Details.


Step 2. Navigate to the Schema view from the sidebar menu.


Step 3. If a schema exists, fields will appear with their associated field type values. If you are creating a new schema, the Add Field action should be presented to you:

Figure 9. Configuring the schema

Step 4. Name the field and set its type accordingly. Click Add Field.


Step 5. Repeat the process for all fields you’d like to manually predefine.

For more information on schema configuration, including reserved fields, naming convention, and field types, refer to the Custom sources indexing API reference.
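Before changing a text field to another type, it can help to confirm that the values already indexed would satisfy the new type. The sketch below does this for dates, assuming timestamps shaped like the `created_at` values used elsewhere in this guide (a date, a time, and a UTC offset). The helper names are hypothetical, and the API reference remains the authority on which formats are actually accepted.

```python
from datetime import datetime


def is_valid_date(value):
    """Return True if value parses as a timestamp that carries a UTC offset."""
    if not isinstance(value, str):
        return False
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    candidate = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(candidate)
    except ValueError:
        return False
    # Assumption: a usable date value includes a time zone offset.
    return parsed.tzinfo is not None


def can_convert_to_date(documents, field_name):
    """Check every value present for field_name before switching its type to date."""
    values = [doc[field_name] for doc in documents if field_name in doc]
    return bool(values) and all(is_valid_date(value) for value in values)


documents = [
    {"id": 1234, "created_at": "2019-06-01T12:00:00+00:00"},
    {"id": 1235, "created_at": "2019-06-01T12:00:00Z"},
]
print(can_convert_to_date(documents, "created_at"))
```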

Managing document-level permissions for custom sources

Custom sources allow you to define, at the document level, which users may or may not see a document in their search results. Two reserved fields (_allow_permissions and _deny_permissions) accept array-type values. Combined with proper user mapping, they let you build sophisticated document access controls.

Deny permissions take precedence over allow permissions.

Read more in the Document permissions for custom sources guide.
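To make the precedence rule concrete, here is a simplified model of how the two reserved fields might combine for a single user. It is a sketch under stated assumptions, not the actual evaluation logic: the function `can_access` is hypothetical, and it assumes a user needs at least one matching allow permission to see a document. The permissions guide describes the real behavior.

```python
def can_access(user_permissions, document):
    """Simplified model of document-level access for one user and one document."""
    user = set(user_permissions)
    allowed = user & set(document.get("_allow_permissions", []))
    denied = user & set(document.get("_deny_permissions", []))
    # Deny permissions take precedence over allow permissions.
    if denied:
        return False
    # Assumption: at least one matching allow permission is required.
    return bool(allowed)


document = {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": ["permission2"],
    "id": 1236,
}

print(can_access(["permission1"], document))
print(can_access(["permission1", "permission2"], document))
```

In this model the second user is refused even though they hold an allowed permission, because they also hold a denied one.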

Synchronizing data via custom sources

Data ingestion with custom sources is straightforward. You may push data to a custom source using any programming framework or language, and Workplace Search also provides official API clients for several programming languages.

See Programming language clients in the Enterprise Search documentation.

Indexing a document

To index documents, issue a POST request to the Custom Source’s unique API endpoint, with the CONTENT_SOURCE_ID in the request path and a valid TOKEN in the Authorization header. (See API Authentication Reference for coverage of token types and other options for API access.)

curl -X POST <ENTERPRISE_SEARCH_BASE_URL>/api/ws/v1/sources/[CONTENT_SOURCE_ID]/documents/bulk_create \
-H "Authorization: Bearer [TOKEN]" \
-H "Content-Type: application/json" \
-d '[
  {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": [],
    "id" : 1234,
    "title" : "The Meaning of Time",
    "body" : "Not much. It is a made up thing.",
    "url" : "https://example.com/meaning/of/time",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  },
  {
    "_allow_permissions": [],
    "_deny_permissions": ["permission2"],
    "id" : 1235,
    "title" : "The Meaning of Sleep",
    "body" : "Rest, recharge, and connect to the Ether.",
    "url" : "https://example.com/meaning/of/sleep",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  },
  {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": ["permission2"],
    "id" : 1236,
    "title" : "The Meaning of Life",
    "body" : "Be excellent to each other.",
    "url" : "https://example.com/meaning/of/life",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  }
]'
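
The same request can be assembled in any language. As a minimal sketch using only Python's standard library, the helper below builds the POST request from the curl example without sending it. The function name `build_bulk_create_request` and the base URL `https://enterprise-search.example.com` are hypothetical placeholders, as are the source ID and token values.

```python
import json
import urllib.request


def build_bulk_create_request(base_url, content_source_id, token, documents):
    """Build (but do not send) the POST request shown in the curl example."""
    url = (
        f"{base_url.rstrip('/')}/api/ws/v1/sources/"
        f"{content_source_id}/documents/bulk_create"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(documents).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


documents = [
    {
        "_allow_permissions": ["permission1"],
        "_deny_permissions": [],
        "id": 1234,
        "title": "The Meaning of Time",
        "body": "Not much. It is a made up thing.",
        "url": "https://example.com/meaning/of/time",
        "created_at": "2019-06-01T12:00:00+00:00",
        "type": "list",
    }
]

# Placeholder values; substitute your own deployment URL, source ID, and token.
req = build_bulk_create_request(
    "https://enterprise-search.example.com", "CONTENT_SOURCE_ID", "TOKEN", documents
)
print(req.get_method(), req.full_url)

# To send the request, uncomment the following lines:
# with urllib.request.urlopen(req) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes the URL, headers, and payload easy to inspect before any data reaches the source. The official API clients mentioned above are likely the better choice for production use.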

It’s best to think of these objects as stand-ins for the larger documents to which they link. Fields should be descriptive enough that your users can find what they need even with loosely worded queries.

The id field is vital for keeping your custom source up to date as it will be used to update and delete any given document over time.

Each indexed document implicitly receives a field called updated_at, which tracks the time at which the document was indexed. You cannot use this field name in your own documents.
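
Both points about id and updated_at can be checked on the client side before documents are sent. The following is a minimal sketch with a hypothetical helper, `find_document_problems`. Treating a missing id as a problem is a choice made here, because such a document would be hard to update or delete later. It is not a statement about what the API itself rejects.

```python
def find_document_problems(document):
    """Return a list of issues worth fixing before a document is indexed."""
    problems = []
    if "id" not in document:
        # Without a stable id the document cannot be targeted for updates or deletes.
        problems.append("missing 'id'")
    if "updated_at" in document:
        # updated_at is set implicitly at index time and is not available for custom use.
        problems.append("'updated_at' is reserved")
    return problems


good = {"id": 1234, "title": "The Meaning of Time"}
bad = {"title": "Untitled", "updated_at": "2019-06-01T12:00:00+00:00"}

print(find_document_problems(good))
print(find_document_problems(bad))
```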

For more information on document manipulation and deletion, check out the Custom sources indexing API reference.

Configuring display settings for a custom source

Display Settings bring your documents to life with proper information architecture and evocative color schemes. To create a search experience that’s both relevant and easy to use, Workplace Search lets you adjust the visual representation of two types of results and their details view:

  • Standard Result: a result as it appears on the main results page
  • Featured Result: usually a direct match on a search term, with high precision
  • Details View: the panel that appears upon clicking a result, which provides a quick view of the document and may provide enough information from the search experience itself

The Display Settings area gives you control over 9 main components for results:

  1. Title: The result’s headline
  2. URL: The result’s target (usually the document’s location)
  3. Color: The color used to identify the results across the source
  4. Subtitle: Optional, often used for status or type
  5. Description: Optional, often used to display an excerpt of the document’s content
  6. Type: Optional, used to indicate the document type
  7. Media Type: Optional, used to indicate the document media type
  8. Created By: Optional, used to indicate the document creator
  9. Updated By: Optional, used to indicate the document modifier

Figure 10. Custom source result display settings

By default, the fields are populated in alphabetical order according to the schema.

The Display Settings area also allows you to add fields to the Details view and reorder them.

Figure 11. Custom source result display settings

Recommended field names

Each content source can have a unique schema. However, using consistent fields across content sources improves the search experience for users, particularly when using customized faceted filters and automatic query refinement. Therefore, when designing your schema, consider using the following common field names where applicable:

General fields

  • body
  • comments
  • description
  • tags
  • title
  • type
  • url

File/Document fields

  • extension
  • mime_type
  • path
  • size

People/Human fields

  • assigned_to
  • created_by
  • created_by_email
  • shared_by
  • shared_by_email
  • updated_by
  • updated_by_email

Issue tracking fields

  • impact
  • priority
  • state
  • status
  • urgency
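
When designing a schema, the lists above can be turned into a quick review aid. This is a minimal sketch with hypothetical names (`RECOMMENDED_FIELDS`, `non_standard_fields`). It only reports which of your field names fall outside the recommended set, so you can decide whether a common name would fit instead. Fields outside the lists are still perfectly valid.

```python
RECOMMENDED_FIELDS = {
    "general": {"body", "comments", "description", "tags", "title", "type", "url"},
    "file": {"extension", "mime_type", "path", "size"},
    "people": {
        "assigned_to", "created_by", "created_by_email", "shared_by",
        "shared_by_email", "updated_by", "updated_by_email",
    },
    "issue_tracking": {"impact", "priority", "state", "status", "urgency"},
}


def non_standard_fields(schema_fields):
    """Return, in sorted order, the field names that are not in the recommended lists."""
    recommended = set().union(*RECOMMENDED_FIELDS.values())
    return sorted(set(schema_fields) - recommended)


print(non_standard_fields(["title", "body", "author", "mime_type", "ticket_state"]))
```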