Customizing content source filters

In this guide, we’ll walk through enabling filters on a custom API source and then hiding a filter on a first-party source. You can also enable additional filters on first-party sources. Filters controlled through facet configurations do not affect the Search API, which provides its own mechanism for controlling filters. However, automatic query refinement configurations affect both the native search experience and the Search API.

Customizing filters currently relies on the Workplace Search API. The steps in this guide assume that you are familiar with authenticating to the API. We’ll use admin user access tokens in these examples.

Add a new filter

Here we’ll enable faceted filters on a custom API source. These steps assume you have already created a custom API source and indexed some documents. Here is what our search experience looks like:

We’ll add a filter for "State". The documents we have already indexed in this example generally look like:

{
  "description": "Covering most of Mount Desert Island and [...]",
  "nps_link": "https://www.nps.gov/acad/index.htm",
  "states": [
    "Maine"
  ],
  "title": "Acadia",
  "id": "park_acadia",
  "visitors": 3303393,
  "world_heritage_site": false,
  "location": "44.35,-68.21",
  "acres": 49057.36,
  "square_km": 198.5,
  "date_established": "1919-02-26T06:00:00Z"
}

Step 1. First, we need to retrieve the custom API source’s definition. We’ll retrieve it from the API with the following cURL command:

curl \
--request "GET" \
--url "${ENTERPRISE_SEARCH_URL}/api/ws/v1/sources" \
--header "Authorization: Bearer ${ACCESS_TOKEN}"

This gives us a response like the following:

{
  "meta": { ... },
  "results":
  [
    {
      "id": "60dde90aa1c4934782d5b939",
      "service_type": "custom",
      ...
    }
  ]
}

In the above response data, the custom API source we’re looking for has an id of 60dde90aa1c4934782d5b939. You can find name values in the source definitions to help identify which source you’re looking for.
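As a rough Python counterpart to this step, the sketch below builds the same GET request with the standard library and picks source ids out of a parsed response by name. The helper names build_list_sources_request and find_source_ids are hypothetical, and the sketch assumes each entry in results carries a name value, as described above.

```python
import urllib.request


def build_list_sources_request(base_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that lists content sources."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/ws/v1/sources",
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def find_source_ids(sources_response: dict, name: str) -> list:
    """Return the id of every source in the parsed response with this name."""
    return [
        source["id"]
        for source in sources_response.get("results", [])
        if source.get("name") == name
    ]
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.load. Returning a list rather than a single id keeps the helper honest if more than one source happens to share a name.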


Step 2. Next, we need to identify the name of the field we want to enable as a faceted field. Using this example of a document indexed for these instructions:

{
  "description": "Covering most of Mount Desert Island and [...]",
  "nps_link": "https://www.nps.gov/acad/index.htm",
  "states": [
    "Maine"
  ],
  "title": "Acadia",
  "id": "park_acadia",
  "visitors": 3303393,
  "world_heritage_site": false,
  "location": "44.35,-68.21",
  "acres": 49057.36,
  "square_km": 198.5,
  "date_established": "1919-02-26T06:00:00Z"
}

We can see that the field containing state data is called states.
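If a document has many fields, a small helper can locate the field that holds a known value. This is only an illustrative sketch: fields_containing is a hypothetical name, and it checks top-level fields only, treating list values (like states) as a set of candidate values.

```python
def fields_containing(document: dict, value) -> list:
    """Return the names of top-level fields whose value equals, or whose list contains, value."""
    matches = []
    for field, field_value in document.items():
        # Normalize scalars to a one-element list so both shapes are handled alike.
        values = field_value if isinstance(field_value, list) else [field_value]
        if value in values:
            matches.append(field)
    return matches


sample_document = {
    "title": "Acadia",
    "id": "park_acadia",
    "states": ["Maine"],
    "visitors": 3303393,
}

print(fields_containing(sample_document, "Maine"))  # prints ['states']
```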


Step 3. Then, we need to choose a display label for our new filter. For this example, we’ll use State.


Step 4. We can then use the source definition from Step 1 to build a facet configuration request. In this request, we’re using the custom API source id in the --url value as CONTENT_SOURCE_ID, and we’re using the custom API source definition as our --data content. The facets key/value pair has been moved to the beginning of the JSON body to highlight what has been changed:

curl \
--request "PUT" \
--url "${ENTERPRISE_SEARCH_URL}/api/ws/v1/sources/${CONTENT_SOURCE_ID}" \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header "Content-Type: application/json" \
--data '
{
  "facets":
  {
    "overrides":
    [
      {
        "field": "states",
        "enabled": true,
        "display_name": "State"
      }
    ]
  },
  "id": "60dde90aa1c4934782d5b939",
  "service_type": "custom",
  "created_at": "2021-07-01T16:10:50+00:00",
  "last_updated_at": "2021-07-01T16:10:53+00:00",
  "is_remote": false,
  "details":
  [],
  "groups":
  [
    {
        "id": "60b7ab98a1c4935a50c94864",
        "name": "Default"
    },
    {
        "id": "60dde915a1c493e7c239e083",
        "name": "Bootstrapped Users"
    }
  ],
  "name": "National Parks",
  "context": "organization",
  "is_searchable": true,
  "schema":
  {
    "description": "text",
    "title": "text",
    "nps_link": "text",
    "states": "text",
    "visitors": "number",
    "square_km": "number",
    "world_heritage_site": "text",
    "date_established": "date",
    "location": "geolocation",
    "acres": "number"
  },
  "display":
  {
    "title_field": "title",
    "subtitle_field": "states",
    "description_field": "description",
    "url_field": "nps_link",
    "type_field": null,
    "media_type_field": null,
    "created_by_field": null,
    "updated_by_field": null,
    "detail_fields":
    [
      {
          "field_name": "states",
          "label": "States"
      },
      {
          "field_name": "description",
          "label": "Description"
      },
      {
          "field_name": "date_established",
          "label": "Established"
      },
      {
          "field_name": "visitors",
          "label": "Number of Visitors"
      },
      {
          "field_name": "location",
          "label": "Location"
      },
      {
          "field_name": "world_heritage_site",
          "label": "World Heritage Site?"
      },
      {
          "field_name": "square_km",
          "label": "Square Kilometers"
      },
      {
          "field_name": "acres",
          "label": "Acres"
      }
    ],
    "color": "#000000"
  },
  "document_count": 59,
  "last_indexed_at": "2021-07-01T16:10:53+00:00"
}
'
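Editing a large JSON body by hand is error-prone, so the same request could be assembled programmatically. The sketch below is one possible approach, not an official client: with_facet_override and build_update_request are hypothetical helpers that copy the source definition from Step 1, add or replace the override for one field, and build the PUT request without sending it.

```python
import copy
import json
import urllib.request


def with_facet_override(source_definition, field, enabled, display_name=None):
    """Return a copy of the definition with a facet override set for one field."""
    updated = copy.deepcopy(source_definition)
    override = {"field": field, "enabled": enabled}
    if display_name is not None:
        override["display_name"] = display_name
    facets = updated.get("facets") or {}
    # Keep overrides for other fields; replace any existing one for this field.
    others = [o for o in (facets.get("overrides") or []) if o.get("field") != field]
    facets["overrides"] = others + [override]
    updated["facets"] = facets
    return updated


def build_update_request(base_url, access_token, source_definition):
    """Build (but do not send) the PUT request that updates the source."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/ws/v1/sources/{source_definition['id']}",
        method="PUT",
        data=json.dumps(source_definition).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Working on a deep copy leaves the definition fetched in Step 1 untouched, which makes it easy to compare the payload before and after the change.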

Step 5. Now we can see the search experience includes a filter for State. Note that you generally need to filter on a specific source to expose filters for that source (in this case the National Parks source).

Enabling automatic query refinement

Content source automatic query refinements define a sort of domain-specific language that lets a searcher limit results using natural language queries. When a query contains one of a field’s configured query_expansion_phrases followed by a specific value from that field, only documents with that value are returned.
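To make the idea concrete, here is a toy illustration of the "phrase followed by value" pattern. It is not how Workplace Search implements refinement; detect_refinements is a hypothetical function that uses plain, case-insensitive substring matching against a list of known field values.

```python
def detect_refinements(query, phrases, known_values):
    """Return known values that appear directly after a configured phrase."""
    # Pad with spaces so that only whole words match ("in" must not match "inside").
    padded_query = f" {' '.join(query.lower().split())} "
    found = []
    for value in known_values:
        for phrase in phrases:
            pattern = f" {phrase.lower()} {value.lower()} "
            if pattern in padded_query and value not in found:
                found.append(value)
    return found


phrases = ["in", "inside"]
states = ["Alaska", "California"]
print(detect_refinements("mountains in Alaska", phrases, states))  # prints ['Alaska']
```

A query such as "mountains of Alaska" yields no refinement in this sketch, because "of" is not one of the configured phrases.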

For example, you may want to enable automatic query refinement for this field, so that the user does not need to manually check the box for State: Alaska when looking for parks only in Alaska.

To do this:

  • Ensure the index contains documents with states: Alaska (we saw this in the facets above).
  • Configure the states field to have query_expansion_phrases of ["in", "inside"].

curl \
--request "PUT" \
--url "${ENTERPRISE_SEARCH_URL}/api/ws/v1/sources/${CONTENT_SOURCE_ID}" \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header "Content-Type: application/json" \
--data '
{
  "facets":
  {
    "overrides":
    [
      {
        "field": "states",
        "enabled": true,
        "display_name": "State"
      }
    ]
  },
  "automatic_query_refinement":
  {
    "overrides":
    [
      {
        "field": "states",
        "enabled": true,
        "is_person": false,
        "query_expansion_phrases":
        [
          "in",
          "inside"
        ]
      }
    ]
  },
  "id": "60dde90aa1c4934782d5b939",
  "service_type": "custom",
  "created_at": "2021-07-01T16:10:50+00:00",
  "last_updated_at": "2021-07-01T16:10:53+00:00",
  "is_remote": false,
  "details":
  [],
  "groups":
  [
    {
        "id": "60b7ab98a1c4935a50c94864",
        "name": "Default"
    },
    {
        "id": "60dde915a1c493e7c239e083",
        "name": "Bootstrapped Users"
    }
  ],
  "name": "National Parks",
  "context": "organization",
  "is_searchable": true,
  "schema":
  {
    "description": "text",
    "title": "text",
    "nps_link": "text",
    "states": "text",
    "visitors": "number",
    "square_km": "number",
    "world_heritage_site": "text",
    "date_established": "date",
    "location": "geolocation",
    "acres": "number"
  },
  "display":
  {
    "title_field": "title",
    "subtitle_field": "states",
    "description_field": "description",
    "url_field": "nps_link",
    "type_field": null,
    "media_type_field": null,
    "created_by_field": null,
    "updated_by_field": null,
    "detail_fields":
    [
      {
          "field_name": "states",
          "label": "States"
      },
      {
          "field_name": "description",
          "label": "Description"
      },
      {
          "field_name": "date_established",
          "label": "Established"
      },
      {
          "field_name": "visitors",
          "label": "Number of Visitors"
      },
      {
          "field_name": "location",
          "label": "Location"
      },
      {
          "field_name": "world_heritage_site",
          "label": "World Heritage Site?"
      },
      {
          "field_name": "square_km",
          "label": "Square Kilometers"
      },
      {
          "field_name": "acres",
          "label": "Acres"
      }
    ],
    "color": "#000000"
  },
  "document_count": 59,
  "last_indexed_at": "2021-07-01T16:10:53+00:00"
}
'

Now when your users issue queries like mountains in Alaska or mountains inside California, the state in their search box is highlighted in blue, and the corresponding facet is checked automatically.
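The automatic_query_refinement portion of the request above could also be generated rather than typed. The following is a minimal sketch under the assumption that the override keys are exactly those shown in the request (field, enabled, is_person, query_expansion_phrases); with_query_refinement is a hypothetical helper name.

```python
import copy


def with_query_refinement(source_definition, field, phrases, is_person=False):
    """Return a copy of the definition with a refinement override enabled for one field."""
    updated = copy.deepcopy(source_definition)
    override = {
        "field": field,
        "enabled": True,
        "is_person": is_person,
        "query_expansion_phrases": list(phrases),
    }
    refinement = updated.get("automatic_query_refinement") or {}
    # Keep overrides for other fields; replace any existing one for this field.
    others = [o for o in (refinement.get("overrides") or []) if o.get("field") != field]
    refinement["overrides"] = others + [override]
    updated["automatic_query_refinement"] = refinement
    return updated
```

The returned dictionary can be serialized with json.dumps and sent as the --data body of the PUT request shown above.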

Disable an existing filter

Here we’ll hide a filter on a first-party source. These steps assume you have already connected a source. This example will involve a Dropbox source with a number of test documents. Here’s the search experience without any modifications. Note that the search is filtering on Dropbox only to make use of Dropbox filters:

We might decide that the Media Type and Extension filters are redundant, so we’d like to hide the Extension filter.


Step 1. Using the same request as our custom API source example above, we’ll first request all sources to retrieve the source definition of the Dropbox source:

curl \
--request "GET" \
--url "${ENTERPRISE_SEARCH_URL}/api/ws/v1/sources" \
--header "Authorization: Bearer ${ACCESS_TOKEN}"

This gives us a response of:

{
  "meta": { ... },
  "results":
  [
    {
      "id": "60dde90aa1c4934782d5b939",
      "service_type": "custom",
       ...
    },
    {
      "id": "60de02d9a1c4934b6efe24db",
      "service_type": "dropbox",
      ...
    }
  ]
}

In the above response data, the Dropbox source we’re looking for, which has a service_type of dropbox, has an id of 60de02d9a1c4934b6efe24db.


Step 2. Next, we can use the source definition from the previous step to build a facet configuration request that disables the filter on the extension field. You can find default filter field names in each source’s individual guide (example: Dropbox fields). In this request, we’re using the source’s definition from the previous response as our --data content. However, note that in addition to moving the facets and automatic_query_refinement key/value pairs to the beginning of the JSON body to highlight what has been changed, the schema and display fields have been removed. You cannot change schema and display for this source over this API endpoint, so they are ignored if present (some sources, like Custom API and Salesforce, do allow setting these fields over the API):

curl \
--request "PUT" \
--url "${ENTERPRISE_SEARCH_URL}/api/ws/v1/sources/${CONTENT_SOURCE_ID}" \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header "Content-Type: application/json" \
--data '
{
  "facets":
  {
    "overrides":
    [
      {
        "field": "extension",
        "enabled": false
      }
    ]
  },
  "automatic_query_refinement":
  {
    "overrides":
    [
      {
        "field": "extension",
        "enabled": false,
        "is_person": false,
        "query_expansion_phrases": []
      }
    ]
  },
  "id": "60de02d9a1c4934b6efe24db",
  "service_type": "dropbox",
  "created_at": "2021-07-01T18:00:57+00:00",
  "last_updated_at": "2021-07-01T18:00:59+00:00",
  "is_remote": false,
  "details":
  [
      {
          "title": "Email",
          "description": "swiftype-eng@elastic.co"
      }
  ],
  "groups":
  [
    {
        "id": "60b7ab98a1c4935a50c94864",
        "name": "Default"
    }
  ],
  "name": "Dropbox",
  "context": "organization",
  "is_searchable": true,
  "document_count": 74,
  "last_indexed_at": null
}
'
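The payload preparation in this step (drop schema and display, then disable the field in both override lists) can be sketched as a small function. hide_filter is a hypothetical name, and note the stated assumption that it replaces any overrides already present instead of merging with them.

```python
import copy

# Fields that this guide says are ignored for a source like Dropbox.
IGNORED_FIELDS = ("schema", "display")


def hide_filter(source_definition, field):
    """Return a copy of the definition that disables the filter for one field."""
    updated = copy.deepcopy(source_definition)
    for key in IGNORED_FIELDS:
        updated.pop(key, None)
    # Assumption: existing overrides are replaced, matching the request above.
    updated["facets"] = {"overrides": [{"field": field, "enabled": False}]}
    updated["automatic_query_refinement"] = {
        "overrides": [
            {
                "field": field,
                "enabled": False,
                "is_person": False,
                "query_expansion_phrases": [],
            }
        ]
    }
    return updated
```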

Step 3. Finally, we can see the search experience hides the Extension filter:

Re-enable a disabled filter

If you want to re-enable the Extension filter from the above steps, it’s natural to think that you should just change the enabled attribute from false to true. However, the overrides array describes only deviations from the source’s default facet configuration. The Dropbox source already has an Extension filter by default, so update the source with an empty array for overrides in order to re-enable the Extension filter:

curl \
--request "PUT" \
--url "${ENTERPRISE_SEARCH_URL}/api/ws/v1/sources/${CONTENT_SOURCE_ID}" \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header "Content-Type: application/json" \
--data '
{
  "facets":
  {
    "overrides": []
  },
  "automatic_query_refinement":
  {
    "overrides": []
  },
  "id": "60de02d9a1c4934b6efe24db",
  "service_type": "dropbox",
  "created_at": "2021-07-01T18:00:57+00:00",
  "last_updated_at": "2021-07-01T18:00:59+00:00",
  "is_remote": false,
  "details":
  [
      {
          "title": "Email",
          "description": "swiftype-eng@elastic.co"
      }
  ],
  "groups":
  [
    {
        "id": "60b7ab98a1c4935a50c94864",
        "name": "Default"
    }
  ],
  "name": "Dropbox",
  "context": "organization",
  "is_searchable": true,
  "document_count": 74,
  "last_indexed_at": null
}
'
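Resetting a source to its defaults amounts to sending empty overrides arrays, as in the request above. A minimal sketch of that reset follows; reset_overrides is a hypothetical helper, and it clears every override on the source, not only the one for extension.

```python
import copy


def reset_overrides(source_definition):
    """Return a copy of the definition with all facet and refinement overrides cleared."""
    updated = copy.deepcopy(source_definition)
    updated["facets"] = {"overrides": []}
    updated["automatic_query_refinement"] = {"overrides": []}
    return updated
```

If other overrides on the same source should be kept, remove only the entry for the relevant field from each overrides list instead of clearing both lists.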

Finally, although facet configurations allow for a display_name value, renaming a default filter is not currently supported.