Connecting SharePoint Online

You can deploy and run this connector on your own infrastructure using the SharePoint Online connector package.

SharePoint Online by Microsoft is a cloud-based collaboration, knowledge management and storage platform for organizations of all sizes. Often used as a centralized content management system (CMS), SharePoint Online stores a wealth of information across departments and teams. The SharePoint Online connector provided with Workplace Search automatically captures, syncs and indexes the following items:

Stored Files

Including ID, File Metadata, File Content, Updated by, and timestamps

Known issues

  1. When configured after November 8, 2020, the SharePoint Online connector must use an application set up by an Azure AD admin, with Admin Consent granted. Therefore, private sources are not supported. Organization sources are supported when connected by an Azure AD admin user, or when Admin Consent or Admin Consent Workflows are enabled. Refer to the official Microsoft documentation for an overview of User and Admin consent.

    During configuration, you register an OAuth app in Azure AD that does not have a verified publisher. After November 8, 2020, these apps can be connected by Azure AD admin users only.

  2. The connecting Azure AD user must have permission to access the sites of every group and team that the user is able to query. Without access to these sites, the sync may fail with a 403 error.

Note about SharePoint Teams-connected Sites

A SharePoint site is a web site in SharePoint where you can create web pages and store and collaborate on files. SharePoint sites can be used independently and are also used by Teams for file storage (these are called Teams-connected sites). A Teams-connected site is created automatically whenever you create a team.

When using the Workplace Search SharePoint Online connector, you must ensure that the SharePoint Online service account has sufficient permissions on all group and team sites you need to sync.

Note the following details:

  • Any site of the sharepoint.address.com/sites/SitePath type implicitly grants access to Azure Global Admin and SharePoint Admin.
  • A site of the sharepoint.address.com/teams/SitePath type does not.
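
When auditing which sites need an explicit permission check for the service account, it can help to sort site URLs by type. The following is a minimal sketch, not part of the connector: the function name `site_type` and the example URLs are hypothetical, and it looks only at the first path segment of the URL.

```python
from urllib.parse import urlparse


def site_type(site_url: str) -> str:
    """Return 'sites', 'teams', or 'other' based on the first path segment.

    Hypothetical helper: it only inspects the URL shape described in the
    note above and does not contact SharePoint.
    """
    segments = [s for s in urlparse(site_url).path.split("/") if s]
    if segments and segments[0].lower() in ("sites", "teams"):
        return segments[0].lower()
    return "other"


# Example URLs are placeholders; include the scheme so the path parses correctly.
urls = [
    "https://sharepoint.address.com/sites/SitePath",
    "https://sharepoint.address.com/teams/SitePath",
]

# /teams/ sites do not implicitly grant admin access, so list them for review.
to_review = [u for u in urls if site_type(u) == "teams"]
```

Here `to_review` holds only the `/teams/` URL, which is the type that needs permissions granted explicitly.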

Refer to the official Microsoft documentation for an overview of how Teams and SharePoint integrate.

Configuring the SharePoint Online Connector

You must configure the SharePoint Online connector before connecting the SharePoint Online service to Workplace Search. To do this, you must register an OAuth app in Azure AD.

To get started, first log in to SharePoint Online and access your administrative dashboard.

Ensure you are logged in as the Azure Portal service account for this guide.


Step 1. Sign in to https://portal.azure.com/ and click on Azure Active Directory:

Figure 91. Connecting SharePoint

Step 2. Locate App Registrations:

Figure 92. Connecting SharePoint

Step 3. Click New Registration:

Figure 93. Connecting SharePoint

Step 4. Give your app a name - like "Workplace Search" - and make it multitenant.

Setting the app to single tenant will result in a degraded experience, and the connector will not sync content.

Leave the Redirect URIs blank for now. We’ll add one later in the process.


Step 5. Register the application:

Figure 94. Connecting SharePoint

Step 6. Retrieve the Client ID and keep it handy. We’ll need it within Workplace Search.


Step 7. Next, click the Add a Redirect URI link in the header.

Use the Workplace Search OAuth redirect URL for your deployment.

Figure 95. Connecting SharePoint

Then save the configuration.

Figure 96. Connecting SharePoint

Step 8. Locate the Client Secret by navigating to Certificates & Secrets:

Figure 97. Connecting SharePoint

Step 9. Pick a name for your client secret (for example, Workplace Search). Select 24 months as the expiration date:

Figure 98. Connecting SharePoint

Step 10. Save the Client Secret value before leaving this screen.

Figure 99. Connecting SharePoint

Step 11. We must now set up the permissions the Application will request from the Azure Portal service account. Navigate to API Permissions and click Add Permission. Add delegated permissions until the list resembles the following:

  • User.ReadBasic.All
  • Group.Read.All
  • Directory.AccessAsUser.All
  • Files.Read
  • Files.Read.All
  • Sites.Read.All
  • offline_access

Figure 100. Connecting SharePoint
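
To double-check the permission list before granting consent, you could compare it against the names above. This is a minimal sketch under stated assumptions: `missing_scopes` is a hypothetical helper, and the `granted` list is something you copy by hand from the API Permissions screen; the code does not query Azure.

```python
# Delegated permissions listed in Step 11.
REQUIRED_SCOPES = frozenset({
    "User.ReadBasic.All",
    "Group.Read.All",
    "Directory.AccessAsUser.All",
    "Files.Read",
    "Files.Read.All",
    "Sites.Read.All",
    "offline_access",
})


def missing_scopes(granted):
    """Return the required permissions absent from `granted`, sorted by name.

    Hypothetical helper: `granted` is any iterable of permission names.
    Surrounding whitespace is ignored; names are compared exactly.
    """
    granted_set = {name.strip() for name in granted if name.strip()}
    return sorted(REQUIRED_SCOPES - granted_set)


if __name__ == "__main__":
    # Example: only two permissions have been added so far.
    for name in missing_scopes(["Files.Read", "offline_access"]):
        print("still missing:", name)
```

An empty result means every permission from the list is present.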

Step 12. Finally, grant admin consent.

Use the Grant Admin Consent link from the permissions screen.


Step 13. From the Workplace Search administrative dashboard’s Sources area, locate SharePoint Online, click Configure and provide both the Client ID and Client Secret.

Voilà! The SharePoint Online connector is now configured and ready to synchronize content. To capture data, you must now connect a SharePoint Online instance with the appropriate authentication credentials.

Connecting SharePoint Online to Workplace Search

Once the SharePoint Online connector has been configured, you may connect a SharePoint Online instance to your organization.


Step 1. Head to your organization’s Workplace Search administrative dashboard, and locate the Sources tab.


Step 2. Click Add a new source.


Step 3. Select SharePoint Online in the Configured Sources list, and follow the SharePoint Online authentication flow as presented.

Ensure you are logged in as the Azure Portal service account for this step.

You might be following the OAuth flow in a browser where you’re already logged into the Azure Portal as a personal user. If in doubt, open an incognito window and log in as the Azure Portal service account.


Step 4. Upon successful authentication, you will be redirected to Workplace Search.

SharePoint Online content will now be captured and will be ready for search gradually as it is synced. Once successfully configured and connected, the SharePoint Online synchronization automatically occurs every 2 hours.

Document-level permissions

You can synchronize document access permissions from SharePoint Online to Workplace Search. This will ensure the right people see the right documents.

See Document-level permissions for Microsoft.

Limiting the content to be indexed

If you don’t need to index all the available content, you can specify indexing rules via the API. This helps shorten indexing times and limits the size of the index. See Customizing indexing. For SharePoint Online, the applicable rule types are path_template and file_extension.

When writing path_template rules, note that the path must be prefixed with the site name.

To include or exclude all contents of a site foo, use a pattern like this:

/sites/foo/**/*

To include or exclude all contents of folder bar in site foo, use a pattern like this:

/sites/foo/bar/**/*
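
If you generate these patterns for many sites, a small helper keeps the prefix consistent. The sketch below is illustrative only: `site_path_template` is a hypothetical name, and it builds just the pattern string shown above; the rule payload that carries the pattern is described in Customizing indexing and is not reproduced here.

```python
def site_path_template(site_name: str, folder: str = "") -> str:
    """Build a path_template pattern matching everything under a site or folder.

    Hypothetical helper: mirrors the two example patterns above and makes no
    claim about how the connector evaluates them.
    """
    parts = ["sites", site_name.strip("/")]
    folder = folder.strip("/")
    if folder:
        parts.append(folder)
    return "/" + "/".join(parts) + "/**/*"


if __name__ == "__main__":
    print(site_path_template("foo"))         # whole site
    print(site_path_template("foo", "bar"))  # one folder in the site
```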

You should be able to infer the site name from the site URL. For example, for the URL http://{domain}/sites/foobar the site name is foobar. For the root site, the site name is the domain name.
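
The inference rule above can be sketched as follows. This is an assumption-laden example rather than connector code: `infer_site_name` is a hypothetical helper, the example domain is a placeholder, and URL shapes other than the root site and `/sites/<name>` are rejected because the text above does not cover them.

```python
from urllib.parse import urlparse


def infer_site_name(site_url: str) -> str:
    """Infer the site name from a SharePoint site URL.

    /sites/<name>/... -> <name>; root site -> the domain name.
    Raises ValueError for URL shapes the rule above does not describe.
    """
    parsed = urlparse(site_url)
    if not parsed.hostname:
        raise ValueError(f"URL needs a scheme and domain: {site_url!r}")
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        return parsed.hostname  # root site: the site name is the domain name
    if len(segments) >= 2 and segments[0] == "sites":
        return segments[1]
    raise ValueError(f"Unrecognized site URL shape: {site_url!r}")


if __name__ == "__main__":
    # "example.sharepoint.com" is a placeholder domain.
    print(infer_site_name("http://example.sharepoint.com/sites/foobar"))
    print(infer_site_name("http://example.sharepoint.com/"))
```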

Synchronized fields

The following table lists the fields synchronized from the connected source to Workplace Search. The attributes in the table apply to the default search application, as follows:

  • Display name - The label used when displayed in the UI
  • Field name - The name of the underlying field attribute
  • Faceted filter - Whether the field is a faceted filter by default, or can be enabled (see also: Customizing filters)
  • Automatic query refinement preceding phrases - The default list of phrases that must precede a value of this field in a search query in order to automatically trigger query refinement. If "None," a value from this field may trigger refinement regardless of where it is found in the query string. If '', a value from this field must be the first token(s) in the query string. If N.A., automatic query refinement is not available for this field by default. All fields that have a faceted filter (default or configurable) can also be configured for automatic query refinement; see also Update a content source, Get a content source’s automatic query refinement details and Customizing filters.

| Display name | Field name | Faceted filter | Automatic query refinement preceding phrases |
|---|---|---|---|
| Id | id | No | N.A. |
| URL | url | No | N.A. |
| Title | title | No | N.A. |
| Type | type | Default | None |
| Path | path | No | N.A. |
| Created by | created_by | Configurable | [creator is, created by, edited by, modified by] |
| Last updated | last_updated | No | N.A. |
| Updated by | updated_by | Configurable | [edited by, updated by, modified by] |
| Drive owner | drive_owner | Default | N.A. |
| Media type | mime_type | Default | None |
| Extension | extension | Default | None |
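
To make the preceding-phrases rule concrete, here is a rough sketch of how the three cases could be evaluated. It is not the Workplace Search implementation: the function name, the whitespace tokenization, and the assumption that a phrase must sit directly before the value are all illustrative choices. Fields marked N.A. have no refinement and are not modeled.

```python
from typing import Optional, Sequence


def may_trigger_refinement(query: str, value: str,
                           preceding: Optional[Sequence[str]]) -> bool:
    """Sketch of the preceding-phrases rule (assumed semantics).

    preceding=None -> the value may appear anywhere in the query.
    preceding=[]   -> the value must be the first token(s) of the query.
    preceding=[..] -> one of the phrases must directly precede the value.
    """
    q = query.lower().split()
    v = value.lower().split()
    if not v:
        return False
    # Every index where the value's tokens occur contiguously in the query.
    starts = [i for i in range(len(q) - len(v) + 1) if q[i:i + len(v)] == v]
    if preceding is None:
        return bool(starts)
    if len(preceding) == 0:
        return 0 in starts
    for i in starts:
        for phrase in preceding:
            p = phrase.lower().split()
            if p and i >= len(p) and q[i - len(p):i] == p:
                return True
    return False


if __name__ == "__main__":
    phrases = ["edited by", "updated by", "modified by"]  # updated_by row
    print(may_trigger_refinement("budget updated by alice", "alice", phrases))
    print(may_trigger_refinement("alice budget", "alice", phrases))
```

Under these assumptions, the first query qualifies because "updated by" directly precedes the value, and the second does not.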