Configuration
Enterprise Search requires little configuration to get started. However, for flexibility, the solution provides many configurable settings.
This document explains how to modify Enterprise Search configuration settings. It also provides a reference for each configuration setting and the configuration settings format.
Configure Enterprise Search
Configure Enterprise Search by setting the values of various configuration settings. All deployments use the same configuration settings format, but access to the settings varies by deployment type.
Refer to the section for your deployment type:
Self-managed deployments can also set default values for some environment variables read by Enterprise Search.
Elastic Cloud
Configure Enterprise Search on Elastic Cloud using custom user settings.
See Add Enterprise Search user settings in the Elastic Cloud documentation.
Elastic Cloud Enterprise (ECE)
Configure Enterprise Search on Elastic Cloud Enterprise using custom user settings.
See Add Enterprise Search user settings in the Elastic Cloud Enterprise documentation.
Elastic Cloud on Kubernetes (ECK)
Configure Enterprise Search on Elastic Cloud on Kubernetes (ECK) by editing the YAML specification.
See Configuration in the Elastic Cloud on Kubernetes documentation.
Docker
When running with docker or docker-compose, configure Enterprise Search using environment variables.
Refer to the following examples:
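As a rough illustration, a docker-compose service can pass settings as environment variables named after the configuration settings themselves. This is a sketch rather than a complete deployment: the service name, the <version> placeholder, the es01 host name, and the ENT_SEARCH_ENCRYPTION_KEY and ELASTIC_PASSWORD variables are assumptions made for the example.

```yaml
# Hypothetical docker-compose.yml fragment.
services:
  enterprisesearch:
    # Replace <version> with the Enterprise Search version you run.
    image: docker.elastic.co/enterprise-search/enterprise-search:<version>
    ports:
      - "3002:3002"
    environment:
      # Each entry is a configuration setting name followed by its value.
      - secret_management.encryption_keys=[${ENT_SEARCH_ENCRYPTION_KEY}]
      - allow_es_settings_modification=true
      - elasticsearch.host=http://es01:9200
      - elasticsearch.username=elastic
      - elasticsearch.password=${ELASTIC_PASSWORD}
```

Keeping the password and encryption key in variables supplied at launch time avoids writing them into the compose file.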
Tar, deb, and rpm packages
When installed using a tar, deb, or rpm package, configure Enterprise Search using a configuration file.
The location of the configuration file varies by package type:
- .tar archives: config/enterprise_search.yml
- .deb and .rpm packages: /usr/share/enterprise_search/config/enterprise_search.yml
Or, set the location of the configuration file using the following environment variable: ENT_SEARCH_CONFIG_PATH.
Configuration settings format
The Enterprise Search configuration follows the YAML format.
You can nest multi-node configuration settings:
elasticsearch:
  host: http://127.0.0.1:9200
  username: elastic
  password: changeme
Or you can flatten configuration settings:
elasticsearch.host: http://127.0.0.1:9200
elasticsearch.username: elastic
elasticsearch.password: changeme
You can format non-scalar values as sequences:
secret_management.encryption_keys:
  - O9noPkMWqBTmae3hnvscNZnxXjDEl
  - 3D0LNI0iibBbjXhJGpx0lncGpwy0z
Or you can format non-scalar values as arrays:
secret_management.encryption_keys: ['O9noPkMWqBTmae3hnvscNZnxXjDEl', '3D0LNI0iibBbjXhJGpx0lncGpwy0z']
You can interpolate values from environment variables using ${}:
secret_management.encryption_keys: [${KEY_1}, ${KEY_2}]
Configuration settings reference
The following settings are available to configure Enterprise Search.
Elastic Enterprise Search comes with reasonable defaults. Before adjusting the configuration, make sure you understand what you are trying to accomplish and the consequences.
For passwords, the use of environment variables is encouraged to keep values from being written to disk. For example: elasticsearch.password: ${ELASTICSEARCH_PASSWORD:changeme}
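A minimal configuration file that follows this advice might look like the sketch below. It uses only settings described in this document. The ENT_SEARCH_ENCRYPTION_KEY variable name is hypothetical, and the text after the colon in ${ELASTICSEARCH_PASSWORD:changeme} is read here as a fallback value, which is an assumption based on the example above.

```yaml
# Sketch of a minimal config/enterprise_search.yml.
secret_management.encryption_keys:
  - ${ENT_SEARCH_ENCRYPTION_KEY}   # hypothetical variable name
allow_es_settings_modification: true
elasticsearch.host: http://127.0.0.1:9200
elasticsearch.username: elastic
# Password is read from the environment instead of being written to disk.
elasticsearch.password: ${ELASTICSEARCH_PASSWORD:changeme}
kibana.host: http://localhost:5601
ent_search.external_url: http://localhost:3002
```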
Secrets
-
secret_management.encryption_keys
-
Encryption keys to protect your application secrets. This field is required.
secret_management.encryption_keys: []
-
secret_management.enforce_valid_encryption_keys
-
Controls whether Enterprise Search starts when incorrect encryption keys are found on startup. Encryption keys are checked for validity when Enterprise Search starts, and a warning message is written to the logs if they are not correct. When true, Enterprise Search will not start if encryption keys are not correctly configured. Defaults to false.
secret_management.enforce_valid_encryption_keys: false
Elasticsearch
-
allow_es_settings_modification
-
Enterprise Search needs one-time permission to alter Elasticsearch settings. Ensure the Elasticsearch settings are correct, then set the following to true. Or, adjust Elasticsearch’s config/elasticsearch.yml instead.
allow_es_settings_modification: false
-
elasticsearch.username
-
The username that the Enterprise Search server should use to make changes within Elasticsearch. For example, the Enterprise Search server uses this username to authenticate to Elasticsearch to create indices when needed.
The user must have adequate permission within Elasticsearch.
Alternatively, use a token for the Enterprise Search service account, which can be configured as elasticsearch.service_account_token.
elasticsearch.username: elastic
-
elasticsearch.password
-
The password for the username provided in elasticsearch.username.
elasticsearch.password: changeme
-
elasticsearch.service_account_token
-
Token for the Enterprise Search service account.
elasticsearch.service_account_token: XXXXXXXXXX
This token is used by the Enterprise Search server to authenticate to Elasticsearch when managing internal Enterprise Search indices.
A guide on how to generate a service account token for Enterprise Search can be found in the Elasticsearch documentation for Service Accounts.
If both the elasticsearch.service_account_token and the Authorization header in elasticsearch.headers are present, then the elasticsearch.service_account_token will take precedence.
-
elasticsearch.headers
-
Elasticsearch custom HTTP headers to add to each request.
elasticsearch.headers: 'X-My-Header: Contents of the header'
-
elasticsearch.ssl.enabled
-
Whether SSL communication with Elasticsearch is enabled.
elasticsearch.ssl.enabled: false
-
elasticsearch.ssl.certificate
-
Path to client certificate file to use for client-side validation from Elasticsearch.
-
elasticsearch.ssl.certificate_authority
-
Absolute pathname to the keystore that contains Certificate Authorities for Elasticsearch SSL certificate.
elasticsearch.ssl.certificate_authority: /path/elasticsearch/config/certs/http_ca.crt
-
elasticsearch.ssl.verify
-
true to verify SSL certificate from Elasticsearch, false otherwise.
elasticsearch.ssl.verify: true
-
elasticsearch.startup_retry.enabled
-
Whether Enterprise Search retries connecting to Elasticsearch at startup.
elasticsearch.startup_retry.enabled: true
-
elasticsearch.startup_retry.interval
-
Interval, in seconds, between Elasticsearch startup retry attempts.
elasticsearch.startup_retry.interval: 5 # seconds
-
elasticsearch.startup_retry.fail_after
-
Time, in seconds, after which Elasticsearch startup retries stop and startup fails.
elasticsearch.startup_retry.fail_after: 600 # seconds
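The Elasticsearch settings above can be combined as in the following sketch, which authenticates with a service account token over SSL instead of a username and password. The ENT_SEARCH_SERVICE_TOKEN variable name is hypothetical, and the retry values are simply the ones listed above.

```yaml
elasticsearch.host: https://127.0.0.1:9200
# Hypothetical variable holding the service account token.
elasticsearch.service_account_token: ${ENT_SEARCH_SERVICE_TOKEN}
elasticsearch.ssl.enabled: true
elasticsearch.ssl.certificate_authority: /path/elasticsearch/config/certs/http_ca.crt
elasticsearch.ssl.verify: true
elasticsearch.startup_retry.enabled: true
elasticsearch.startup_retry.interval: 5      # seconds
elasticsearch.startup_retry.fail_after: 600  # seconds
```

Because the token takes precedence over an Authorization header in elasticsearch.headers, there is no need to set both.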
Kibana
-
kibana.host
-
Define the URL at which Enterprise Search can reach Kibana. Defaults to http://localhost:5601 for testing purposes.
kibana.host: http://localhost:5601
-
kibana.external_url
-
Define the exposed URL at which users can reach Kibana. Defaults to the value of kibana.host.
-
kibana.headers
-
Custom HTTP headers to add to requests made to Kibana from Enterprise Search.
kibana.headers: 'X-My-Header: Contents of the header'
-
kibana.startup_retry.fail_after
-
Time, in seconds, after which Kibana startup retries stop.
kibana.startup_retry.fail_after: 600 # seconds
Hosting and network
-
ent_search.external_url
-
Define the exposed URL at which users will reach Enterprise Search. Defaults to localhost:3002 for testing purposes. Most cases will use one of:
- An IP: http://255.255.255.255
- A FQDN: http://example.com
- Shortname defined via /etc/hosts: http://ent-search.search
ent_search.external_url: http://localhost:3002
-
ent_search.listen_host
-
Web application listen_host. Your application will run on this host. Must be a valid IPv4 or IPv6 address.
ent_search.listen_host: 127.0.0.1
-
ent_search.listen_port
-
Web application listen_port. Your application will run on this host and port. Must be a valid port number (1-65535).
ent_search.listen_port: 3002
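As an illustration of how these three settings relate, the sketch below assumes a host that sits behind a reverse proxy serving http://example.com. The proxy and the choice of 0.0.0.0 as the listen address are assumptions made for the example.

```yaml
# Users reach Enterprise Search through this public URL (example FQDN).
ent_search.external_url: http://example.com
# Listen on all IPv4 interfaces so the (assumed) reverse proxy can connect.
ent_search.listen_host: 0.0.0.0
ent_search.listen_port: 3002
```

The external URL is what users see, while the listen host and port describe where the application itself accepts connections.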
Limits
Configurable limits for Enterprise Search.
Overriding the default limits can impact performance negatively. Also, changing a limit here does not actually guarantee that Enterprise Search will work as expected as related Elasticsearch limits can be exceeded.
Workplace Search
-
workplace_search.content_source.total_fields.limit
-
Configure how many fields a content source can have.
The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.
workplace_search.content_source.total_fields.limit: 64
-
workplace_search.content_source.sync.max_consecutive_errors
-
Configure how many errors in a row to tolerate in a sync job. If the job encounters more errors in a row than this value, the job will fail. This only applies to errors tied to individual documents.
workplace_search.content_source.sync.max_consecutive_errors: 10
-
workplace_search.content_source.sync.max_error_ratio
-
Configure the ratio of <errored documents> / <total documents> to tolerate in a sync job or in a rolling window (see workplace_search.content_source.sync.error_ratio_window_size). If the job encounters an error ratio greater than this value in a given window, or overall at the end of the job, the job will fail. This only applies to errors tied to individual documents.
workplace_search.content_source.sync.max_error_ratio: 0.15
-
workplace_search.content_source.sync.thumbnails.enabled
-
Configure whether or not a content source should generate thumbnails for the documents it syncs. Not all file types/sizes/content or Content Sources support thumbnail generation, even if this is enabled.
workplace_search.content_source.sync.thumbnails.enabled: true
-
workplace_search.content_source.localhost_base_urls.enabled
-
Configure whether to allow localhost URLs as base URLs in content sources (by default, they are not allowed).
workplace_search.content_source.localhost_base_urls.enabled: false
-
workplace_search.content_source.external.unsafe_backend_allowed
-
Configure whether to allow unsafe HTTP backends for connectors (typically for localhost development). Defaults to false (HTTPS is enforced).
workplace_search.content_source.external.unsafe_backend_allowed: true
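To make the two sync error thresholds concrete, the sketch below tightens the error ratio and loosens the consecutive error count. The values are illustrative only, not recommendations.

```yaml
# Fail the job after more than 20 document-level errors in a row.
workplace_search.content_source.sync.max_consecutive_errors: 20
# With a ratio of 0.05, a job that has synced 2,000 documents fails at the
# end if more than 100 of them errored (100 / 2000 = 0.05). The rolling
# window check can fail the job earlier.
workplace_search.content_source.sync.max_error_ratio: 0.05
```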
App Search
-
app_search.engine.total_fields.limit
-
Configure how many fields an engine can have. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.
app_search.engine.total_fields.limit: 64
-
app_search.engine.query.limit
-
Configure how big full-text queries are allowed to be. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.
app_search.engine.query.limit: 128
Workers
APIs
-
hide_version_info
-
Set to true to hide product version information from API responses.
hide_version_info: false
Mailer
-
email.account.enabled
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
email.account.enabled: false
-
email.account.smtp.auth
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
email.account.smtp.auth: plain
-
email.account.smtp.starttls.enable
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
email.account.smtp.starttls.enable: false
-
email.account.smtp.host
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
email.account.smtp.host: 127.0.0.1
-
email.account.smtp.port
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
email.account.smtp.port: 25
-
email.account.smtp.user
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
-
email.account.smtp.password
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
-
email.account.email_defaults.from
-
Connect Enterprise Search to a mailer. See Configuring a mail service.
Logging
In 7.x versions prior to 7.17.16 and 8.x versions prior to 8.11.2, the Documents API logs the raw content of indexed documents at the info log level. Starting in 7.17.16 for 7.x versions and 8.11.2 for 8.x versions, it only logs the raw content of indexed documents at the debug log level.
-
ilm.enabled
-
This setting is deprecated and ILM can no longer be disabled. The index lifecycle policies that Enterprise Search creates can be managed in Kibana. See the ILM documentation.
ilm.enabled: true
-
enable_stdout_app_logging
-
Enable logging app logs to stdout (enabled by default).
enable_stdout_app_logging: true
-
log_rotation.keep_files
-
The number of files to keep on disk when rotating logs. When set to 0, no rotation will take place.
log_rotation.keep_files: 7
-
log_rotation.rotate_every_bytes
-
The maximum file size in bytes before rotating the log file. If log_rotation.keep_files is set to 0, no rotation will take place and there will be no size limit for the singular log file.
log_rotation.rotate_every_bytes: 1048576 # 1 MiB
-
connector.crawler.logging.events.enabled
-
Enable or disable indexing of Elasticsearch Crawler Event logs. These are enabled by default. Disabling these will impact dashboards and analytics.
connector.crawler.logging.events.enabled: true
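As an example of how the two log rotation settings interact, the sketch below keeps more files and rotates at a larger size than the values shown above. The numbers are illustrative. Under the assumption that each retained file grows to the rotation size, the logs would occupy on the order of 14 x 10 MiB = 140 MiB on disk.

```yaml
enable_stdout_app_logging: true
log_rotation.keep_files: 14
log_rotation.rotate_every_bytes: 10485760   # 10 MiB = 10 * 1024 * 1024 bytes
```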
TLS/SSL
Session
APM Instrumentation
-
apm.service_name
-
Override the APM service name. Allowed characters: a-z, A-Z, 0-9, -, _ and space.
apm.service_name: 'Enterprise Search'
Monitoring
Diagnostics report
-
diagnostic_report_directory
-
Path where diagnostic reports will be generated.
diagnostic_report_directory: diagnostics
Elastic web crawler
If you are looking for the App Search web crawler configuration documentation, see the App Search web crawler configuration docs. To compare features with the App Search web crawler, see Web crawler.
-
connector.crawler.http.user_agent
-
The User-Agent HTTP Header used for the Elastic web crawler.
connector.crawler.http.user_agent: Elastic-Crawler (<crawler_version_number>)
When running Elastic Web Crawler on Elastic Cloud, the default user agent value is Elastic-Crawler Elastic Cloud (https://www.elastic.co/guide/en/cloud/current/ec-get-help.html; <unique identifier>).
-
connector.crawler.http.user_agent_platform
-
The user agent platform used for the Elastic web crawler with identifying information. See User-Agent - Syntax in the MDN web docs.
This value will be added as a suffix to connector.crawler.http.user_agent and used as the final User-Agent Header. This value is blank by default.
-
connector.crawler.workers.pool_size.limit
-
The number of parallel crawls allowed per instance of Enterprise Search. By default, it is set to 2x the number of available logical CPU cores. On Intel CPUs, the default value is 4x the number of physical CPU cores due to hyper-threading. See Hyper-threading on Wikipedia.
connector.crawler.workers.pool_size.limit: N
You cannot set connector.crawler.workers.pool_size.limit to more than 8x the number of physical CPU cores available for the Enterprise Search instance.
Keep in mind that despite the setting above, you can still only have one crawl request running per engine at a time.
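The default and maximum values described above can be worked through for a specific machine. The sketch below assumes a hypothetical host with 4 physical cores and hyper-threading, which gives 8 logical cores. The chosen value of 24 is illustrative.

```yaml
# Example host: 4 physical cores with hyper-threading (8 logical cores).
#   default limit = 2 x 8 logical cores  = 16
#   maximum limit = 8 x 4 physical cores = 32
connector.crawler.workers.pool_size.limit: 24
```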
Per-crawl Resource Limits
These limits guard against infinite loops and other traps common to production web crawlers. If your crawler is hitting these limits, try changing your crawl rules or the content you’re crawling. Adjust these limits as a last resort.
Advanced Per-crawl Limits
-
connector.crawler.crawl.threads.limit
-
The number of parallel threads to use for each crawl. The main effect from increasing this value will be an increased throughput of the Elastic web crawler at the expense of higher CPU load on Enterprise Search and Elasticsearch instances as well as higher load on the website being crawled.
connector.crawler.crawl.threads.limit: 10
-
connector.crawler.crawl.url_queue.url_count.limit
-
The maximum size of the crawl frontier - the list of URLs the Elastic web crawler needs to visit. The list is stored in Elasticsearch, so the limit could be increased as long as the Elasticsearch cluster has enough resources (disk space) to hold the queue index.
connector.crawler.crawl.url_queue.url_count.limit: 100000
Per-Request Timeout Limits
Per-Request Resource Limits
Content Extraction Resource Limits
Elastic web crawler HTTP Security Controls
-
connector.crawler.security.ssl.certificate_authorities
-
A list of custom SSL Certificate Authority certificates to be used for all connections made by the Elastic web crawler to your websites. These certificates are added to the standard list of CA certificates trusted by the JVM. Each item in this list could be a file name of a certificate in PEM format or a PEM-formatted certificate as a string.
connector.crawler.security.ssl.certificate_authorities: []
-
connector.crawler.security.ssl.verification_mode
-
Control SSL verification mode used by the Elastic web crawler:
- full: validate both the SSL certificate and the hostname presented by the server (this is the default and the recommended value)
- certificate: only validate the SSL certificate presented by the server
- none: disable SSL validation completely (this is very dangerous and should never be used in production deployments)
connector.crawler.security.ssl.verification_mode: full
Disabling SSL validation could expose your Authorization headers to a man-in-the-middle attack and should never be used in production deployments. See https://en.wikipedia.org/wiki/Man-in-the-middle_attack for more details.
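For a website signed by an internal Certificate Authority, the two settings above might be combined as in this sketch. The certificate path is hypothetical. Adding the CA keeps full verification in place instead of weakening the verification mode.

```yaml
connector.crawler.security.ssl.certificate_authorities:
  - /path/to/internal-ca.pem   # hypothetical path to a PEM-formatted certificate
connector.crawler.security.ssl.verification_mode: full
```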
Elastic web crawler DNS Security Controls
The settings in this section could make your deployment vulnerable to SSRF attacks (especially in cloud environments) from the owners of any domains you crawl. Do not enable any of the settings here unless you fully control DNS domains you access with the Elastic web crawler. See Server Side Request Forgery on OWASP for more details on the SSRF attack and the risks associated with it.
-
connector.crawler.security.dns.allow_loopback_access
-
Allow the Elastic web crawler to access the localhost (127.0.0.0/8 IP namespace).
connector.crawler.security.dns.allow_loopback_access: false
-
connector.crawler.security.dns.allow_private_networks_access
-
Allow the Elastic web crawler to access the private IP space: link-local, network-local addresses, etc. See Reserved IP addresses - IPv4 on Wikipedia for more details.
connector.crawler.security.dns.allow_private_networks_access: false
Elastic web crawler HTTP proxy settings
If you need the Elastic web crawler to send HTTP requests through an HTTP proxy, use the following settings to provide the proxy information to Enterprise Search.
Your proxy connections are subject to the DNS security controls described in Elastic web crawler DNS Security Controls. If your proxy server is running on a private address or a loopback address, you will need to explicitly allow the Elastic web crawler to connect to it.
-
connector.crawler.http.proxy.host
-
The host of the proxy.
connector.crawler.http.proxy.host: example.com
-
connector.crawler.http.proxy.port
-
The port of the proxy.
connector.crawler.http.proxy.port: 8080
-
connector.crawler.http.proxy.protocol
-
The protocol to be used when connecting to the proxy: http (default) or https.
connector.crawler.http.proxy.protocol: http
-
connector.crawler.http.proxy.username
-
The username portion of the Basic HTTP credentials to be used when connecting to the proxy.
connector.crawler.http.proxy.username: kimchy
-
connector.crawler.http.proxy.password
-
The password portion of the Basic HTTP credentials to be used when connecting to the proxy.
connector.crawler.http.proxy.password: A3renEWhGVxgYFIqfPAV73ncUtPN1b
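Put together, a proxy configuration might look like the sketch below. It assumes a proxy on a hypothetical private address (10.0.0.5), which is why the private network DNS control is also enabled, and it reads the password from a hypothetical CRAWLER_PROXY_PASSWORD environment variable rather than storing it in the file.

```yaml
connector.crawler.http.proxy.host: 10.0.0.5   # hypothetical private address
connector.crawler.http.proxy.port: 8080
connector.crawler.http.proxy.protocol: http
connector.crawler.http.proxy.username: kimchy
connector.crawler.http.proxy.password: ${CRAWLER_PROXY_PASSWORD}   # hypothetical variable
# Needed only because this proxy sits on a private address.
connector.crawler.security.dns.allow_private_networks_access: true
```

Enabling the DNS control carries the SSRF risk described in the DNS Security Controls section, so a proxy on a public address is the safer choice where that is possible.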
Advanced Elastic web crawler tuning
-
connector.crawler.http.compression.enabled
-
Enable/disable HTTP content (gzip/deflate) compression in Elastic web crawler requests.
connector.crawler.http.compression.enabled: true
-
connector.crawler.http.default_encoding
-
Default encoding used for responses that do not specify a charset.
connector.crawler.http.default_encoding: UTF-8
-
connector.crawler.http.head_requests.enabled
-
Enable/disable performing HEAD requests before GET requests when crawling websites. Enabling HEAD requests allows the crawler to decide whether or not to download a page based on its content-type header. This can speed up crawls for websites that contain many unindexable binary files. This setting is false by default.
connector.crawler.http.head_requests.enabled: true
Read-only mode
-
skip_read_only_check
-
If true, pending migrations can be executed without enabling read-only mode. Proceeding with migrations while indices are allowing writes can have unintended consequences. Use at your own risk. This setting should not be set to true when upgrading a production instance with ongoing traffic.
skip_read_only_check: false
Environment variables reference
Self-managed deployments can set default values for the following environment variables read by Enterprise Search.
Set these values within config/env.sh.
-
JAVA_OPTS
-
Java options for JVM tuning (used for app-server and CLI commands).
export JAVA_OPTS=${JAVA_OPTS:-"-Xms2g -Xmx2g"}
-
APP_SERVER_JAVA_OPTS
-
Additional Java options for the application server.
export APP_SERVER_JAVA_OPTS="${APP_SERVER_JAVA_OPTS:-}"
-
JAVA_GC_LOGGING
-
Enable Java GC logging (see below for the default configuration).
export JAVA_GC_LOGGING=true
-
JAVA_GC_LOG_DIR
-
Directory where the GC log files are written.
export JAVA_GC_LOG_DIR=log
-
JAVA_GC_LOG_KEEP_FILES
-
How many of the most recent files to keep.
export JAVA_GC_LOG_KEEP_FILES=10
-
JAVA_GC_LOG_MAX_FILE_SIZE
-
How big GC logs should grow before triggering log rotation.
export JAVA_GC_LOG_MAX_FILE_SIZE=10m
Additional configuration tasks
Refer to the following for further documentation on specific configuration tasks: