Configuration
Enterprise Search requires little configuration to get started. However, for flexibility, the solution provides many configurable settings.
This document explains how to modify Enterprise Search configuration settings. It also provides a reference for each configuration setting and the configuration settings format.
Configure Enterprise Search
Configure Enterprise Search by setting the values of various configuration settings. All deployments use the same configuration settings format, but access to the settings varies by deployment type.
Refer to the section for your deployment type:
Self-managed deployments can also set default values for some environment variables read by Enterprise Search.
Elastic Cloud
Configure Enterprise Search on Elastic Cloud using custom user settings.
See Add Enterprise Search user settings in the Elastic Cloud documentation.
Elastic Cloud Enterprise (ECE)
Configure Enterprise Search on Elastic Cloud Enterprise using custom user settings.
See Add Enterprise Search user settings in the Elastic Cloud Enterprise documentation.
Elastic Cloud on Kubernetes (ECK)
Configure Enterprise Search on Elastic Cloud on Kubernetes (ECK) by editing the YAML specification.
See Configuration in the Elastic Cloud on Kubernetes documentation.
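As a rough sketch, an ECK manifest might carry Enterprise Search settings under the `config` key of the resource specification. The API version, resource names, version number, and URL below are assumptions for illustration; check the ECK documentation for the exact schema supported by your operator version.

```yaml
# Sketch of an ECK manifest; names, version, and URL are placeholders.
apiVersion: enterprisesearch.k8s.elastic.co/v1   # assumed API version
kind: EnterpriseSearch
metadata:
  name: enterprise-search-sample      # hypothetical resource name
spec:
  version: 7.17.0                     # placeholder; use your stack version
  count: 1
  elasticsearchRef:
    name: elasticsearch-sample        # hypothetical Elasticsearch resource
  config:
    # Keys are the same settings described in the reference on this page.
    ent_search.external_url: https://search.example.com
```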
Tar, deb, and rpm packages
When installed using a tar, deb, or rpm package, configure Enterprise Search using a configuration file.
The location of the configuration file varies by package type:
- .tar archives: config/enterprise_search.yml
- .deb and .rpm packages: /usr/share/enterprise_search/config/enterprise_search.yml
Docker
When running with docker or docker-compose, configure Enterprise Search using environment variables.
Refer to the following examples:
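A minimal docker-compose sketch, assuming that configuration setting names can be passed directly as environment variable names. The service name, the `STACK_VERSION`, `ELASTIC_PASSWORD`, and `ENCRYPTION_KEY` variables, and the Elasticsearch address are placeholders, not values taken from this page.

```yaml
# docker-compose.yml fragment (sketch, not a complete stack definition).
services:
  enterprisesearch:                   # hypothetical service name
    image: docker.elastic.co/enterprise-search/enterprise-search:${STACK_VERSION}
    ports:
      - "3002:3002"                   # default ent_search.listen_port
    environment:
      # Assumption: each entry uses a setting name from the reference below.
      - "elasticsearch.host=http://elasticsearch:9200"
      - "elasticsearch.username=elastic"
      - "elasticsearch.password=${ELASTIC_PASSWORD}"
      - "secret_management.encryption_keys=[${ENCRYPTION_KEY}]"
      - "allow_es_settings_modification=true"
```

Docker Compose resolves the `${...}` placeholders from the shell environment or an `.env` file before starting the container.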
Configuration settings format
The Enterprise Search configuration follows the YAML format.
You can nest multi-level configuration settings:
elasticsearch:
  host: http://127.0.0.1:9200
  username: elastic
  password: changeme
Or you can flatten configuration settings:
elasticsearch.host: http://127.0.0.1:9200
elasticsearch.username: elastic
elasticsearch.password: changeme
You can format non-scalar values as sequences:
secret_management.encryption_keys:
  - O9noPkMWqBTmae3hnvscNZnxXjDEl
  - 3D0LNI0iibBbjXhJGpx0lncGpwy0z
Or you can format non-scalar values as arrays:
secret_management.encryption_keys: ['O9noPkMWqBTmae3hnvscNZnxXjDEl', '3D0LNI0iibBbjXhJGpx0lncGpwy0z']
You can interpolate values from environment variables using ${}:
secret_management.encryption_keys: [${KEY_1}, ${KEY_2}]
Configuration settings reference
The following settings are available to configure Enterprise Search.
Elastic Enterprise Search comes with reasonable defaults. Before adjusting the configuration, make sure you understand what you are trying to accomplish and the consequences.
For passwords, the use of environment variables is encouraged to keep values from being written to disk. For example: elasticsearch.password: ${ELASTICSEARCH_PASSWORD:changeme}
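The sketch below combines the flattened format with environment variable interpolation so that no secret is written to the configuration file. `KEY_1` and `KEY_2` are hypothetical variable names that would need to be exported before Enterprise Search starts; the `:changeme` suffix is assumed to act as a fallback when the variable is unset.

```yaml
# Secrets come from the environment rather than from this file.
elasticsearch.host: http://127.0.0.1:9200
elasticsearch.username: elastic
# Assumed behavior: "changeme" is used only if ELASTICSEARCH_PASSWORD is unset.
elasticsearch.password: ${ELASTICSEARCH_PASSWORD:changeme}
# KEY_1 and KEY_2 are hypothetical environment variables.
secret_management.encryption_keys: [${KEY_1}, ${KEY_2}]
```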
Secrets
-
secret_management.encryption_keys
-
Encryption keys to protect your application secrets. This field is required.
secret_management.encryption_keys: []
Elasticsearch
-
allow_es_settings_modification
-
Enterprise Search needs one-time permission to alter Elasticsearch settings. Ensure the Elasticsearch settings are correct, then set the following to true. Or, adjust Elasticsearch’s config/elasticsearch.yml instead.
allow_es_settings_modification: false
-
elasticsearch.headers
-
Elasticsearch custom HTTP headers to add to each request.
elasticsearch.headers: 'X-My-Header: Contents of the header'
-
elasticsearch.ssl.enabled
-
Whether to enable SSL communication with Elasticsearch.
elasticsearch.ssl.enabled: false
-
elasticsearch.ssl.certificate
-
Path to client certificate file to use for client-side validation from Elasticsearch.
-
elasticsearch.ssl.certificate_authority
-
Path to the keystore that contains Certificate Authorities for Elasticsearch SSL certificate.
-
elasticsearch.ssl.verify
-
true to verify SSL certificate from Elasticsearch, false otherwise.
elasticsearch.ssl.verify: true
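For illustration, a connection to an Elasticsearch cluster secured with a private Certificate Authority might be configured along these lines. The host name and file path are placeholders.

```yaml
elasticsearch.host: https://es.example.com:9200   # hypothetical host
elasticsearch.ssl.enabled: true
# Hypothetical path; point this at the file holding your CA certificates.
elasticsearch.ssl.certificate_authority: /path/to/certificate-authority-file
elasticsearch.ssl.verify: true
```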
-
elasticsearch.startup_retry.enabled
-
Elasticsearch startup retry.
elasticsearch.startup_retry.enabled: true
-
elasticsearch.startup_retry.interval
-
Elasticsearch startup retry.
elasticsearch.startup_retry.interval: 5 # seconds
-
elasticsearch.startup_retry.fail_after
-
Elasticsearch startup retry.
elasticsearch.startup_retry.fail_after: 600 # seconds
Kibana
-
kibana.host
-
The primary URL at which users interact with Kibana. This is used when Enterprise Search links users to Kibana.
-
kibana.external_url
-
Define the exposed URL at which users will reach Kibana. Defaults to kibana.host.
-
kibana.headers
-
Custom HTTP headers to add to requests made to Kibana from Enterprise Search.
kibana.headers: 'X-My-Header: Contents of the header'
-
kibana.startup_retry.fail_after
-
Kibana startup retry.
kibana.startup_retry.fail_after: 600 # seconds
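As a sketch, a deployment where Kibana is published under a different public address than the one Enterprise Search is configured with might set both URLs explicitly. Both addresses are hypothetical, and how the two settings divide responsibilities in your environment is an assumption to verify.

```yaml
kibana.host: http://kibana.internal:5601          # hypothetical address
kibana.external_url: https://kibana.example.com   # hypothetical public URL
kibana.startup_retry.fail_after: 600 # seconds
```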
Hosting and network
-
ent_search.external_url
-
Define the exposed URL at which users will reach Enterprise Search. Defaults to localhost:3002 for testing purposes. Most cases will use one of:
- An IP: http://255.255.255.255
- A FQDN: http://example.com
- Shortname defined via /etc/hosts: http://ent-search.search
ent_search.external_url: http://localhost:3002
-
ent_search.listen_host
-
Web application listen_host. Your application will run on this host. Must be a valid IPv4 or IPv6 address.
ent_search.listen_host: 127.0.0.1
-
ent_search.listen_port
-
Web application listen_port. Your application will run on this host and port. Must be a valid port number (1-65535).
ent_search.listen_port: 3002
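A possible combination of these settings for a server reached through a public host name is sketched below. The FQDN is a placeholder, and listening on 0.0.0.0 (all IPv4 interfaces) is only one option.

```yaml
ent_search.external_url: https://search.example.com   # hypothetical FQDN
ent_search.listen_host: 0.0.0.0    # listen on all IPv4 interfaces
ent_search.listen_port: 3002
```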
Authentication
-
ent_search.auth.<auth_name>
-
Authentication settings are used for the standalone Enterprise Search interface. See User interfaces. Auth name associated with the options being set up. If realm chains are configured in elasticsearch.yml for the associated Elasticsearch instance, then the names of the realms should also be used here. Multiple auth providers may be configured. Each must have a unique name.
-
ent_search.auth.<auth_name>.source
-
The origin of authenticated Enterprise Search users. Options are elasticsearch-native and elasticsearch-saml. See Users and access.
- elasticsearch-native: Users are managed via the Elasticsearch native realm.
- elasticsearch-saml: Users are managed via the Elasticsearch SAML realm.
-
ent_search.auth.<auth_name>.order
-
Auth providers are consulted in ascending order (that is to say, the realm with the lowest order value is consulted first). You should make sure each configured realm has a distinct order setting.
-
ent_search.auth.<auth_name>.description
-
The name to be displayed on the login screen associated with this provider.
-
ent_search.auth.<auth_name>.icon
-
The URL to an icon to be displayed on the login screen associated with this provider.
-
ent_search.auth.<auth_name>.hidden
-
Boolean value to determine whether or not to display this login option on the login screen. It is common to hide an option if you would like to create role mappings before allowing the option to be used as a valid login mechanism.
ent_search.auth.<auth_name>.hidden: false
-
ent_search.login_assistance_message
-
Adds a message to the login screen. Useful for displaying information about maintenance windows, links to corporate sign up pages, etc. This field supports Markdown.
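The following sketch configures two providers with distinct order values. The names `native1` and `saml1` are hypothetical; if realm chains are configured in elasticsearch.yml, use the realm names from that file instead. The descriptions and message text are illustrative only.

```yaml
# Native realm provider, consulted first (lowest order value).
ent_search.auth.native1.source: elasticsearch-native
ent_search.auth.native1.order: 1
ent_search.auth.native1.description: "Log in with Elasticsearch"

# SAML provider, hidden until role mappings have been created.
ent_search.auth.saml1.source: elasticsearch-saml
ent_search.auth.saml1.order: 2
ent_search.auth.saml1.description: "Log in with single sign-on"
ent_search.auth.saml1.hidden: true

# Markdown is supported in the login message.
ent_search.login_assistance_message: "Need an account? Contact the **IT help desk**."
```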
Limits
Configurable limits for Enterprise Search.
Overriding the default limits can negatively impact performance. Also, changing a limit here does not guarantee that Enterprise Search will work as expected, because related Elasticsearch limits can still be exceeded.
Workplace Search
-
workplace_search.content_source.total_fields.limit
-
Configure how many fields a content source can have. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.
workplace_search.content_source.total_fields.limit: 64
-
workplace_search.content_source.sync.max_consecutive_errors
-
Configure how many errors in a row to tolerate in a sync job. If the job encounters more errors in a row than this value, the job will fail. This only applies to errors tied to individual documents.
workplace_search.content_source.sync.max_consecutive_errors: 10
-
workplace_search.content_source.sync.max_error_ratio
-
Configure the ratio of <errored documents> / <total documents> to tolerate in a sync job or in a rolling window (see workplace_search.content_source.sync.error_ratio_window_size). If the job encounters an error ratio greater than this value in a given window, or overall at the end of the job, the job will fail. This only applies to errors tied to individual documents.
workplace_search.content_source.sync.max_error_ratio: 0.15
-
workplace_search.content_source.sync.thumbnails.enabled
-
Configure whether or not a content source should generate thumbnails for the documents it syncs. Not all file types/sizes/content or Content Sources support thumbnail generation, even if this is enabled.
workplace_search.content_source.sync.thumbnails.enabled: true
-
workplace_search.content_source.localhost_base_urls.enabled
-
Configure whether to allow localhost URLs as base URLs in content sources (by default, they are not allowed).
workplace_search.content_source.localhost_base_urls.enabled: false
App Search
-
app_search.engine.total_fields.limit
-
Configure how many fields an engine can have. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.
app_search.engine.total_fields.limit: 64
-
app_search.engine.query.limit
-
Configure how big full-text queries are allowed to be. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.
app_search.engine.query.limit: 128
Workers
APIs
-
hide_version_info
-
Set to true to hide product version information from API responses.
hide_version_info: false
Mailer
-
email.account.enabled
-
Connect Enterprise Search to a mailer. See Configure a mail service.
email.account.enabled: false
-
email.account.smtp.auth
-
Connect Enterprise Search to a mailer. See Configure a mail service.
email.account.smtp.auth: plain
-
email.account.smtp.starttls.enable
-
Connect Enterprise Search to a mailer. See Configure a mail service.
email.account.smtp.starttls.enable: false
-
email.account.smtp.host
-
Connect Enterprise Search to a mailer. See Configure a mail service.
email.account.smtp.host: 127.0.0.1
-
email.account.smtp.port
-
Connect Enterprise Search to a mailer. See Configure a mail service.
email.account.smtp.port: 25
-
email.account.smtp.user
-
Connect Enterprise Search to a mailer. See Configure a mail service.
-
email.account.smtp.password
-
Connect Enterprise Search to a mailer. See Configure a mail service.
-
email.account.email_defaults.from
-
Connect Enterprise Search to a mailer. See Configure a mail service.
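Taken together, the mailer settings might look like the sketch below for an SMTP relay that uses STARTTLS on port 587. The host, user, sender address, and the `SMTP_PASSWORD` environment variable are all placeholders.

```yaml
email.account.enabled: true
email.account.smtp.auth: plain
email.account.smtp.starttls.enable: true
email.account.smtp.host: smtp.example.com         # hypothetical relay
email.account.smtp.port: 587
email.account.smtp.user: search-mailer            # hypothetical user
email.account.smtp.password: ${SMTP_PASSWORD}     # hypothetical variable
email.account.email_defaults.from: search@example.com
```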
Logging
-
ilm.enabled
-
Use Index Lifecycle Management (ILM) to manage analytics and API logs retention.
- auto: Use ILM when supported by the underlying Elasticsearch cluster
- true: Use ILM (requires ILM support in the underlying Elasticsearch cluster)
- false: Don’t use ILM (analytics and API logs will grow unconstrained)
See Log settings guide in the App Search documentation.
ilm.enabled: auto
-
enable_stdout_app_logging
-
Enable logging app logs to stdout (enabled by default).
enable_stdout_app_logging: true
-
log_rotation.keep_files
-
The number of files to keep on disk when rotating logs. When set to 0, no rotation will take place.
log_rotation.keep_files: 7
-
log_rotation.rotate_every_bytes
-
The maximum file size in bytes before rotating the log file. If log_rotation.keep_files is set to 0, no rotation will take place and there will be no size limit for the single log file.
log_rotation.rotate_every_bytes: 1048576 # 1 MiB
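As an example of adjusting rotation, the sketch below keeps 14 files of up to 10 MiB each (10 × 1024 × 1024 = 10485760 bytes). The specific numbers are illustrative, not recommendations.

```yaml
ilm.enabled: auto
enable_stdout_app_logging: true
log_rotation.keep_files: 14
log_rotation.rotate_every_bytes: 10485760 # 10 MiB
```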
TLS/SSL
Session
APM Instrumentation
-
apm.service_name
-
Override the APM service name. Allowed characters: a-z, A-Z, 0-9, -, _ and space.
apm.service_name: 'Enterprise Search'
Monitoring
Telemetry
-
telemetry.enabled
-
Reporting your basic feature usage statistics helps us improve your user experience. Your data is never shared with anyone. If kibana.external_url is set, the analogous Kibana Telemetry settings will take precedence. Set to false to disable telemetry capabilities entirely. You can alternatively opt out through the Settings page.
telemetry.enabled: true
-
telemetry.opt_in
-
If false, collection of telemetry data is disabled; however, it can be enabled via the Settings page if telemetry.allow_changing_opt_in_status is true.
telemetry.opt_in: true
-
telemetry.allow_changing_opt_in_status
-
If true, users are able to change the telemetry setting at a later time through the Settings page. If false, the value of telemetry.opt_in determines whether to send telemetry data or not.
telemetry.allow_changing_opt_in_status: true
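Based on the descriptions above, one way to turn off telemetry collection and prevent it from being re-enabled through the Settings page is sketched here. Setting telemetry.enabled to false instead disables the capability entirely.

```yaml
telemetry.enabled: true
# Collection is off...
telemetry.opt_in: false
# ...and the opt-in status cannot be changed from the Settings page.
telemetry.allow_changing_opt_in_status: false
```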
Diagnostics report
-
diagnostic_report_directory
-
Path where diagnostic reports will be generated.
diagnostic_report_directory: diagnostics
Elastic crawler
-
crawler.http.user_agent_platform
-
The user agent platform used for the Crawler with identifying information. See User-Agent - Syntax in the MDN web docs. This value will be added as a suffix to crawler.http.user_agent and used as the final User-Agent Header.
-
crawler.workers.pool_size.limit
-
The number of parallel crawls allowed per instance of Enterprise Search. By default, it is set to 2x the number of available logical CPU cores. On Intel CPUs, the default value is 4x the number of physical CPU cores due to hyper-threading. See Hyper-threading on Wikipedia.
crawler.workers.pool_size.limit: N
Per-crawl Resource Limits
These limits guard against infinite loops and other traps common to production web crawlers. If your crawler is hitting these limits, try changing your crawl rules or the content you’re crawling. Adjust these limits as a last resort.
Advanced Per-crawl Limits
-
crawler.crawl.threads.limit
-
The number of parallel threads to use for each crawl. The main effect from increasing this value will be an increased throughput of the crawler at the expense of higher CPU load on Enterprise Search and Elasticsearch instances as well as higher load on the website being crawled.
crawler.crawl.threads.limit: 10
-
crawler.crawl.url_queue.url_count.limit
-
The maximum size of the crawl frontier - the list of URLs the crawler needs to visit. The list is stored in Elasticsearch, so the limit could be increased as long as the Elasticsearch cluster has enough resources (disk space) to hold the queue index.
crawler.crawl.url_queue.url_count.limit: 100000
Per-Request Timeout Limits
Per-Request Resource Limits
Content Extraction Resource Limits
Crawler HTTP Security Controls
-
crawler.security.ssl.certificate_authorities
-
A list of custom SSL Certificate Authority certificates to be used for all connections made by the crawler to your websites. These certificates are added to the standard list of CA certificates trusted by the JVM. Each item in this list could be a file name of a certificate in PEM format or a PEM-formatted certificate as a string.
crawler.security.ssl.certificate_authorities: []
-
crawler.security.ssl.verification_mode
-
Control SSL verification mode used by the crawler:
- full: validate both the SSL certificate and the hostname presented by the server (this is the default and the recommended value)
- certificate: only validate the SSL certificate presented by the server
- none: disable SSL validation completely (this is very dangerous and should never be used in production deployments).
crawler.security.ssl.verification_mode: full
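To crawl sites that present certificates signed by a private Certificate Authority, a configuration along these lines could be used. The file path is hypothetical; a PEM-formatted certificate string is described above as an alternative to a file name.

```yaml
crawler.security.ssl.certificate_authorities:
  - /path/to/internal-ca.pem    # hypothetical PEM file
# Keep full verification; the custom CA is added to the trusted list.
crawler.security.ssl.verification_mode: full
```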
Crawler DNS Security Controls
The settings in this section could make your deployment vulnerable to SSRF attacks (especially in cloud environments) from the owners of any domains you crawl. Do not enable any of the settings here unless you fully control DNS domains you access with the crawler. See Server Side Request Forgery on OWASP for more details on the SSRF attack and the risks associated with it.
-
crawler.security.dns.allow_loopback_access
-
Allow crawler to access the localhost (127.0.0.0/8 IP namespace).
crawler.security.dns.allow_loopback_access: false
-
crawler.security.dns.allow_private_networks_access
-
Allow crawler to access the private IP space: link-local, network-local addresses, etc. See Reserved IP addresses - IPv4 on Wikipedia for more details.
crawler.security.dns.allow_private_networks_access: false
Crawler HTTP Proxy Settings
-
crawler.http.proxy.host
-
If you need the Crawler to send HTTP requests through a proxy, you can configure the proxy host with this setting. Please note:
- Only unauthenticated HTTP and HTTPS proxies are supported at the moment.
- Your proxy connections are subject to the DNS security controls described above (if your proxy server is running on a private or a loopback address, you will need to explicitly allow the crawler to connect to it).
-
crawler.http.proxy.port
-
If you need the Crawler to send HTTP requests through a proxy, you can configure the proxy port with this setting. Please note:
- Only unauthenticated HTTP and HTTPS proxies are supported at the moment.
- Your proxy connections are subject to the DNS security controls described above (if your proxy server is running on a private or a loopback address, you will need to explicitly allow the crawler to connect to it).
crawler.http.proxy.port: 8080
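A sketch of a proxy running on a private address, which per the notes above also requires relaxing the DNS security controls. The address is a placeholder, and enabling private network access carries the SSRF risks described in the DNS section, so treat this as an illustration rather than a recommended default.

```yaml
crawler.http.proxy.host: 10.0.0.5     # hypothetical private address
crawler.http.proxy.port: 8080
# Needed only because the proxy address above is in a private range.
crawler.security.dns.allow_private_networks_access: true
```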
Read-only mode
-
skip_read_only_check
-
If true, pending migrations can be executed without enabling read-only mode. Proceeding with migrations while indices are allowing writes can have unintended consequences. Use at your own risk. Do not set this to true when upgrading a production instance with ongoing traffic.
skip_read_only_check: false
Environment variables reference
Self-managed deployments can set default values for the following environment variables read by Enterprise Search.
Set these values within config/env.sh.
-
JAVA_OPTS
-
Java options for JVM tuning (used for app-server and CLI commands).
export JAVA_OPTS=${JAVA_OPTS:-"-Xms2g -Xmx2g"}
-
APP_SERVER_JAVA_OPTS
-
Additional Java options for the application server.
export APP_SERVER_JAVA_OPTS="${APP_SERVER_JAVA_OPTS:-}"
-
JAVA_GC_LOGGING
-
Enable Java GC logging (see below for the default configuration).
export JAVA_GC_LOGGING=true
-
JAVA_GC_LOG_DIR
-
The directory where GC log files are written.
export JAVA_GC_LOG_DIR=log
-
JAVA_GC_LOG_KEEP_FILES
-
How many of the most recent files to keep.
export JAVA_GC_LOG_KEEP_FILES=10
-
JAVA_GC_LOG_MAX_FILE_SIZE
-
How big GC logs should grow before triggering log rotation.
export JAVA_GC_LOG_MAX_FILE_SIZE=10m
Additional configuration tasks
Refer to the following for further documentation on specific configuration tasks: