Configuration

Enterprise Search requires little configuration to get started. However, for flexibility, the solution provides many configurable settings.

This document explains how to modify Enterprise Search configuration settings. It also provides a reference for each configuration setting and the configuration settings format.

Configure Enterprise Search


Configure Enterprise Search by setting the values of various configuration settings. All deployments use the same configuration settings format, but access to the settings varies by deployment type.

Refer to the section below for your deployment type.

Self-managed deployments can also set default values for some environment variables read by Enterprise Search.

Elastic Cloud


Configure Enterprise Search on Elastic Cloud using custom user settings.

See Add Enterprise Search user settings in the Elastic Cloud documentation.

Elastic Cloud Enterprise (ECE)


Configure Enterprise Search on Elastic Cloud Enterprise using custom user settings.

See Add Enterprise Search user settings in the Elastic Cloud Enterprise documentation.

Elastic Cloud on Kubernetes (ECK)


Configure Enterprise Search on Elastic Cloud on Kubernetes (ECK) by editing the YAML specification.

See Configuration in the Elastic Cloud on Kubernetes documentation.

Tar, deb, and rpm packages


When installed using a tar, deb, or rpm package, configure Enterprise Search using a configuration file.

The location of the configuration file varies by package type:

.tar archives
config/enterprise_search.yml
.deb and .rpm packages
/usr/share/enterprise_search/config/enterprise_search.yml

Docker


When running with Docker or Docker Compose, configure Enterprise Search using environment variables.

Refer to the Docker examples in the Enterprise Search documentation.
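As a minimal sketch, configuration settings can be passed verbatim as environment variables in a Compose file. The image tag, service names, and all values below are placeholders, not a verified deployment:

```yaml
# docker-compose.yml (sketch; adjust the image tag and values for your deployment)
services:
  enterprise-search:
    image: docker.elastic.co/enterprise-search/enterprise-search:8.6.0
    environment:
      - elasticsearch.host=http://elasticsearch:9200
      - elasticsearch.username=elastic
      - elasticsearch.password=changeme
      - ent_search.external_url=http://localhost:3002
      # placeholder key; use your own generated encryption keys
      - secret_management.encryption_keys=[supersecretkeyexample]
    ports:
      - 3002:3002
```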

Configuration settings format


The Enterprise Search configuration follows the YAML format.

You can nest multi-node configuration settings:

elasticsearch:
  host: http://127.0.0.1:9200
  username: elastic
  password: changeme

Or you can flatten configuration settings:

elasticsearch.host: http://127.0.0.1:9200
elasticsearch.username: elastic
elasticsearch.password: changeme

You can format non-scalar values as sequences:

secret_management.encryption_keys:
  - O9noPkMWqBTmae3hnvscNZnxXjDEl
  - 3D0LNI0iibBbjXhJGpx0lncGpwy0z

Or you can format non-scalar values as arrays:

secret_management.encryption_keys: ['O9noPkMWqBTmae3hnvscNZnxXjDEl', '3D0LNI0iibBbjXhJGpx0lncGpwy0z']

You can interpolate values from environment variables using ${}:

secret_management.encryption_keys: [${KEY_1}, ${KEY_2}]

Configuration settings reference


The following settings are available to configure Enterprise Search.

Elastic Enterprise Search comes with reasonable defaults. Before adjusting the configuration, make sure you understand what you are trying to accomplish and the consequences.

For passwords, the use of environment variables is encouraged to keep values from being written to disk. For example:

elasticsearch.password: ${ELASTICSEARCH_PASSWORD:changeme}

Secrets

secret_management.encryption_keys

Encryption keys to protect your application secrets. This field is required.

secret_management.encryption_keys: []

Elasticsearch

allow_es_settings_modification

Enterprise Search needs one-time permission to alter Elasticsearch settings. Ensure the Elasticsearch settings are correct, then set the following to true. Or, adjust Elasticsearch’s config/elasticsearch.yml instead.

allow_es_settings_modification: false

elasticsearch.host

Elasticsearch full cluster URL.

elasticsearch.host: http://127.0.0.1:9200

elasticsearch.username

Elasticsearch credentials.

elasticsearch.username: elastic

elasticsearch.password

Elasticsearch credentials.

elasticsearch.password: changeme

elasticsearch.headers

Elasticsearch custom HTTP headers to add to each request.

elasticsearch.headers: 'X-My-Header: Contents of the header'

elasticsearch.ssl.enabled

Whether SSL communication with Elasticsearch is enabled.

elasticsearch.ssl.enabled: false

elasticsearch.ssl.certificate

Path to the client certificate file to use for client-side validation with Elasticsearch.

elasticsearch.ssl.certificate_authority

Path to the keystore that contains the Certificate Authorities for the Elasticsearch SSL certificate.

elasticsearch.ssl.key

Path to the key file for the client certificate.

elasticsearch.ssl.key_passphrase

Passphrase for the above key file.

elasticsearch.ssl.verify

Set to true to verify the SSL certificate presented by Elasticsearch, false otherwise.

elasticsearch.ssl.verify: true
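Taken together, the SSL settings above might look like the following sketch. The file paths and the environment variable name are placeholders:

```yaml
elasticsearch.ssl.enabled: true
elasticsearch.ssl.certificate: /path/to/client.crt
elasticsearch.ssl.key: /path/to/client.key
elasticsearch.ssl.key_passphrase: ${ES_KEY_PASSPHRASE}
elasticsearch.ssl.certificate_authority: /path/to/ca.keystore
elasticsearch.ssl.verify: true
```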

elasticsearch.startup_retry.enabled

Whether to retry the connection to Elasticsearch at startup.

elasticsearch.startup_retry.enabled: true

elasticsearch.startup_retry.interval

How long to wait between Elasticsearch startup connection attempts, in seconds.

elasticsearch.startup_retry.interval: 5 # seconds

elasticsearch.startup_retry.fail_after

How long to keep retrying the Elasticsearch connection before failing startup, in seconds.

elasticsearch.startup_retry.fail_after: 600 # seconds

Kibana

kibana.host

The primary URL at which users interact with Kibana. This is used when Enterprise Search links users to Kibana.
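For example, assuming Kibana is running at its conventional local address (adjust for your deployment):

```yaml
kibana.host: http://localhost:5601
```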

kibana.external_url

Define the exposed URL at which users will reach Kibana. Defaults to kibana.host.

kibana.headers

Custom HTTP headers to add to requests made to Kibana from Enterprise Search.

kibana.headers: 'X-My-Header: Contents of the header'

kibana.startup_retry.enabled

Whether to retry the connection to Kibana at startup.

kibana.startup_retry.enabled: false

kibana.startup_retry.interval

How long to wait between Kibana startup connection attempts, in seconds.

kibana.startup_retry.interval: 5 # seconds

kibana.startup_retry.fail_after

How long to keep retrying the Kibana connection before failing startup, in seconds.

kibana.startup_retry.fail_after: 600 # seconds

Hosting and network

ent_search.external_url

Define the exposed URL at which users will reach Enterprise Search. Defaults to localhost:3002 for testing purposes. Most cases will use one of:

  • An IP: http://255.255.255.255
  • A FQDN: http://example.com
  • Shortname defined via /etc/hosts: http://ent-search.search

ent_search.external_url: http://localhost:3002

ent_search.listen_host

Web application listen_host. Your application will run on this host. Must be a valid IPv4 or IPv6 address.

ent_search.listen_host: 127.0.0.1

ent_search.listen_port

Web application listen_port. Your application will run on this host and port. Must be a valid port number (1-65535).

ent_search.listen_port: 3002

Authentication

ent_search.auth.<auth_name>

Authentication settings are used for the standalone Enterprise Search interface. See User interfaces. <auth_name> is the name associated with the options being set up. If realm chains are configured in elasticsearch.yml for the associated Elasticsearch instance, the names of those realms should also be used here. Multiple auth providers may be configured; each must have a unique name.

ent_search.auth.<auth_name>.source

The origin of authenticated Enterprise Search users. Options are elasticsearch-native and elasticsearch-saml. See Users and access.

  • elasticsearch-native: Users are managed via the Elasticsearch native realm.
  • elasticsearch-saml: Users are managed via the Elasticsearch SAML realm.
ent_search.auth.<auth_name>.order

Auth providers are consulted in ascending order (that is to say, the realm with the lowest order value is consulted first). You should make sure each configured realm has a distinct order setting.
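As a sketch, two providers could be configured and ordered like this. The provider names (native1, saml1) and the description are illustrative, not required values:

```yaml
ent_search.auth.native1.source: elasticsearch-native
ent_search.auth.native1.order: 1
ent_search.auth.saml1.source: elasticsearch-saml
ent_search.auth.saml1.order: 2
ent_search.auth.saml1.description: 'Corporate SSO'
```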

ent_search.auth.<auth_name>.description

The name to be displayed on the login screen associated with this provider.

ent_search.auth.<auth_name>.icon

The URL to an icon to be displayed on the login screen associated with this provider.

ent_search.auth.<auth_name>.hidden

Boolean value to determine whether or not to display this login option on the login screen. It is common to hide an option if you would like to create role mappings before allowing the option to be used as a valid login mechanism.

ent_search.auth.<auth_name>.hidden: false

ent_search.login_assistance_message

Adds a message to the login screen. Useful for displaying information about maintenance windows, links to corporate sign up pages, etc. This field supports Markdown.
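For example, with an illustrative Markdown message:

```yaml
ent_search.login_assistance_message: 'Scheduled maintenance **Saturday 02:00-04:00 UTC**. See the [status page](https://status.example.com).'
```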

Limits


Configurable limits for Enterprise Search.

Overriding the default limits can impact performance negatively. Also, raising a limit here does not guarantee that Enterprise Search will work as expected, because related Elasticsearch limits can still be exceeded.

Workplace Search

workplace_search.content_source.document_size.limit

Configure the maximum allowed document size for a content source.

workplace_search.content_source.document_size.limit: 100kb

workplace_search.content_source.total_fields.limit

Configure how many fields a content source can have.

The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.

workplace_search.content_source.total_fields.limit: 64

workplace_search.content_source.sync.enabled

Configure whether or not Workplace Search can run synchronization jobs. If this is set to false, no syncs will run. Default is true.

workplace_search.content_source.sync.enabled: true

workplace_search.content_source.sync.max_errors

Configure how many errors to tolerate in a sync job. If the job encounters more total errors than this value, the job will fail. This only applies to errors tied to individual documents.

workplace_search.content_source.sync.max_errors: 1000

workplace_search.content_source.sync.max_consecutive_errors

Configure how many errors in a row to tolerate in a sync job. If the job encounters more errors in a row than this value, the job will fail. This only applies to errors tied to individual documents.

workplace_search.content_source.sync.max_consecutive_errors: 10

workplace_search.content_source.sync.max_error_ratio

Configure the ratio of <errored documents> / <total documents> to tolerate in a sync job or in a rolling window (see workplace_search.content_source.sync.error_ratio_window_size). If the job encounters an error ratio greater than this value in a given window, or overall at the end of the job, the job will fail. This only applies to errors tied to individual documents.

workplace_search.content_source.sync.max_error_ratio: 0.15

workplace_search.content_source.sync.error_ratio_window_size

Configure how large of a window to consider when calculating an error ratio (see workplace_search.content_source.sync.max_error_ratio).

workplace_search.content_source.sync.error_ratio_window_size: 100

workplace_search.content_source.sync.thumbnails.enabled

Configure whether or not a content source should generate thumbnails for the documents it syncs. Not all file types, sizes, or content, and not all Content Sources, support thumbnail generation, even if this setting is enabled.

workplace_search.content_source.sync.thumbnails.enabled: true

workplace_search.content_source.indexing.rules.limit

Configure how many indexing rules a content source can have.

workplace_search.content_source.indexing.rules.limit: 100

workplace_search.content_source.sync.refresh_interval.full

Configure the refresh interval for full sync jobs (in ISO 8601 duration format).

workplace_search.content_source.sync.refresh_interval.full: P3D

workplace_search.content_source.sync.refresh_interval.incremental

Configure the refresh interval for incremental sync jobs (in ISO 8601 duration format).

workplace_search.content_source.sync.refresh_interval.incremental: PT2H

workplace_search.content_source.sync.refresh_interval.delete

Configure the refresh interval for delete sync jobs (in ISO 8601 duration format).

workplace_search.content_source.sync.refresh_interval.delete: PT6H

workplace_search.content_source.sync.refresh_interval.permissions

Configure the refresh interval for permissions sync jobs (in ISO 8601 duration format).

workplace_search.content_source.sync.refresh_interval.permissions: PT5M

workplace_search.content_source.salesforce.enable_cases

Configure whether or not Salesforce and Salesforce Sandbox connectors should sync Cases.

workplace_search.content_source.salesforce.enable_cases: true

workplace_search.synonyms.sets.limit

Configure total number of synonym sets a Workplace Search instance can have.

workplace_search.synonyms.sets.limit: 256

workplace_search.synonyms.terms_per_set.limit

Configure total number of terms an individual synonym set can have.

workplace_search.synonyms.terms_per_set.limit: 32

workplace_search.remote_sources.query_timeout

Configure the query timeout (in milliseconds) for remote sources via the Search API.

workplace_search.remote_sources.query_timeout: 10000

workplace_search.content_source.localhost_base_urls.enabled

Configure whether to allow localhost URLs as base URLs in content sources (by default, they are not allowed).

workplace_search.content_source.localhost_base_urls.enabled: false

App Search

app_search.engine.document_size.limit

Configure the maximum allowed document size.

app_search.engine.document_size.limit: 100kb

app_search.engine.total_fields.limit

Configure how many fields an engine can have. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.

app_search.engine.total_fields.limit: 64

app_search.engine.source_engines_per_meta_engine.limit

Configure how many source engines a meta engine can have.

app_search.engine.source_engines_per_meta_engine.limit: 15

app_search.engine.total_facet_values_returned.limit

Configure how many facet values can be returned by a search.

app_search.engine.total_facet_values_returned.limit: 250

app_search.engine.query.limit

Configure how big full-text queries are allowed. The Elasticsearch/Lucene setting indices.query.bool.max_clause_count might also need to be adjusted if "Max clause count exceeded" errors start occurring. See Search settings in the Elasticsearch documentation.

app_search.engine.query.limit: 128

app_search.engine.synonyms.sets.limit

Configure total number of synonym sets an engine can have.

app_search.engine.synonyms.sets.limit: 256

app_search.engine.synonyms.terms_per_set.limit

Configure total number of terms a synonym set can have.

app_search.engine.synonyms.terms_per_set.limit: 32

app_search.engine.analytics.total_tags.limit

Configure how many analytics tags can be associated with a single query or clickthrough.

app_search.engine.analytics.total_tags.limit: 16

Workers

worker.threads

Configure the number of worker threads.

worker.threads: 1

APIs

hide_version_info

Set to true to hide product version information from API responses.

hide_version_info: false

Mailer

email.account.enabled

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.enabled: false

email.account.smtp.auth

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.smtp.auth: plain

email.account.smtp.starttls.enable

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.smtp.starttls.enable: false

email.account.smtp.host

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.smtp.host: 127.0.0.1

email.account.smtp.port

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.smtp.port: 25

email.account.smtp.user

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.smtp.password

Connect Enterprise Search to a mailer. See Configure a mail service.

email.account.email_defaults.from

Connect Enterprise Search to a mailer. See Configure a mail service.
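Assembled together, a hypothetical SMTP configuration might look like the following. The host, port, user, addresses, and environment variable name are placeholders:

```yaml
email.account.enabled: true
email.account.smtp.auth: plain
email.account.smtp.starttls.enable: true
email.account.smtp.host: smtp.example.com
email.account.smtp.port: 587
email.account.smtp.user: notifications@example.com
email.account.smtp.password: ${SMTP_PASSWORD:changeme}
email.account.email_defaults.from: enterprise-search@example.com
```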

Logging

log_directory

Choose your log export path.

log_directory: log

log_level

Log level can be: debug, info, warn, error, fatal, or unknown.

log_level: info

log_format

Log format can be: default or json.

log_format: default

filebeat_log_directory

Choose your Filebeat logs export path.

filebeat_log_directory: log

ilm.enabled

Use Index Lifecycle Management (ILM) to manage analytics and API logs retention.

  • auto: Use ILM when supported by the underlying Elasticsearch cluster
  • true: Use ILM (requires ILM support in the underlying Elasticsearch cluster)
  • false: Don’t use ILM (analytics and API logs will grow unconstrained)

See Log settings guide in the App Search documentation.

ilm.enabled: auto

enable_stdout_app_logging

Enable logging app logs to stdout (enabled by default).

enable_stdout_app_logging: true

log_rotation.keep_files

The number of files to keep on disk when rotating logs. When set to 0, no rotation will take place.

log_rotation.keep_files: 7

log_rotation.rotate_every_bytes

The maximum file size in bytes before rotating the log file. If log_rotation.keep_files is set to 0, no rotation will take place and there will be no size limit for the singular log file.

log_rotation.rotate_every_bytes: 1048576 # 1 MiB

TLS/SSL

ent_search.ssl.enabled

Configure TLS/SSL encryption.

ent_search.ssl.enabled: false

ent_search.ssl.keystore.path

Configure TLS/SSL encryption.

ent_search.ssl.keystore.password

Configure TLS/SSL encryption.

ent_search.ssl.keystore.key_password

Configure TLS/SSL encryption.

ent_search.ssl.redirect_http_from_port

Configure TLS/SSL encryption.
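A combined sketch of the TLS/SSL settings above. The keystore path, port, and environment variable names are placeholders:

```yaml
ent_search.ssl.enabled: true
ent_search.ssl.keystore.path: /path/to/keystore.p12
ent_search.ssl.keystore.password: ${KEYSTORE_PASSWORD}
ent_search.ssl.keystore.key_password: ${KEYSTORE_KEY_PASSWORD}
# optional: redirect plain-HTTP traffic arriving on this port (placeholder value)
ent_search.ssl.redirect_http_from_port: 8080
```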

Session

secret_session_key

Set a session key to persist user sessions through process restarts.
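For example, with the value supplied via an environment variable to keep it off disk (the variable name is illustrative):

```yaml
secret_session_key: ${SECRET_SESSION_KEY}
```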

APM Instrumentation

apm.enabled

Enable Elastic APM agent within Enterprise Search.

apm.enabled: true

apm.server_url

Set the custom APM Server URL.

apm.server_url: 'http://localhost:8200'

apm.secret_token

Set the APM authentication token (use if APM Server requires a secret token).

apm.secret_token: 'your-token-here'

apm.service_name

Override the APM service name. Allowed characters: a-z, A-Z, 0-9, -, _ and space.

apm.service_name: 'Enterprise Search'

apm.environment

Override the APM service environment.

apm.environment: 'production'

Monitoring

monitoring.reporting_enabled

Enable automatic monitoring metrics reporting to Elasticsearch via metricbeat.

monitoring.reporting_enabled: false

monitoring.reporting_period

Configure metrics reporting frequency. This setting should be aligned with monitoring.ui.min_interval_seconds setting in Kibana, or Stack Monitoring dashboards for Enterprise Search may have gaps in graphs on high metric resolutions.

monitoring.reporting_period: 10s

monitoring.metricsets

Configure which metricsets are reported to Elasticsearch.

monitoring.metricsets: ['health', 'stats']

monitoring.index_prefix

Override the index name prefix used to index Enterprise Search metrics. The index will have ILM enabled and will be managed by Enterprise Search.

monitoring.index_prefix: metricbeat-ent-search

Telemetry

telemetry.enabled

Reporting your basic feature usage statistics helps us improve your user experience. Your data is never shared with anyone. If kibana.external_url is set, the analogous Kibana Telemetry settings will take precedence. Set to false to disable telemetry capabilities entirely. You can alternatively opt out through the Settings page.

telemetry.enabled: true

telemetry.opt_in

If false, collection of telemetry data is disabled; however, it can be enabled via the Settings page if telemetry.allow_changing_opt_in_status is true.

telemetry.opt_in: true

telemetry.allow_changing_opt_in_status

If true, users are able to change the telemetry setting at a later time through the Settings page. If false, the value of telemetry.opt_in determines whether to send telemetry data or not.

telemetry.allow_changing_opt_in_status: true

Diagnostics report

diagnostic_report_directory

Path where diagnostic reports will be generated.

diagnostic_report_directory: diagnostics

Elastic crawler

crawler.http.user_agent

The User-Agent HTTP Header used for the Crawler.

crawler.http.user_agent: Elastic-Crawler (<crawler_version_number>)

crawler.http.user_agent_platform

The user agent platform used for the Crawler with identifying information. See User-Agent - Syntax in the MDN web docs.

This value will be added as a suffix to crawler.http.user_agent and used as the final User-Agent Header.

crawler.workers.pool_size.limit

The number of parallel crawls allowed per instance of Enterprise Search. By default, it is set to 2x the number of available logical CPU cores. On Intel CPUs, the default value is 4x the number of physical CPU cores due to hyper-threading. See Hyper-threading on Wikipedia.

crawler.workers.pool_size.limit: N

Per-crawl Resource Limits

These limits guard against infinite loops and other traps common to production web crawlers. If your crawler is hitting these limits, try changing your crawl rules or the content you’re crawling. Adjust these limits as a last resort.

crawler.crawl.max_duration.limit

The maximum duration of a crawl, in seconds. Beyond this limit, the crawler will stop, abandoning all remaining URLs in the crawl queue.

crawler.crawl.max_duration.limit: 86400 # seconds

crawler.crawl.max_crawl_depth.limit

The maximum number of sequential pages the crawler will traverse starting from the given set of entry points. Beyond this limit, the crawler will stop discovering new links.

crawler.crawl.max_crawl_depth.limit: 10

crawler.crawl.max_url_length.limit

The maximum number of characters within each URL to crawl. The crawler will skip URLs that exceed this length.

crawler.crawl.max_url_length.limit: 2048

crawler.crawl.max_url_segments.limit

The maximum number of segments within the path of each URL to crawl. The crawler will skip URLs whose paths exceed this length. Example: The path /a/b/c/d has 4 segments.

crawler.crawl.max_url_segments.limit: 16

crawler.crawl.max_url_params.limit

The maximum number of query parameters within each URL to crawl. The crawler will skip URLs that exceed this length. Example: The query string in /a?b=c&d=e has 2 query parameters.

crawler.crawl.max_url_params.limit: 32

crawler.crawl.max_unique_url_count.limit

The maximum number of unique URLs the crawler will index during a single crawl. Beyond this limit, the crawler will stop.

crawler.crawl.max_unique_url_count.limit: 100000

Advanced Per-crawl Limits

crawler.crawl.threads.limit

The number of parallel threads to use for each crawl. The main effect from increasing this value will be an increased throughput of the crawler at the expense of higher CPU load on Enterprise Search and Elasticsearch instances as well as higher load on the website being crawled.

crawler.crawl.threads.limit: 10

crawler.crawl.url_queue.url_count.limit

The maximum size of the crawl frontier: the list of URLs the crawler still needs to visit. The list is stored in Elasticsearch, so the limit can be increased as long as the Elasticsearch cluster has enough resources (disk space) to hold the queue index.

crawler.crawl.url_queue.url_count.limit: 100000

Per-Request Timeout Limits

crawler.http.connection_timeout

The maximum time to wait for a connection to be established before the request is aborted.

crawler.http.connection_timeout: 10 # seconds

crawler.http.read_timeout

The maximum period of inactivity between two data packets, before the request is aborted.

crawler.http.read_timeout: 10 # seconds

crawler.http.request_timeout

The maximum total duration of a request before it is aborted.

crawler.http.request_timeout: 60 # seconds

Per-Request Resource Limits

crawler.http.response_size.limit

The maximum size of an HTTP response (in bytes) supported by the crawler.

crawler.http.response_size.limit: 10485760

crawler.http.redirects.limit

The maximum number of HTTP redirects to follow before the request fails.

crawler.http.redirects.limit: 10

Content Extraction Resource Limits

crawler.extraction.title_size.limit

The maximum size (in bytes) of the title field extracted from each crawled page.

crawler.extraction.title_size.limit: 1024

crawler.extraction.body_size.limit

The maximum size (in bytes) of the body content field extracted from each crawled page.

crawler.extraction.body_size.limit: 5242880

crawler.extraction.keywords_size.limit

The maximum size (in bytes) of the meta keywords field extracted from each crawled page.

crawler.extraction.keywords_size.limit: 512

crawler.extraction.description_size.limit

The maximum size (in bytes) of the meta description field extracted from each crawled page.

crawler.extraction.description_size.limit: 1024

crawler.extraction.extracted_links_count.limit

The maximum number of links extracted from each page for further crawling.

crawler.extraction.extracted_links_count.limit: 1000

crawler.extraction.indexed_links_count.limit

The maximum number of links extracted from each page and indexed in a document.

crawler.extraction.indexed_links_count.limit: 25

crawler.extraction.headings_count.limit

The maximum number of HTML headings to be extracted from each page.

crawler.extraction.headings_count.limit: 25

crawler.extraction.default_deduplication_fields

Default document fields used to compare documents during de-duplication.

crawler.extraction.default_deduplication_fields: ['title', 'body_content', 'meta_keywords', 'meta_description', 'links', 'headings']

Crawler HTTP Security Controls

crawler.security.ssl.certificate_authorities

A list of custom SSL Certificate Authority certificates to be used for all connections made by the crawler to your websites. These certificates are added to the standard list of CA certificates trusted by the JVM. Each item in this list can be either the path to a certificate file in PEM format or a PEM-formatted certificate as a string.

crawler.security.ssl.certificate_authorities: []
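For instance, the list can mix file paths and inline PEM strings. The path below is a placeholder, and the certificate body is elided:

```yaml
crawler.security.ssl.certificate_authorities:
  - /path/to/ca.pem
  - |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
```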

crawler.security.ssl.verification_mode

Control SSL verification mode used by the crawler:

  • full - validate both the SSL certificate and the hostname presented by the server (this is the default and the recommended value)
  • certificate - only validate the SSL certificate presented by the server
  • none - disable SSL validation completely (this is very dangerous and should never be used in production deployments).

crawler.security.ssl.verification_mode: full

Crawler DNS Security Controls

The settings in this section could make your deployment vulnerable to SSRF attacks (especially in cloud environments) from the owners of any domains you crawl. Do not enable any of the settings here unless you fully control DNS domains you access with the crawler. See Server Side Request Forgery on OWASP for more details on the SSRF attack and the risks associated with it.

crawler.security.dns.allow_loopback_access

Allow the crawler to access localhost (the 127.0.0.0/8 IP namespace).

crawler.security.dns.allow_loopback_access: false

crawler.security.dns.allow_private_networks_access

Allow the crawler to access the private IP space: link-local and network-local addresses, etc. See Reserved IP addresses - IPv4 on Wikipedia for more details.

crawler.security.dns.allow_private_networks_access: false

Crawler HTTP Proxy Settings
crawler.http.proxy.host

If you need the Crawler to send its HTTP requests through a proxy, configure the proxy host with this setting. Please note:

  • Only unauthenticated HTTP and HTTPS proxies are supported at the moment.
  • Your proxy connections are subject to the DNS security controls described above (if your proxy server is running on a private or a loopback address, you will need to explicitly allow the crawler to connect to it).
crawler.http.proxy.port

The port of the proxy server. The notes for crawler.http.proxy.host above apply to this setting as well.

crawler.http.proxy.port: 8080

crawler.http.proxy.protocol

Protocol to be used for connecting to the proxy: http (default) or https.

crawler.http.proxy.protocol: http

crawler.http.proxy.username

Basic HTTP credentials to be used for connecting to the proxy.

crawler.http.proxy.password

Basic HTTP credentials to be used for connecting to the proxy.

Read-only mode

skip_read_only_check

If true, pending migrations can be executed without enabling read-only mode. Proceeding with migrations while indices are allowing writes can have unintended consequences. Use at your own risk; this should not be set to true when upgrading a production instance with ongoing traffic.

skip_read_only_check: false

Environment variables reference


Self-managed deployments can set default values for the following environment variables read by Enterprise Search.

Set these values within config/env.sh.

JAVA_OPTS

Java options for JVM tuning (used for app-server and CLI commands).

export JAVA_OPTS=${JAVA_OPTS:-"-Xms2g -Xmx2g"}

APP_SERVER_JAVA_OPTS

Additional Java options for the application server.

export APP_SERVER_JAVA_OPTS="${APP_SERVER_JAVA_OPTS:-}"

JAVA_GC_LOGGING

Enable Java GC logging (see below for the default configuration).

export JAVA_GC_LOGGING=true

JAVA_GC_LOG_DIR

Where to write the GC log files.

export JAVA_GC_LOG_DIR=log

JAVA_GC_LOG_KEEP_FILES

How many of the most recent files to keep.

export JAVA_GC_LOG_KEEP_FILES=10

JAVA_GC_LOG_MAX_FILE_SIZE

How big GC logs should grow before triggering log rotation.

export JAVA_GC_LOG_MAX_FILE_SIZE=10m

Additional configuration tasks


Refer to the related documentation for specific configuration tasks.