IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Networking

Each Elasticsearch node has two different network interfaces. Clients send requests to Elasticsearch’s REST APIs using its HTTP interface, but nodes communicate with other nodes using the transport interface. The transport interface is also used for communication with remote clusters.

You can configure both of these interfaces at the same time using the network.* settings. If you have a more complicated network, you might need to configure the interfaces independently using the http.* and transport.* settings. Where possible, use the network.* settings that apply to both interfaces to simplify your configuration and reduce duplication.

By default, Elasticsearch binds only to localhost, which means it cannot be accessed remotely. This configuration is sufficient for a local development cluster made of one or more nodes all running on the same host. To form a cluster across multiple hosts, or one that is accessible to remote clients, you must adjust some network settings such as network.host.

Be careful with the network configuration!

Never expose an unprotected node to the public internet. If you do, you are permitting anyone in the world to download, modify, or delete any of the data in your cluster.

Configuring Elasticsearch to bind to a non-local address will convert some warnings into fatal exceptions. If a node refuses to start after you configure its network settings, you must address the logged exceptions before proceeding.

Commonly used network settings

Most users will need to configure only the following network settings.

network.host

(Static, string) Sets the address of this node for both HTTP and transport traffic. The node will bind to this address and will also use it as its publish address. Accepts an IP address, a hostname, or a special value.

Defaults to _local_.

http.port

(Static, integer) The port to bind for HTTP client communication. Accepts a single value or a range. If a range is specified, the node will bind to the first available port in the range.

Defaults to 9200-9300.

transport.port

(Static, integer) The port to bind for communication between nodes. Accepts a single value or a range. If a range is specified, the node will bind to the first available port in the range. Set this setting to a single port, not a range, on every master-eligible node.

Defaults to 9300-9400.
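
As a minimal sketch of how these three settings might appear together in elasticsearch.yml, assuming a node whose address is 192.168.1.10 (a hypothetical value chosen only for illustration):

```yaml
# elasticsearch.yml (illustrative sketch; 192.168.1.10 is a hypothetical address)
network.host: 192.168.1.10   # bind and publish address for both HTTP and transport
http.port: 9200              # a single port instead of the default 9200-9300 range
transport.port: 9300         # a single port, as advised for master-eligible nodes
```

With single ports, the node does not fall back to another port in the range when the first one is already in use.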

Special values for network addresses

You can configure Elasticsearch to automatically determine its addresses by using the following special values. Use these values when configuring network.host, network.bind_host, network.publish_host, and the corresponding settings for the HTTP and transport interfaces.

_local_: Any loopback addresses on the system, for example 127.0.0.1.
_site_: Any site-local addresses on the system, for example 192.168.0.1.
_global_: Any globally-scoped addresses on the system, for example 8.8.8.8.
_[networkInterface]_: Use the addresses of the network interface called [networkInterface]. For example, if you wish to use the addresses of an interface called en0, set network.host: _en0_.
0.0.0.0: The addresses of all available network interfaces.

In some systems these special values resolve to multiple addresses. If so, Elasticsearch will select one of them as its publish address and may change its selection on each node restart. Ensure your node is accessible at every possible address.

Any values containing a : (e.g. an IPv6 address or some of the special values) must be quoted because : is a special character in YAML.

IPv4 vs IPv6

These special values yield both IPv4 and IPv6 addresses by default, but you can also add an :ipv4 or :ipv6 suffix to limit them to just IPv4 or IPv6 addresses respectively. For example, network.host: "_en0:ipv4_" would set this node’s addresses to the IPv4 addresses of interface en0.
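
The special values and the quoting rule above could be sketched in elasticsearch.yml as follows. The interface name en0 comes from the example in the text. The commented-out alternatives are illustrative assumptions, and only one network.host line should be active at a time.

```yaml
# elasticsearch.yml (illustrative sketch)
network.host: "_en0:ipv4_"      # IPv4 addresses of interface en0; quoted because it contains ':'

# Alternatives, commented out so that only one network.host is active:
# network.host: _site_          # any site-local address; contains no ':' so quotes are optional
# network.host: "_site:ipv6_"   # site-local IPv6 addresses only
# network.host: "::1"           # a literal IPv6 loopback address, quoted for YAML
```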

Discovery in the Cloud

More special settings are available when running in the Cloud with either the EC2 discovery plugin or the Google Compute Engine discovery plugin installed.

Binding and publishing

Elasticsearch uses network addresses for two distinct purposes known as binding and publishing. Most nodes will use the same address for everything, but more complicated setups may need to configure different addresses for different purposes.

When an application such as Elasticsearch wishes to receive network communications, it must indicate to the operating system the address or addresses whose traffic it should receive. This is known as binding to those addresses. Elasticsearch can bind to more than one address if needed, but most nodes only bind to a single address. Elasticsearch can only bind to an address if it is running on a host that has a network interface with that address. If necessary, you can configure the transport and HTTP interfaces to bind to different addresses.

Each Elasticsearch node has an address at which clients and other nodes can contact it, known as its publish address. Each node has one publish address for its HTTP interface and one for its transport interface. These two addresses can be anything, and don’t need to be addresses of the network interfaces on the host. The only requirements are that each node must be:

Accessible at its HTTP publish address by all clients that will discover it using sniffing.
Accessible at its transport publish address by all other nodes in its cluster, and by any remote clusters that will discover it using Sniff mode.

If you specify the transport publish address using a hostname, Elasticsearch will resolve this hostname to an IP address once during startup, and other nodes will use the resulting IP address instead of resolving the name again themselves. To avoid confusion, use a hostname that resolves to the node’s address in all network locations.

Using a single address

The most common configuration is for Elasticsearch to bind to a single address at which it is accessible to clients and other nodes. In this configuration you should just set network.host to that address. You should not separately set any bind or publish addresses, nor should you separately configure the addresses for the HTTP or transport interfaces.

Using multiple addresses

Use the advanced network settings if you wish to bind Elasticsearch to multiple addresses, or to publish a different address from the addresses to which you are binding. Set network.bind_host to the bind addresses, and network.publish_host to the address at which this node is exposed. In complex configurations, you can configure these addresses differently for the HTTP and transport interfaces.

Advanced network settings

These advanced settings let you bind to multiple addresses, or to use different addresses for binding and publishing. They are not required in most cases and you should not use them if you can use the commonly used settings instead.

network.bind_host: (Static, string) The network address(es) to which the node should bind in order to listen for incoming connections. Accepts a list of IP addresses, hostnames, and special values. Defaults to the address given by network.host. Use this setting only if binding to multiple addresses or using different addresses for publishing and binding.
network.publish_host: (Static, string) The network address that clients and other nodes can use to contact this node. Accepts an IP address, a hostname, or a special value. Defaults to the address given by network.host. Use this setting only if binding to multiple addresses or using different addresses for publishing and binding.

You can specify a list of addresses for network.host and network.publish_host. You can also specify one or more hostnames or special values that resolve to multiple addresses. If you do this then Elasticsearch chooses one of the addresses for its publish address. This choice uses heuristics based on IPv4/IPv6 stack preference and reachability and may change when the node restarts. Ensure each node is accessible at all possible publish addresses.
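
A minimal sketch of binding to several addresses while publishing only one of them might look like this. Both addresses are hypothetical and stand in for whatever interfaces the host actually has.

```yaml
# elasticsearch.yml (illustrative sketch; both addresses are hypothetical)
network.bind_host: ["127.0.0.1", "192.168.1.10"]  # listen on loopback and on one site-local address
network.publish_host: 192.168.1.10                # the single address advertised to clients and other nodes
```

Setting network.publish_host to one explicit address avoids the heuristic selection described above, so the publish address should stay the same across restarts.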

Advanced TCP settings

Use the following settings to control the low-level parameters of the TCP connections used by the HTTP and transport interfaces.

network.tcp.keep_alive: (Static, boolean) Configures the SO_KEEPALIVE option for network sockets, which determines whether each connection sends TCP keepalive probes. Defaults to true.
network.tcp.keep_idle: (Static, integer) Configures the TCP_KEEPIDLE option for network sockets, which determines the time in seconds that a connection must be idle before starting to send TCP keepalive probes. Defaults to -1, which means to use the system default. This value cannot exceed 300 seconds. Only applicable on Linux and macOS.
network.tcp.keep_interval: (Static, integer) Configures the TCP_KEEPINTVL option for network sockets, which determines the time in seconds between sending TCP keepalive probes. Defaults to -1, which means to use the system default. This value cannot exceed 300 seconds. Only applicable on Linux and macOS.
network.tcp.keep_count: (Static, integer) Configures the TCP_KEEPCNT option for network sockets, which determines the number of unacknowledged TCP keepalive probes that may be sent on a connection before it is dropped. Defaults to -1, which means to use the system default. Only applicable on Linux and macOS.
network.tcp.no_delay: (Static, boolean) Configures the TCP_NODELAY option on network sockets, which determines whether TCP no delay is enabled. Defaults to true.
network.tcp.reuse_address: (Static, boolean) Configures the SO_REUSEADDR option for network sockets, which determines whether the address can be reused or not. Defaults to false on Windows and true otherwise.
network.tcp.send_buffer_size: (Static, byte value) Configures the size of the TCP send buffer for network sockets. Defaults to -1 which means to use the system default.
network.tcp.receive_buffer_size: (Static, byte value) Configures the size of the TCP receive buffer. Defaults to -1 which means to use the system default.
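
As a hedged illustration, the keepalive settings above could be combined as shown here. The numeric values are arbitrary examples, not recommendations, and were chosen only to stay within the documented 300-second limit.

```yaml
# elasticsearch.yml (illustrative sketch; numeric values are arbitrary examples)
network.tcp.keep_alive: true     # send TCP keepalive probes (this is the default)
network.tcp.keep_idle: 60        # seconds a connection must be idle before probing; cannot exceed 300
network.tcp.keep_interval: 10    # seconds between probes; cannot exceed 300
network.tcp.keep_count: 6        # unacknowledged probes allowed before the connection is dropped
```

The last three settings are only applicable on Linux and macOS. Leaving them at -1 uses the system defaults.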

Advanced HTTP settings

Use the following advanced settings to configure the HTTP interface independently of the transport interface. You can also configure both interfaces together using the network settings.

http.host

(Static, string) Sets the address of this node for HTTP traffic. The node will bind to this address and will also use it as its HTTP publish address. Accepts an IP address, a hostname, or a special value. Use this setting only if you require different configurations for the transport and HTTP interfaces.

Defaults to the address given by network.host.

http.bind_host

(Static, string) The network address(es) to which the node should bind in order to listen for incoming HTTP connections. Accepts a list of IP addresses, hostnames, and special values. Defaults to the address given by http.host or network.bind_host. Use this setting only if you need to bind to multiple addresses or to use different addresses for publishing and binding, and you also require different binding configurations for the transport and HTTP interfaces.

http.publish_host

(Static, string) The network address for HTTP clients to contact the node using sniffing. Accepts an IP address, a hostname, or a special value. Defaults to the address given by http.host or network.publish_host. Use this setting only if you need to bind to multiple addresses or to use different addresses for publishing and binding, and you also need different binding configurations for the transport and HTTP interfaces.

http.publish_port

(Static, integer) The port of the HTTP publish address. Configure this setting only if you need the publish port to be different from http.port. Defaults to the port assigned via http.port.

http.max_content_length

(Static, byte value) Maximum size of an HTTP request body. If the body is compressed, the limit applies to the HTTP request body size before compression. Defaults to 100mb. Configuring this setting to greater than 100mb can cause cluster instability and is not recommended. If you hit this limit when sending a request to the Bulk API, configure your client to send fewer documents in each bulk request. If you wish to index individual documents that exceed 100mb, pre-process them into smaller documents before sending them to Elasticsearch. For instance, store the raw data in a system outside Elasticsearch and include a link to the raw data in the documents that Elasticsearch indexes.
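
As a rough illustration of sending fewer documents in each bulk request, the following Python sketch groups documents into newline-delimited bulk bodies that each stay under a size limit. The helper name chunk_bulk_actions, the index name my-index, and the 500-byte limit are hypothetical and chosen only to keep the example small. Sizes are measured on the uncompressed body, because the limit applies before compression.

```python
import json


def chunk_bulk_actions(docs, index, max_bytes):
    """Yield bulk request bodies (NDJSON), each at most max_bytes long.

    Hypothetical helper: `docs` is an iterable of dicts, `index` is the
    target index name, and `max_bytes` is the size limit to stay under.
    """
    batch, batch_size = [], 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index}})
        source = json.dumps(doc)
        # Each document contributes an action line and a source line.
        entry = (action + "\n" + source + "\n").encode("utf-8")
        if len(entry) > max_bytes:
            raise ValueError("a single document exceeds the size limit")
        if batch and batch_size + len(entry) > max_bytes:
            yield b"".join(batch)
            batch, batch_size = [], 0
        batch.append(entry)
        batch_size += len(entry)
    if batch:
        yield b"".join(batch)


docs = [{"id": i, "message": "x" * 100} for i in range(10)]
bodies = list(chunk_bulk_actions(docs, "my-index", max_bytes=500))
for body in bodies:
    print(len(body), "bytes,", body.count(b"\n") // 2, "documents")
```

A document that is larger than the limit on its own cannot be helped by batching, which is why the sketch raises an error in that case rather than sending it.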

http.max_initial_line_length

(Static, byte value) Maximum size of an HTTP URL. Defaults to 4kb.

http.max_header_size

(Static, byte value) Maximum size of allowed headers. Defaults to 16kb.

http.compression

(Static, boolean) Support for compression when possible (with Accept-Encoding). If HTTPS is enabled, defaults to false. Otherwise, defaults to true.

Disabling compression for HTTPS mitigates potential security risks, such as a BREACH attack. To compress HTTPS traffic, you must explicitly set http.compression to true.

http.compression_level

(Static, integer) Defines the compression level to use for HTTP responses. Valid values range from 1 (minimum compression) to 9 (maximum compression). Defaults to 3.

http.cors.enabled: (Static, boolean) Enable or disable cross-origin resource sharing, which determines whether a browser on another origin can execute requests against Elasticsearch. Set to true to enable Elasticsearch to process pre-flight CORS requests. Elasticsearch will respond to those requests with the Access-Control-Allow-Origin header if the Origin sent in the request is permitted by the http.cors.allow-origin list. Set to false (the default) to make Elasticsearch ignore the Origin request header, effectively disabling CORS requests because Elasticsearch will never respond with the Access-Control-Allow-Origin response header.

If the client does not send a pre-flight request with an Origin header or it does not check the response headers from the server to validate the Access-Control-Allow-Origin response header, then cross-origin security is compromised. If CORS is not enabled on Elasticsearch, the only way for the client to know is to send a pre-flight request and realize the required response headers are missing.

http.cors.allow-origin: (Static, string) Which origins to allow. If you prepend and append a forward slash (/) to the value, this will be treated as a regular expression, allowing you to support HTTP and HTTPS. For example, using /https?:\/\/localhost(:[0-9]+)?/ would return the request header appropriately in both cases. Defaults to no origins allowed.

A wildcard (*) is a valid value but is considered a security risk, as your Elasticsearch instance is open to cross-origin requests from anywhere.
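
To see which origins the example expression accepts, the following Python sketch applies the same pattern, without the surrounding slashes, to a few Origin values. It assumes the whole Origin value must match the expression. Python's regular expression syntax is used here as a stand-in, so treat it as an approximation rather than a description of how Elasticsearch evaluates the setting.

```python
import re

# The example pattern, without the leading and trailing forward slashes
# that mark the setting value as a regular expression.
ALLOW_ORIGIN = r"https?:\/\/localhost(:[0-9]+)?"


def origin_allowed(origin, pattern=ALLOW_ORIGIN):
    # Assumption: the entire Origin value has to match, not just a prefix.
    return re.fullmatch(pattern, origin) is not None


for origin in (
    "http://localhost",
    "https://localhost:5601",
    "https://localhost.example.com",
    "http://example.com",
):
    print(origin, origin_allowed(origin))
```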

http.cors.max-age: (Static, integer) Browsers send a "preflight" OPTIONS-request to determine CORS settings. max-age defines for how long, in seconds, the result should be cached. Defaults to 1728000 (20 days).

http.cors.allow-methods: (Static, string) Which methods to allow. Defaults to OPTIONS, HEAD, GET, POST, PUT, DELETE.

http.cors.allow-headers: (Static, string) Which headers to allow. Defaults to X-Requested-With, Content-Type, Content-Length, Authorization, Accept, User-Agent, X-Elastic-Client-Meta.

http.cors.expose-headers: (Static) Which response headers to expose in the client. Defaults to X-elastic-product.

http.cors.allow-credentials: (Static, boolean) Whether the Access-Control-Allow-Credentials header should be returned. Defaults to false.

This header is only returned when the setting is set to true.

http.detailed_errors.enabled: (Static, boolean) Configures whether detailed error reporting in HTTP responses is enabled. Defaults to true, which means that HTTP requests that include the ?error_trace parameter will return a detailed error message including a stack trace if they encounter an exception. If set to false, requests with the ?error_trace parameter are rejected.
http.pipelining.max_events: (Static, integer) The maximum number of events to be queued up in memory before an HTTP connection is closed. Defaults to 10000.
http.max_warning_header_count: (Static, integer) The maximum number of warning headers in client HTTP responses. Defaults to -1 which means the number of warning headers is unlimited.
http.max_warning_header_size: (Static, byte value) The maximum total size of warning headers in client HTTP responses. Defaults to -1 which means the size of the warning headers is unlimited.
http.tcp.keep_alive: (Static, boolean) Configures the SO_KEEPALIVE option for this socket, which determines whether it sends TCP keepalive probes. Defaults to network.tcp.keep_alive.
http.tcp.keep_idle: (Static, integer) Configures the TCP_KEEPIDLE option for HTTP sockets, which determines the time in seconds that a connection must be idle before starting to send TCP keepalive probes. Defaults to network.tcp.keep_idle, which uses the system default. This value cannot exceed 300 seconds. Only applicable on Linux and macOS.
http.tcp.keep_interval: (Static, integer) Configures the TCP_KEEPINTVL option for HTTP sockets, which determines the time in seconds between sending TCP keepalive probes. Defaults to network.tcp.keep_interval, which uses the system default. This value cannot exceed 300 seconds. Only applicable on Linux and macOS.
http.tcp.keep_count: (Static, integer) Configures the TCP_KEEPCNT option for HTTP sockets, which determines the number of unacknowledged TCP keepalive probes that may be sent on a connection before it is dropped. Defaults to network.tcp.keep_count, which uses the system default. Only applicable on Linux and macOS.
http.tcp.no_delay: (Static, boolean) Configures the TCP_NODELAY option on HTTP sockets, which determines whether TCP no delay is enabled. Defaults to true.
http.tcp.reuse_address: (Static, boolean) Configures the SO_REUSEADDR option for HTTP sockets, which determines whether the address can be reused or not. Defaults to false on Windows and true otherwise.
http.tcp.send_buffer_size: (Static, byte value) The size of the TCP send buffer for HTTP traffic. Defaults to network.tcp.send_buffer_size.
http.tcp.receive_buffer_size: (Static, byte value) The size of the TCP receive buffer for HTTP traffic. Defaults to network.tcp.receive_buffer_size.
http.client_stats.enabled: (Dynamic, boolean) Enable or disable collection of HTTP client stats. Defaults to true.
http.client_stats.closed_channels.max_count: (Static, integer) When http.client_stats.enabled is true, sets the maximum number of closed HTTP channels for which Elasticsearch reports statistics. Defaults to 10000.
http.client_stats.closed_channels.max_age: (Static, time value) When http.client_stats.enabled is true, sets the maximum length of time after closing an HTTP channel that Elasticsearch will report that channel’s statistics. Defaults to 5m.

Advanced transport settings

Use the following advanced settings to configure the transport interface independently of the HTTP interface. Use the network settings to configure both interfaces together.

transport.host

(Static, string) Sets the address of this node for transport traffic. The node will bind to this address and will also use it as its transport publish address. Accepts an IP address, a hostname, or a special value. Use this setting only if you require different configurations for the transport and HTTP interfaces.

Defaults to the address given by network.host.

transport.bind_host

(Static, string) The network address(es) to which the node should bind in order to listen for incoming transport connections. Accepts a list of IP addresses, hostnames, and special values. Defaults to the address given by transport.host or network.bind_host. Use this setting only if you need to bind to multiple addresses or to use different addresses for publishing and binding, and you also need different binding configurations for the transport and HTTP interfaces.

transport.publish_host

(Static, string) The network address at which the node can be contacted by other nodes. Accepts an IP address, a hostname, or a special value. Defaults to the address given by transport.host or network.publish_host. Use this setting only if you need to bind to multiple addresses or to use different addresses for publishing and binding, and you also need different binding configurations for the transport and HTTP interfaces.

transport.publish_port

(Static, integer) The port of the transport publish address. Set this parameter only if you need the publish port to be different from transport.port. Defaults to the port assigned via transport.port.

transport.connect_timeout

(Static, time value) The connect timeout for initiating a new connection (in time setting format). Defaults to 30s.

transport.compress

(Static, string) Set to true, indexing_data, or false to configure transport compression between nodes. The option true will compress all data. The option indexing_data will compress only the raw index data sent between nodes during ingest, CCR following (excluding bootstrap), and operations-based shard recovery (excluding transferring Lucene files). Defaults to indexing_data.

transport.compression_scheme

(Static, string) Configures the compression scheme for transport.compress. The options are deflate or lz4. If lz4 is configured and the remote node has not been upgraded to a version supporting lz4, the traffic will be sent uncompressed. Defaults to lz4.

transport.tcp.keep_alive

(Static, boolean) Configures the SO_KEEPALIVE option for transport sockets, which determines whether they send TCP keepalive probes. Defaults to network.tcp.keep_alive.

transport.tcp.keep_idle

(Static, integer) Configures the TCP_KEEPIDLE option for transport sockets, which determines the time in seconds that a connection must be idle before starting to send TCP keepalive probes. Defaults to network.tcp.keep_idle if set, or the system default otherwise. This value cannot exceed 300 seconds. In cases where the system default is higher than 300, the value is automatically lowered to 300. Only applicable on Linux and macOS.

transport.tcp.keep_interval

(Static, integer) Configures the TCP_KEEPINTVL option for transport sockets, which determines the time in seconds between sending TCP keepalive probes. Defaults to network.tcp.keep_interval if set, or the system default otherwise. This value cannot exceed 300 seconds. In cases where the system default is higher than 300, the value is automatically lowered to 300. Only applicable on Linux and macOS.

transport.tcp.keep_count

(Static, integer) Configures the TCP_KEEPCNT option for transport sockets, which determines the number of unacknowledged TCP keepalive probes that may be sent on a connection before it is dropped. Defaults to network.tcp.keep_count if set, or the system default otherwise. Only applicable on Linux and macOS.

transport.tcp.no_delay

(Static, boolean) Configures the TCP_NODELAY option on transport sockets, which determines whether TCP no delay is enabled. Defaults to true.

transport.tcp.reuse_address

(Static, boolean) Configures the SO_REUSEADDR option for transport sockets, which determines whether the address can be reused or not. Defaults to network.tcp.reuse_address.

transport.tcp.send_buffer_size

(Static, byte value) The size of the TCP send buffer for transport traffic. Defaults to network.tcp.send_buffer_size.

transport.tcp.receive_buffer_size

(Static, byte value) The size of the TCP receive buffer for transport traffic. Defaults to network.tcp.receive_buffer_size.

transport.ping_schedule

(Static, time value) Configures the time between sending application-level pings on all transport connections to promptly detect when a transport connection has failed. Defaults to -1 meaning that application-level pings are not sent. You should use TCP keepalives (see transport.tcp.keep_alive) instead of application-level pings wherever possible.

Transport profiles

Elasticsearch allows you to bind to multiple ports on different interfaces by the use of transport profiles. See this example configuration:

transport.profiles.default.port: 9300-9400
transport.profiles.default.bind_host: 10.0.0.1
transport.profiles.client.port: 9500-9600
transport.profiles.client.bind_host: 192.168.0.1
transport.profiles.dmz.port: 9700-9800
transport.profiles.dmz.bind_host: 172.16.1.2

The default profile is special. It is used as a fallback for any other profiles, if those do not have a specific configuration setting set, and is how this node connects to other nodes in the cluster. Other profiles can have any name and can be used to set up specific endpoints for incoming connections.

The following parameters can be configured on each transport profile, as in the example above:

port: The port to which to bind.
bind_host: The host to which to bind.
publish_host: The host which is published in informational APIs.

Profiles also support all the other transport settings specified in the transport settings section, and use these as defaults. For example, transport.profiles.client.tcp.reuse_address can be explicitly configured, and defaults otherwise to transport.tcp.reuse_address.
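
The fallback described above can be sketched as a simple lookup. The function resolve_profile_setting and the settings dictionary below are hypothetical and model only the rule stated in this paragraph: a profile-specific value wins, otherwise the corresponding transport.* value is used.

```python
def resolve_profile_setting(settings, profile, key):
    """Return the profile-specific value if set, else the transport-wide one.

    Hypothetical helper: `settings` is a flat dict of setting names to values.
    """
    profile_key = f"transport.profiles.{profile}.{key}"
    if profile_key in settings:
        return settings[profile_key]
    return settings.get(f"transport.{key}")


settings = {
    "transport.tcp.reuse_address": True,
    "transport.profiles.client.port": "9500-9600",
    "transport.profiles.dmz.tcp.reuse_address": False,
}

# The client profile does not set tcp.reuse_address, so it falls back.
print(resolve_profile_setting(settings, "client", "tcp.reuse_address"))
# The dmz profile sets it explicitly.
print(resolve_profile_setting(settings, "dmz", "tcp.reuse_address"))
```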

Long-lived idle connections

A transport connection between two nodes is made up of a number of long-lived TCP connections, some of which may be idle for an extended period of time. Nonetheless, Elasticsearch requires these connections to remain open, and it can disrupt the operation of your cluster if any inter-node connections are closed by an external influence such as a firewall. It is important to configure your network to preserve long-lived idle connections between Elasticsearch nodes, for instance by leaving *.tcp.keep_alive enabled and ensuring that the keepalive interval is shorter than any timeout that might cause idle connections to be closed, or by setting transport.ping_schedule if keepalives cannot be configured. Devices which drop connections when they reach a certain age are a common source of problems for Elasticsearch clusters, and must not be used.
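
As a back-of-the-envelope check, the following Python sketch compares the keepalive idle time with the idle timeout of an intermediate device. It assumes that a configured value of -1 means the system default is used, and it applies the 300 second cap described for transport.tcp.keep_idle. The function names and the example values (a 7200 second system default and 600 or 120 second device timeouts) are illustrative only.

```python
MAX_KEEPALIVE_SECONDS = 300  # upper bound documented for keep_idle


def effective_keep_idle(configured, system_default):
    """Return the idle time, in seconds, before the first keepalive probe.

    Assumption: a configured value of -1 means "use the system default",
    which is lowered to 300 seconds if it is higher than that.
    """
    if configured == -1:
        return min(system_default, MAX_KEEPALIVE_SECONDS)
    if configured > MAX_KEEPALIVE_SECONDS:
        raise ValueError("keep_idle cannot exceed 300 seconds")
    return configured


def survives_idle_timeout(keep_idle, idle_timeout):
    """True if a probe is sent before the device's idle timeout expires."""
    return keep_idle < idle_timeout


keep_idle = effective_keep_idle(configured=-1, system_default=7200)
print(keep_idle, survives_idle_timeout(keep_idle, idle_timeout=600))
print(keep_idle, survives_idle_timeout(keep_idle, idle_timeout=120))
```

In the second case the device would close the connection before the first probe is sent, so a lower keep_idle value would be needed.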

Request compression

The default transport.compress configuration option indexing_data will only compress requests that relate to the transport of raw indexing source data between nodes. This option primarily compresses data sent during ingest, CCR, and shard recovery. This default normally makes sense for local cluster communication as compressing raw documents tends to significantly reduce inter-node network usage with minimal CPU impact.

The transport.compress setting always configures local cluster request compression and is the fallback setting for remote cluster request compression. If you want to configure remote request compression differently than local request compression, you can set it on a per-remote cluster basis using the cluster.remote.${cluster_alias}.transport.compress setting.

Response compression

The compression settings do not configure compression for responses. Elasticsearch will compress a response if the inbound request was compressed—even when compression is not enabled. Similarly, Elasticsearch will not compress a response if the inbound request was uncompressed—even when compression is enabled. The compression scheme used to compress a response will be the same scheme the remote node used to compress the request.
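
The rule above amounts to the response mirroring the request. A minimal Python sketch, using a hypothetical function response_compression, makes the point that the local compression setting plays no part in the decision:

```python
def response_compression(request_scheme, compress_setting):
    """Return the scheme used for the response, or None if uncompressed.

    request_scheme: None, "deflate", or "lz4", as used by the inbound request.
    compress_setting: the local transport.compress value. It is deliberately
    ignored, because responses mirror the request.
    """
    return request_scheme


for scheme in (None, "deflate", "lz4"):
    for setting in ("true", "indexing_data", "false"):
        print(scheme, setting, "->", response_compression(scheme, setting))
```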

Request tracing

You can trace individual requests made on the HTTP and transport layers.

Tracing can generate extremely high log volumes that can destabilize your cluster. Do not enable request tracing on busy or important clusters.

REST request tracer

The HTTP layer has a dedicated tracer that logs incoming requests and the corresponding outgoing responses. Activate the tracer by setting the level of the org.elasticsearch.http.HttpTracer logger to TRACE:

response = client.cluster.put_settings(
  body: {
    persistent: {
      "logger.org.elasticsearch.http.HttpTracer": 'TRACE'
    }
  }
)
puts response

PUT _cluster/settings
{
   "persistent" : {
      "logger.org.elasticsearch.http.HttpTracer" : "TRACE"
   }
}

You can also control which URIs will be traced, using a set of include and exclude wildcard patterns. By default every request will be traced.

response = client.cluster.put_settings(
  body: {
    persistent: {
      "http.tracer.include": '*',
      "http.tracer.exclude": ''
    }
  }
)
puts response

PUT _cluster/settings
{
   "persistent" : {
      "http.tracer.include" : "*",
      "http.tracer.exclude" : ""
   }
}

By default, the tracer logs a summary of each request and response which matches these filters. To record the body of each request and response too, set the system property es.insecure_network_trace_enabled to true, and then set the levels of both the org.elasticsearch.http.HttpTracer and org.elasticsearch.http.HttpBodyTracer loggers to TRACE:

response = client.cluster.put_settings(
  body: {
    persistent: {
      "logger.org.elasticsearch.http.HttpTracer": 'TRACE',
      "logger.org.elasticsearch.http.HttpBodyTracer": 'TRACE'
    }
  }
)
puts response

PUT _cluster/settings
{
   "persistent" : {
      "logger.org.elasticsearch.http.HttpTracer" : "TRACE",
      "logger.org.elasticsearch.http.HttpBodyTracer" : "TRACE"
   }
}

Each message body is compressed, encoded, and split into chunks to avoid truncation:

[TRACE][o.e.h.HttpBodyTracer     ] [master] [276] response body [part 1]: H4sIAAAAAAAA/9...
[TRACE][o.e.h.HttpBodyTracer     ] [master] [276] response body [part 2]: 2oJ93QyYLWWhcD...
[TRACE][o.e.h.HttpBodyTracer     ] [master] [276] response body (gzip compressed, base64-encoded, and split into 2 parts on preceding log lines)

Each chunk is annotated with an internal request ID ([276] in this example) which you should use to correlate the chunks with the corresponding summary lines. To reconstruct the output, base64-decode the data and decompress it using gzip. For instance, on Unix-like systems:

cat httptrace.log | sed -e 's/.*://' | base64 --decode | gzip --decompress
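
The same reconstruction can be sketched in Python, which also lets you select the chunks for a single request ID. The regular expression below is based on the log format shown above. The function name reconstruct_body is hypothetical, and the assumption that request bodies are logged with the words "request body" in the same layout is not confirmed by the example.

```python
import base64
import gzip
import re

# Matches e.g. "[276] response body [part 1]: H4sIAAAA..."
PART_RE = re.compile(r"\[(\d+)\] (request|response) body \[part (\d+)\]: (\S+)")


def reconstruct_body(log_lines, request_id, kind="response"):
    """Join, base64-decode, and gunzip the chunks logged for one request ID."""
    parts = {}
    for line in log_lines:
        match = PART_RE.search(line)
        if match and int(match.group(1)) == request_id and match.group(2) == kind:
            parts[int(match.group(3))] = match.group(4)
    # Concatenate the chunks in part order before decoding.
    encoded = "".join(parts[number] for number in sorted(parts))
    return gzip.decompress(base64.b64decode(encoded))


# Build sample log lines from a known payload so the sketch is self-contained.
payload = b'{"acknowledged":true}'
encoded = base64.b64encode(gzip.compress(payload)).decode("ascii")
half = len(encoded) // 2
prefix = "[TRACE][o.e.h.HttpBodyTracer     ] [master] [276] response body"
lines = [
    f"{prefix} [part 1]: {encoded[:half]}",
    f"{prefix} [part 2]: {encoded[half:]}",
    f"{prefix} (gzip compressed, base64-encoded, and split into 2 parts on preceding log lines)",
]
print(reconstruct_body(lines, 276))
```

The trailing summary line contains no chunk data, so the pattern skips it.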

HTTP request and response bodies may contain sensitive information such as credentials and keys, so HTTP body tracing is disabled by default. You must explicitly enable it on each node by setting the system property es.insecure_network_trace_enabled to true. This feature is primarily intended for test systems which do not contain any sensitive information. If you set this property on a system which contains sensitive information, you must protect your logs from unauthorized access.

Transport tracer

The transport layer has a dedicated tracer that logs incoming and outgoing requests and responses. Activate the tracer by setting the level of the org.elasticsearch.transport.TransportService.tracer logger to TRACE:

response = client.cluster.put_settings(
  body: {
    persistent: {
      "logger.org.elasticsearch.transport.TransportService.tracer": 'TRACE'
    }
  }
)
puts response

PUT _cluster/settings
{
   "persistent" : {
      "logger.org.elasticsearch.transport.TransportService.tracer" : "TRACE"
   }
}

You can also control which actions will be traced, using a set of include and exclude wildcard patterns. By default every request will be traced except for fault detection pings:

response = client.cluster.put_settings(
  body: {
    persistent: {
      "transport.tracer.include": '*',
      "transport.tracer.exclude": 'internal:coordination/fault_detection/*'
    }
  }
)
puts response

PUT _cluster/settings
{
   "persistent" : {
      "transport.tracer.include" : "*",
      "transport.tracer.exclude" : "internal:coordination/fault_detection/*"
   }
}
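
The include and exclude filters used by both tracers can be pictured with a short Python sketch. It treats * as matching any run of characters, and traces a name when it matches an include pattern and no exclude pattern. The helper names are hypothetical, and the exact matching rules, including the treatment of an empty include list as matching everything, are assumptions rather than a description of the Elasticsearch implementation.

```python
import re


def simple_match(pattern, text):
    """Match text against a pattern in which * matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


def should_trace(name, include, exclude):
    """Return True if name passes the include list and misses the exclude list."""
    included = not include or any(simple_match(p, name) for p in include)
    excluded = any(simple_match(p, name) for p in exclude)
    return included and not excluded


exclude = ["internal:coordination/fault_detection/*"]
for action in (
    "internal:coordination/fault_detection/leader_check",
    "indices:data/write/bulk",
):
    print(action, should_trace(action, include=["*"], exclude=exclude))
```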

Networking threading model

This section describes the threading model used by the networking subsystem in Elasticsearch. This information isn’t required to use Elasticsearch, but it may be useful to advanced users who are diagnosing network problems in a cluster.

Elasticsearch nodes communicate over a collection of TCP channels that together form a transport connection. Elasticsearch clients communicate with the cluster over HTTP, which also uses one or more TCP channels. Each of these TCP channels is owned by exactly one of the transport_worker threads in the node. This owning thread is chosen when the channel is opened and remains the same for the lifetime of the channel.

Each transport_worker thread has sole responsibility for sending and receiving data over the channels it owns. Additionally, each HTTP and transport server socket is assigned to one of the transport_worker threads. That worker has the responsibility of accepting new incoming connections to the server socket it owns.

If a thread in Elasticsearch wants to send data over a particular channel, it passes the data to the owning transport_worker thread for the actual transmission.

Normally the transport_worker threads will not completely handle the messages they receive. Instead, they will do a small amount of preliminary processing and then dispatch (hand off) the message to a different threadpool for the rest of its handling. For instance, bulk messages are dispatched to the write threadpool, searches are dispatched to one of the search threadpools, and requests for statistics and other management tasks are mostly dispatched to the management threadpool. However, in some cases the processing of a message is expected to be so quick that Elasticsearch will do all of the processing on the transport_worker thread rather than incur the overhead of dispatching it elsewhere.

By default, there is one transport_worker thread per CPU. In contrast, there may sometimes be tens of thousands of TCP channels. If data arrives on a TCP channel and its owning transport_worker thread is busy, the data isn’t processed until the thread finishes whatever it is doing. Similarly, outgoing data are not sent over a channel until the owning transport_worker thread is free. This means that we require every transport_worker thread to be idle frequently. An idle transport_worker looks something like this in a stack dump:

"elasticsearch[instance-0000000004][transport_worker][T#1]" #32 daemon prio=5 os_prio=0 cpu=9645.94ms elapsed=501.63s tid=0x00007fb83b6307f0 nid=0x1c4 runnable  [0x00007fb7b8ffe000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPoll.wait(java.base@17.0.2/Native Method)
	at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17.0.2/EPollSelectorImpl.java:118)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17.0.2/SelectorImpl.java:129)
	- locked <0x00000000c443c518> (a sun.nio.ch.Util$2)
	- locked <0x00000000c38f7700> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(java.base@17.0.2/SelectorImpl.java:146)
	at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:813)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.lang.Thread.run(java.base@17.0.2/Thread.java:833)

In the Nodes hot threads API an idle transport_worker thread is reported like this:

   0.0% [cpu=0.0%, idle=100.0%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000004][transport_worker][T#1]'
     10/10 snapshots sharing following 9 elements
       java.base@17.0.2/sun.nio.ch.EPoll.wait(Native Method)
       java.base@17.0.2/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:118)
       java.base@17.0.2/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:129)
       java.base@17.0.2/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:146)
       io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:813)
       io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@17.0.2/java.lang.Thread.run(Thread.java:833)

Note that transport_worker threads should always be in state RUNNABLE, even when waiting for input, because they block in the native EPoll#wait method. The idle= time reports the proportion of time the thread spent waiting for input, whereas the cpu= time reports the proportion of time the thread spent processing input it has received. A thread that spent the whole sampling interval waiting for input is reported as 0.0% [cpu=0.0%, idle=100.0%], as in the example above.

If a transport_worker thread is not frequently idle, it may build up a backlog of work. This can cause delays in processing messages on the channels that it owns. It’s hard to predict exactly which work will be delayed:

There are many more channels than threads. If work related to one channel is causing delays to its worker thread, all other channels owned by that thread will also suffer delays.
The mapping from TCP channels to worker threads is fixed but arbitrary. Each channel is assigned an owning thread in a round-robin fashion when the channel is opened. Each worker thread is responsible for many different kinds of channel.
There are many channels open between each pair of nodes. For each request, Elasticsearch will choose from the appropriate channels in a round-robin fashion. Some requests may end up on a channel owned by a delayed worker while other identical requests will be sent on a channel that’s working smoothly.
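
The effect of a fixed, round-robin assignment can be illustrated with a toy model in Python. The channel names, the worker count, and the helper functions are hypothetical. The sketch only shows that one busy worker delays every channel it owns, whatever kind of traffic those channels carry.

```python
def assign_channels(channel_ids, worker_count):
    """Assign each channel an owning worker in round-robin order."""
    return {channel: i % worker_count for i, channel in enumerate(channel_ids)}


def delayed_channels(owners, busy_worker):
    """Return the channels whose owning worker is currently busy."""
    return [channel for channel, worker in owners.items() if worker == busy_worker]


channels = [f"ch-{i}" for i in range(12)]
owners = assign_channels(channels, worker_count=4)

# If worker 1 is stuck on slow work, all of its channels are delayed.
print(delayed_channels(owners, busy_worker=1))
```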

If the backlog builds up too far, some messages may be delayed by many seconds. The node might even fail its health checks and be removed from the cluster. Sometimes, you can find evidence of busy transport_worker threads using the Nodes hot threads API. However, this API itself sends network messages, so it may not work correctly if the transport_worker threads are too busy. It is more reliable to use jstack to obtain stack dumps or use Java Flight Recorder to obtain a profiling trace. These tools are independent of any work the JVM is performing.
