IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Histogram aggregation

A multi-bucket, values-source-based aggregation that can be applied to numeric values or numeric range values extracted from the documents. It dynamically builds fixed-size (a.k.a. interval) buckets over the values. For example, if the documents have a numeric price field, we can configure this aggregation to build buckets with an interval of 5 (for a price, this may represent $5). When the aggregation executes, the price field of every document is evaluated and rounded down to its closest bucket. For example, if the price is 32 and the bucket size is 5, the rounding yields 30, so the document "falls" into the bucket associated with the key 30. More formally, this is the rounding function that is used:

bucket_key = Math.floor((value - offset) / interval) * interval + offset
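
As a rough illustration, the rounding function above can be written as a small Python helper. The name bucket_key and the argument checks are choices made for this sketch; it is not Elasticsearch's own implementation.

```python
import math

def bucket_key(value: float, interval: float, offset: float = 0.0) -> float:
    """Round value down to the key of the bucket that contains it."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    if not 0 <= offset < interval:
        raise ValueError("offset must be in [0, interval)")
    # Same formula as above: floor((value - offset) / interval) * interval + offset
    return float(math.floor((value - offset) / interval) * interval + offset)

print(bucket_key(32, 5))     # 30.0
print(bucket_key(32, 5, 3))  # 28.0
print(bucket_key(-1, 5))     # -5.0 (floor rounds towards negative infinity)
```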

For range values, a document can fall into multiple buckets. The first bucket is computed from the lower bound of the range in the same way as a bucket for a single value is computed. The final bucket is computed in the same way from the upper bound of the range, and the range is counted in all buckets in between and including those two.
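
The rule for range values can be sketched the same way. The helper below is a hypothetical Python function written for this illustration: it computes the first bucket from the lower bound, the last bucket from the upper bound, and returns every key from the first to the last.

```python
import math

def range_bucket_keys(lower, upper, interval, offset=0.0):
    """Keys of every bucket that a range value [lower, upper] is counted in."""
    first = math.floor((lower - offset) / interval)  # bucket index of the lower bound
    last = math.floor((upper - offset) / interval)   # bucket index of the upper bound
    return [float(i * interval + offset) for i in range(first, last + 1)]

# A range covering 50 to 150 with an interval of 50 is counted in three buckets.
print(range_bucket_keys(50, 150, 50))  # [50.0, 100.0, 150.0]
```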

The interval must be a positive decimal, while the offset must be a decimal in [0, interval) (a decimal greater than or equal to 0 and less than interval).

The following snippet "buckets" the products based on their price by interval of 50:

POST /sales/_search?size=0
{
  "aggs": {
    "prices": {
      "histogram": {
        "field": "price",
        "interval": 50
      }
    }
  }
}

The response may look like this:

{
  ...
  "aggregations": {
    "prices": {
      "buckets": [
        {
          "key": 0.0,
          "doc_count": 1
        },
        {
          "key": 50.0,
          "doc_count": 1
        },
        {
          "key": 100.0,
          "doc_count": 0
        },
        {
          "key": 150.0,
          "doc_count": 2
        },
        {
          "key": 200.0,
          "doc_count": 3
        }
      ]
    }
  }
}

Minimum document count

The response above shows that no documents have a price that falls within the range [100, 150). By default, the response fills gaps in the histogram with empty buckets. You can change that and request only buckets with a higher minimum count by using the min_doc_count setting:

POST /sales/_search?size=0
{
  "aggs": {
    "prices": {
      "histogram": {
        "field": "price",
        "interval": 50,
        "min_doc_count": 1
      }
    }
  }
}

Response:

{
  ...
  "aggregations": {
    "prices": {
      "buckets": [
        {
          "key": 0.0,
          "doc_count": 1
        },
        {
          "key": 50.0,
          "doc_count": 1
        },
        {
          "key": 150.0,
          "doc_count": 2
        },
        {
          "key": 200.0,
          "doc_count": 3
        }
      ]
    }
  }
}
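
The effect of min_doc_count on the first response can be mimicked on the client side. This is only a sketch of the semantics: Elasticsearch applies the setting on the server, and the function name apply_min_doc_count is made up for the example.

```python
# Buckets copied from the first example response on this page.
buckets = [
    {"key": 0.0, "doc_count": 1},
    {"key": 50.0, "doc_count": 1},
    {"key": 100.0, "doc_count": 0},
    {"key": 150.0, "doc_count": 2},
    {"key": 200.0, "doc_count": 3},
]

def apply_min_doc_count(buckets, min_doc_count):
    """Keep only buckets holding at least min_doc_count documents."""
    return [b for b in buckets if b["doc_count"] >= min_doc_count]

print([b["key"] for b in apply_min_doc_count(buckets, 1)])
# [0.0, 50.0, 150.0, 200.0]
```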

By default, the histogram returns all the buckets within the range of the data itself. That is, the documents with the smallest values (of the field the histogram runs on) determine the min bucket (the bucket with the smallest key), and the documents with the highest values determine the max bucket (the bucket with the highest key). When requesting empty buckets, this often causes confusion, especially when the data is also filtered.

To understand why, let’s look at an example:

Let’s say you’re filtering your request to get all docs with values between 0 and 500, and you’d also like to slice the data by price using a histogram with an interval of 50. You also specify "min_doc_count" : 0 because you’d like to get all buckets, even the empty ones. If all products (documents) happen to have prices higher than 100, the first bucket you’ll get will be the one with 100 as its key. This is confusing, because you will often also want the buckets between 0 and 100.

With the extended_bounds setting, you can "force" the histogram aggregation to start building buckets at a specific min value and to keep building buckets up to a max value (even if there are no more documents). Using extended_bounds only makes sense when min_doc_count is 0 (empty buckets are never returned if min_doc_count is greater than 0).

Note that (as the name suggests) extended_bounds does not filter buckets. That is, if extended_bounds.min is higher than the values extracted from the documents, the documents still dictate what the first bucket will be (and the same goes for extended_bounds.max and the last bucket). To filter buckets, nest the histogram aggregation under a range filter aggregation with the appropriate from/to settings.

Example:

POST /sales/_search?size=0
{
  "query": {
    "constant_score": { "filter": { "range": { "price": { "to": "500" } } } }
  },
  "aggs": {
    "prices": {
      "histogram": {
        "field": "price",
        "interval": 50,
        "extended_bounds": {
          "min": 0,
          "max": 500
        }
      }
    }
  }
}
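
The way extended_bounds widens, but never narrows, the set of returned buckets can be sketched in Python. The function and the sample prices below are hypothetical, and the sketch assumes min_doc_count is 0, no offset, and at least one value.

```python
import math

def bucket_keys(values, interval, extended_min=None, extended_max=None):
    """Keys of all buckets returned when min_doc_count is 0."""
    low, high = min(values), max(values)
    # extended_bounds can only extend the range given by the data, not shrink it.
    if extended_min is not None:
        low = min(low, extended_min)
    if extended_max is not None:
        high = max(high, extended_max)
    first = math.floor(low / interval)
    last = math.floor(high / interval)
    return [float(i * interval) for i in range(first, last + 1)]

prices = [120, 175, 210]  # hypothetical data: every price is above 100
print(bucket_keys(prices, 50))            # [100.0, 150.0, 200.0]
print(bucket_keys(prices, 50, 0, 500))    # 11 keys, from 0.0 to 500.0
print(bucket_keys(prices, 50, 150, 500))  # still starts at 100.0
```

The last call shows that an extended_bounds.min above the smallest value does not remove the bucket at 100.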

When aggregating ranges, buckets are based on the values of the returned documents. This means the response may include buckets outside of a query’s range. For example, if your query looks for values greater than 100, and you have a range covering 50 to 150, and an interval of 50, that document will land in 3 buckets - 50, 100, and 150. In general, it’s best to think of the query and aggregation steps as independent - the query selects a set of documents, and then the aggregation buckets those documents without regard to how they were selected. See note on bucketing range fields for more information and an example.

The hard_bounds setting is the counterpart of extended_bounds and limits the range of buckets in the histogram. It is particularly useful for open-ended data ranges, which can result in a very large number of buckets.

Example:

POST /sales/_search?size=0
{
  "query": {
    "constant_score": { "filter": { "range": { "price": { "to": "500" } } } }
  },
  "aggs": {
    "prices": {
      "histogram": {
        "field": "price",
        "interval": 50,
        "hard_bounds": {
          "min": 100,
          "max": 200
        }
      }
    }
  }
}

In this example, even though the range specified in the query goes up to 500, the histogram will only have 2 buckets, starting at 100 and 150. All other buckets are omitted, even if documents that belong in those buckets are present in the results.
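
A minimal Python sketch of this behaviour follows. It assumes the bounds are checked against bucket keys and that the upper bound is exclusive, which is consistent with the two-bucket result described above but is an assumption of this sketch rather than a documented guarantee. The sample prices are hypothetical.

```python
import math
from collections import Counter

def histogram_with_hard_bounds(values, interval, hard_min, hard_max):
    """Count values per bucket, dropping buckets whose key is outside the bounds."""
    counts = Counter()
    for value in values:
        key = math.floor(value / interval) * interval
        # Assumption: keep a bucket only if hard_min <= key < hard_max.
        if hard_min <= key < hard_max:
            counts[float(key)] += 1
    return dict(sorted(counts.items()))

prices = [80, 120, 130, 175, 210, 450]  # hypothetical data
print(histogram_with_hard_bounds(prices, 50, 100, 200))
# {100.0: 2, 150.0: 1}
```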

Order

By default, the returned buckets are sorted by their key in ascending order, though the order behaviour can be controlled using the order setting. It supports the same order functionality as the Terms Aggregation.

Offset

By default, the bucket keys start at 0 and continue in evenly spaced steps of interval. For example, if the interval is 10, the first three buckets (assuming there is data inside them) will be [0, 10), [10, 20), [20, 30). The bucket boundaries can be shifted by using the offset option.

This is best illustrated with an example. If there are 10 documents with values ranging from 5 to 14, using an interval of 10 results in two buckets with 5 documents each. If an offset of 5 is also used, there is a single bucket [5, 15) containing all 10 documents.
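
The example can be checked with a few lines of Python that apply the rounding function to each value. The histogram helper is written only for this illustration, and the ten documents are assumed to hold the integer values 5 through 14.

```python
import math
from collections import Counter

def histogram(values, interval, offset=0):
    """Count values per bucket key using the rounding function with an offset."""
    counts = Counter(
        math.floor((v - offset) / interval) * interval + offset for v in values
    )
    return dict(sorted(counts.items()))

values = range(5, 15)  # ten documents with values 5, 6, ..., 14

print(histogram(values, 10))     # {0: 5, 10: 5}
print(histogram(values, 10, 5))  # {5: 10}
```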

Response Format

By default, the buckets are returned as an ordered array. It is also possible to request the response as a hash keyed by the bucket keys instead:

POST /sales/_search?size=0
{
  "aggs": {
    "prices": {
      "histogram": {
        "field": "price",
        "interval": 50,
        "keyed": true
      }
    }
  }
}

Response:

{
  ...
  "aggregations": {
    "prices": {
      "buckets": {
        "0.0": {
          "key": 0.0,
          "doc_count": 1
        },
        "50.0": {
          "key": 50.0,
          "doc_count": 1
        },
        "100.0": {
          "key": 100.0,
          "doc_count": 0
        },
        "150.0": {
          "key": 150.0,
          "doc_count": 2
        },
        "200.0": {
          "key": 200.0,
          "doc_count": 3
        }
      }
    }
  }
}
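
The relationship between the two response shapes can be shown with a short Python sketch that converts the array form into the keyed form. The to_keyed helper is hypothetical, and it relies on Python's str() formatting of floats happening to match the keys shown above.

```python
def to_keyed(buckets):
    """Turn an ordered bucket array into a dict keyed by the bucket key as a string."""
    return {str(b["key"]): b for b in buckets}

array_buckets = [
    {"key": 0.0, "doc_count": 1},
    {"key": 50.0, "doc_count": 1},
    {"key": 100.0, "doc_count": 0},
]

keyed = to_keyed(array_buckets)
print(list(keyed))                  # ['0.0', '50.0', '100.0']
print(keyed["100.0"]["doc_count"])  # 0
```

The keyed form makes it easy to look up one bucket directly, while the array form makes the bucket order explicit.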

Missing value

The missing parameter defines how documents that are missing a value should be treated. By default, they are ignored, but it is also possible to treat them as if they had a value.

POST /sales/_search?size=0
{
  "aggs": {
    "quantity": {
      "histogram": {
        "field": "quantity",
        "interval": 10,
        "missing": 0
      }
    }
  }
}

Documents without a value in the quantity field will fall into the same bucket as documents that have the value 0.
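
A small Python sketch of this substitution is shown below, using hypothetical documents. It only illustrates the semantics; the helper name and the sample data are not part of Elasticsearch.

```python
import math
from collections import Counter

def histogram_with_missing(docs, field, interval, missing=None):
    """Bucket docs on field; docs without a value use missing, or are skipped."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            value = missing
        if value is None:
            continue  # default behaviour: ignore documents without a value
        counts[math.floor(value / interval) * interval] += 1
    return dict(sorted(counts.items()))

docs = [{"quantity": 3}, {"quantity": 12}, {}, {"quantity": 0}]  # hypothetical data

print(histogram_with_missing(docs, "quantity", 10))             # {0: 2, 10: 1}
print(histogram_with_missing(docs, "quantity", 10, missing=0))  # {0: 3, 10: 1}
```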

Histogram fields

Running a histogram aggregation over histogram fields computes the total number of counts for each interval.

For example, consider a histogram aggregation executed against the following index, which stores pre-aggregated histograms with latency metrics (in milliseconds) for different networks:

PUT metrics_index/_doc/1
{
  "network.name" : "net-1",
  "latency_histo" : {
      "values" : [1, 3, 8, 12, 15],
      "counts" : [3, 7, 23, 12, 6]
   }
}

PUT metrics_index/_doc/2
{
  "network.name" : "net-2",
  "latency_histo" : {
      "values" : [1, 6, 8, 12, 14],
      "counts" : [8, 17, 8, 7, 6]
   }
}

POST /metrics_index/_search?size=0
{
  "aggs": {
    "latency_buckets": {
      "histogram": {
        "field": "latency_histo",
        "interval": 5
      }
    }
  }
}

The histogram aggregation sums the counts of each interval, computed based on the values, and returns the following output:

{
  ...
  "aggregations": {
    "latency_buckets": {
      "buckets": [
        {
          "key": 0.0,
          "doc_count": 18
        },
        {
          "key": 5.0,
          "doc_count": 48
        },
        {
          "key": 10.0,
          "doc_count": 25
        },
        {
          "key": 15.0,
          "doc_count": 6
        }
      ]
    }
  }
}
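
The doc counts in this output can be reproduced by hand. The Python sketch below takes the values and counts from the two documents indexed above, assigns each value to its interval of 5, and sums the corresponding counts. The function name is made up for the illustration.

```python
import math
from collections import Counter

# latency_histo contents of the two documents indexed above
docs = [
    {"values": [1, 3, 8, 12, 15], "counts": [3, 7, 23, 12, 6]},
    {"values": [1, 6, 8, 12, 14], "counts": [8, 17, 8, 7, 6]},
]

def aggregate_histogram_fields(docs, interval):
    """Sum the pre-aggregated counts that fall into each interval."""
    totals = Counter()
    for doc in docs:
        for value, count in zip(doc["values"], doc["counts"]):
            key = float(math.floor(value / interval) * interval)
            totals[key] += count
    return dict(sorted(totals.items()))

print(aggregate_histogram_fields(docs, 5))
# {0.0: 18, 5.0: 48, 10.0: 25, 15.0: 6}
```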

Histogram aggregation is a bucket aggregation, which partitions documents into buckets rather than calculating metrics over fields as metrics aggregations do. Each bucket represents a collection of documents on which sub-aggregations can run. A histogram field, on the other hand, is a pre-aggregated field representing multiple values inside a single field: buckets of numerical data and a count of items/documents for each bucket. This mismatch between the histogram aggregation’s expected input (raw documents) and the histogram field (which provides summary information) limits the outcome of the aggregation to only the doc counts for each bucket.

Consequently, when executing a histogram aggregation over a histogram field, no sub-aggregations are allowed.

Also, when running a histogram aggregation over a histogram field, the missing parameter is not supported.
