WARNING: Version 5.1 of Elasticsearch has passed its EOL date.

This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.

Field stats API

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

The field stats API returns statistical properties of a field without executing a search, by looking up measurements that are natively available in the Lucene index. This is useful for exploring a dataset that you don’t know much about. For example, it lets you create a histogram aggregation with meaningful intervals based on the min/max range of values.

The field stats API by default executes on all indices, but can execute on specific indices too.
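
As an illustration of the histogram use case, the sketch below derives an interval from a field's `min_value` and `max_value` and builds a search body around it. The helper names and the target of ten buckets are assumptions made for this example and are not part of the field stats API; the `rating` values are taken from the example response on this page.

```python
import math


def histogram_interval(min_value, max_value, target_buckets=10):
    """Pick an integer interval so min..max spans roughly target_buckets buckets."""
    span = max_value - min_value
    # Never return 0, which would not be a usable histogram interval.
    return max(1, math.ceil(span / target_buckets))


def histogram_request(field, min_value, max_value, target_buckets=10):
    """Build a search body with a histogram aggregation sized from field stats."""
    interval = histogram_interval(min_value, max_value, target_buckets)
    return {
        "size": 0,  # ask for aggregation results only, no hits
        "aggs": {
            field + "_histogram": {  # hypothetical aggregation name
                "histogram": {"field": field, "interval": interval}
            }
        },
    }


# min_value / max_value of `rating` as reported in the example response
body = histogram_request("rating", -14, 1277)
```

The resulting dictionary can be serialized to JSON and sent as the body of a search request.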

All indices:

curl -XGET "http://localhost:9200/_field_stats?fields=rating"

Specific indices:

curl -XGET "http://localhost:9200/index1,index2/_field_stats?fields=rating"

Supported request options:

`fields`	A list of fields to compute stats for. The field name supports wildcard notation. For example, using `text_*` will cause all fields that match the expression to be returned.
`level`	Defines if field stats should be returned on a per index level or on a cluster wide level. Valid values are `indices` and `cluster` (default).

Alternatively, the `fields` option can be defined in the request body:

curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
   "fields" : ["rating"]
}'

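This requests stats for the same field as the previous requests, with `fields` supplied in the body instead of the query string. Because `level=indices` is set, the stats are returned per index.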
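
The request options above can also be assembled programmatically. The following is a minimal sketch that only builds the URL and sends nothing; `field_stats_url` is a hypothetical helper, and the base URL is the one used in the curl examples.

```python
from urllib.parse import urlencode


def field_stats_url(base_url, fields, indices=None, level=None):
    """Build a _field_stats URL for all indices or for specific ones."""
    path = "/_field_stats"
    if indices:
        # Specific indices go in the path as a comma-separated list.
        path = "/" + ",".join(indices) + path
    params = {"fields": ",".join(fields)}
    if level is not None:
        params["level"] = level  # "indices" or "cluster"
    # Keep commas and wildcards readable instead of percent-encoding them.
    return base_url.rstrip("/") + path + "?" + urlencode(params, safe=",*")


all_indices = field_stats_url("http://localhost:9200", ["rating"])
two_indices = field_stats_url("http://localhost:9200", ["rating"], ["index1", "index2"])
```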

Field statistics

The field stats API is supported on string-based, number-based, and date-based fields and can return the following statistics per field:

`max_doc`	The total number of documents.
`doc_count`	The number of documents that have at least one term for this field, or -1 if this measurement isn’t available on one or more shards.
`density`	The percentage of documents that have at least one value for this field. This is a derived statistic and is based on the `max_doc` and `doc_count`.
`sum_doc_freq`	The sum of each term’s document frequency in this field, or -1 if this measurement isn’t available on one or more shards. Document frequency is the number of documents containing a particular term.
`sum_total_term_freq`	The sum of the term frequencies of all terms in this field across all documents, or -1 if this measurement isn’t available on one or more shards. Term frequency is the total number of occurrences of a term in a particular document and field.

`is_searchable`	True if any of the instances of the field is searchable, false otherwise.
`is_aggregatable`	True if any of the instances of the field is aggregatable, false otherwise.
`min_value`	The lowest value in the field.
`min_value_as_string`	The lowest value in the field represented in a displayable form. Returned for all field types except string fields, which already represent their values as strings.
`max_value`	The highest value in the field.
`max_value_as_string`	The highest value in the field represented in a displayable form. Returned for all field types except string fields, which already represent their values as strings.

Documents marked as deleted (but not yet removed by the merge process) still affect all the mentioned statistics.
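
Since `density` is derived from `max_doc` and `doc_count`, it can be recomputed on the client. The sketch below assumes the percentage is truncated to an integer, which is consistent with the example responses on this page but is not stated in the text; the `-1` return for unavailable inputs is a convention chosen for this sketch only.

```python
def density(doc_count, max_doc):
    """Percentage of documents with at least one value for the field."""
    if doc_count < 0 or max_doc <= 0:
        # Assumption for this sketch: propagate "not available" as -1.
        return -1
    # Integer division truncates, e.g. 42.56 becomes 42.
    return doc_count * 100 // max_doc


# doc_count / max_doc pairs copied from the example response on this page
creation_date_density = density(564633, 1326564)
display_name_density = density(126741, 1326564)
```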

Cluster level field statistics example

Request:

curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creation_date,display_name"

Response:

{
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "indices": {
      "_all": { 
         "fields": {
            "creation_date": {
               "max_doc": 1326564,
               "doc_count": 564633,
               "density": 42,
               "sum_doc_freq": 2258532,
               "sum_total_term_freq": -1,
               "min_value": "2008-08-01T16:37:51.513Z",
               "max_value": "2013-06-02T03:23:11.593Z",
               "is_searchable": "true",
               "is_aggregatable": "true"
            },
            "display_name": {
               "max_doc": 1326564,
               "doc_count": 126741,
               "density": 9,
               "sum_doc_freq": 166535,
               "sum_total_term_freq": 166616,
               "min_value": "0",
               "max_value": "정혜선",
               "is_searchable": "true",
               "is_aggregatable": "false"
            },
            "answer_count": {
               "max_doc": 1326564,
               "doc_count": 139885,
               "density": 10,
               "sum_doc_freq": 559540,
               "sum_total_term_freq": -1,
               "min_value": 0,
               "max_value": 160,
               "is_searchable": "true",
               "is_aggregatable": "true"
            },
            "rating": {
               "max_doc": 1326564,
               "doc_count": 437892,
               "density": 33,
               "sum_doc_freq": 1751568,
               "sum_total_term_freq": -1,
               "min_value": -14,
               "max_value": 1277,
               "is_searchable": "true",
               "is_aggregatable": "true"
            }
         }
      }
   }
}

The _all key indicates that it contains the field stats of all indices in the cluster.

When using cluster-level field statistics, conflicts can occur if the same field is used in different indices with incompatible types. For instance, a field of type long is not compatible with a field of type float or string. A section named `conflicts` is added to the response if one or more conflicts are raised. It contains all the fields with conflicts and the reason for the incompatibility.

{
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "indices": {
      "_all": {
         "fields": {
            "creation_date": {
               "max_doc": 1326564,
               "doc_count": 564633,
               "density": 42,
               "sum_doc_freq": 2258532,
               "sum_total_term_freq": -1,
               "min_value": "2008-08-01T16:37:51.513Z",
               "max_value": "2013-06-02T03:23:11.593Z",
               "is_searchable": "true",
               "is_aggregatable": "true"
            }
         }
      }
   },
   "conflicts": {
        "field_name_in_conflict1": "reason1",
        "field_name_in_conflict2": "reason2"
   }
}
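
A client reading a cluster-level response may want to separate usable stats from conflicting fields. The following is a minimal sketch over an already-parsed response; `split_field_stats` is a hypothetical helper, and the sample mirrors the shape of the response above with trimmed values.

```python
def split_field_stats(response, index="_all"):
    """Return (fields, conflicts) from a parsed field stats response."""
    fields = response.get("indices", {}).get(index, {}).get("fields", {})
    # The conflicts section is only present when at least one conflict was raised.
    conflicts = response.get("conflicts", {})
    return fields, conflicts


sample = {
    "indices": {"_all": {"fields": {"creation_date": {"max_doc": 1326564}}}},
    "conflicts": {"field_name_in_conflict1": "reason1"},
}
fields, conflicts = split_field_stats(sample)
for name, reason in conflicts.items():
    print("skipping " + name + ": " + reason)
```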

Indices level field statistics example

Request:

curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creation_date,display_name&level=indices"

Response:

{
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "indices": {
      "stack": { 
         "fields": {
            "creation_date": {
               "max_doc": 1326564,
               "doc_count": 564633,
               "density": 42,
               "sum_doc_freq": 2258532,
               "sum_total_term_freq": -1,
               "min_value": "2008-08-01T16:37:51.513Z",
               "max_value": "2013-06-02T03:23:11.593Z",
               "is_searchable": "true",
               "is_aggregatable": "true"
            },
            "display_name": {
               "max_doc": 1326564,
               "doc_count": 126741,
               "density": 9,
               "sum_doc_freq": 166535,
               "sum_total_term_freq": 166616,
               "min_value": "0",
               "max_value": "정혜선",
               "is_searchable": "true",
               "is_aggregatable": "false"
            },
            "answer_count": {
               "max_doc": 1326564,
               "doc_count": 139885,
               "density": 10,
               "sum_doc_freq": 559540,
               "sum_total_term_freq": -1,
               "min_value": 0,
               "max_value": 160,
               "is_searchable": "true",
               "is_aggregatable": "true"
            },
            "rating": {
               "max_doc": 1326564,
               "doc_count": 437892,
               "density": 33,
               "sum_doc_freq": 1751568,
               "sum_total_term_freq": -1,
               "min_value": -14,
               "max_value": 1277,
               "is_searchable": "true",
               "is_aggregatable": "true"
            }
         }
      }
   }
}

The stack key means it contains all field stats for the stack index.
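
With `level=indices`, each index gets its own entry, so a client can compare value ranges across indices. The sketch below collects the `min_value` and `max_value` of one field per index from a parsed response; `value_range_by_index` is a hypothetical helper, and the sample is a trimmed copy of the response above.

```python
def value_range_by_index(response, field):
    """Map each index name to the (min_value, max_value) of the given field."""
    ranges = {}
    for index_name, index_stats in response.get("indices", {}).items():
        stats = index_stats.get("fields", {}).get(field)
        if stats is not None:  # the field may be missing from some indices
            ranges[index_name] = (stats["min_value"], stats["max_value"])
    return ranges


sample = {
    "indices": {
        "stack": {
            "fields": {
                "rating": {"min_value": -14, "max_value": 1277},
                "answer_count": {"min_value": 0, "max_value": 160},
            }
        }
    }
}
rating_ranges = value_range_by_index(sample, "rating")
```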

Field stats index constraints

Field stats index constraints let you omit all field stats for indices that don’t match the constraint. An index constraint can exclude an index’s field stats based on the `min_value` and `max_value` statistics. This option is only useful if the `level` option is set to `indices`.

For example, index constraints can be useful to find out the min and max value of a particular property of your data in a time-based scenario. The following request only returns field stats for the `answer_count` property for indices holding questions created in the year 2014:

curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
   "fields" : ["answer_count"],
   "index_constraints" : {
      "creation_date" : {
         "max_value" : {
            "gte" : "2014-01-01T00:00:00.000Z"
         },
         "min_value" : {
            "lt" : "2015-01-01T00:00:00.000Z"
         }
      }
   }
}'

`fields`	The fields to compute and return field stats for.
`index_constraints`	The set of index constraints. Note that index constraints can be defined for fields that aren’t listed in the `fields` option.
`creation_date`	Index constraints for the field `creation_date`.
`max_value`, `min_value`	Index constraints on the `max_value` and `min_value` properties of a field statistic.

For a field, index constraints can be defined on the `min_value` statistic, the `max_value` statistic, or both. Each index constraint supports the following comparisons:

`gte`	Greater-than or equal to
`gt`	Greater-than
`lte`	Less-than or equal to
`lt`	Less-than
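
To make the comparison semantics concrete, the sketch below evaluates a constraint against one index's field statistics on the client side. It illustrates how the constraint reads and is not the server's implementation. It assumes the statistics and constraint values are directly comparable, such as numbers or ISO 8601 timestamps in the same format, and the second set of statistics is invented for the example.

```python
import operator

COMPARISONS = {
    "gte": operator.ge,
    "gt": operator.gt,
    "lte": operator.le,
    "lt": operator.lt,
}


def index_matches(field_stats, constraints):
    """True if every comparison on min_value / max_value holds for this index."""
    for stat_name, comparisons in constraints.items():
        actual = field_stats[stat_name]
        for name, expected in comparisons.items():
            if name == "format":
                continue  # a parsing hint, not a comparison
            if not COMPARISONS[name](actual, expected):
                return False
    return True


year_2014 = {
    "max_value": {"gte": "2014-01-01T00:00:00.000Z"},
    "min_value": {"lt": "2015-01-01T00:00:00.000Z"},
}
# creation_date statistics from the example response on this page
stack_stats = {"min_value": "2008-08-01T16:37:51.513Z", "max_value": "2013-06-02T03:23:11.593Z"}
# hypothetical index that holds documents created during 2014
other_stats = {"min_value": "2014-03-01T00:00:00.000Z", "max_value": "2014-11-30T00:00:00.000Z"}
```

Under these assumptions the `stack` statistics do not satisfy the constraint, because the highest `creation_date` is earlier than 2014, while the hypothetical index does.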

Field stats index constraints on date fields optionally accept a format option, used to parse the constraint’s value. If missing, the format configured in the field’s mapping is used.

curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
   "fields" : ["answer_count"],
   "index_constraints" : {
      "creation_date" : {
         "max_value" : {
            "gte" : "2014-01-01",
            "format" : "date_optional_time"
         },
         "min_value" : {
            "lt" : "2015-01-01",
            "format" : "date_optional_time"
         }
      }
   }
}'

The `format` option here specifies a custom date format.
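
Building the request body in code avoids hand-written JSON mistakes such as a missing comma. The following is a minimal sketch that produces the body for a date range constraint; `index_constraints_body` and its parameters are hypothetical names, and the values are the ones used in the request above.

```python
import json


def index_constraints_body(fields, constraint_field, start, end, date_format=None):
    """Body selecting indices whose constraint_field range overlaps [start, end)."""

    def bound(comparison, value):
        clause = {comparison: value}
        if date_format is not None:
            clause["format"] = date_format  # optional, used to parse the value
        return clause

    return {
        "fields": list(fields),
        "index_constraints": {
            constraint_field: {
                "max_value": bound("gte", start),
                "min_value": bound("lt", end),
            }
        },
    }


body = index_constraints_body(
    ["answer_count"], "creation_date", "2014-01-01", "2015-01-01", "date_optional_time"
)
payload = json.dumps(body)  # string to send as the request body
```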
