doc_values

Most fields are indexed by default, which makes them searchable. The inverted index allows queries to look up the search term in a unique sorted list of terms, and from that immediately have access to the list of documents that contain the term.

Sorting, aggregations, and access to field values in scripts require a different data access pattern. Instead of looking up the term and finding documents, we need to be able to look up the document and find the terms that it has in a field.

Doc values are the on-disk data structure, built at document index time, that makes this data access pattern possible. They store the same values as the _source, but in a column-oriented fashion that is much more efficient for sorting and aggregations. Doc values are supported on almost all field types, with the notable exception of text and annotated_text fields.
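The two access patterns can be illustrated with a minimal Python sketch. It is a toy in-memory model, not how Elasticsearch or Lucene store data on disk; the documents, the `status` values, and all function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical toy documents: doc ID -> value of a single field.
docs = {0: "ok", 1: "error", 2: "ok", 3: "timeout"}

# Inverted index: term -> list of doc IDs that contain the term.
inverted = defaultdict(list)
for doc_id in sorted(docs):
    inverted[docs[doc_id]].append(doc_id)

# Column-oriented store (the doc-values access pattern):
# the list position is the doc ID, the element is that document's value.
column = [docs[doc_id] for doc_id in sorted(docs)]


def search(term):
    """Term -> documents: one lookup in the inverted index."""
    return inverted.get(term, [])


def sort_by_field(doc_ids):
    """Document -> value: read each document's value from the column."""
    return sorted(doc_ids, key=lambda d: column[d])


def count_values(doc_ids):
    """A terms-style aggregation: count values by reading the column."""
    counts = defaultdict(int)
    for d in doc_ids:
        counts[column[d]] += 1
    return dict(counts)
```

Searching goes from a term to documents (`search("ok")`), while sorting and aggregating go from documents to values (`sort_by_field`, `count_values`). Answering the second kind of question from the inverted index alone would mean walking every term to find out which one each document contains.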

Doc-value-only fields

Numeric types, date types, the boolean type, the ip type, the geo_point type, and the keyword type can also be queried when they are not indexed but only have doc values enabled. Queries on doc values are much slower than queries on index structures, but for fields that are rarely queried and where query performance matters less, the reduced disk usage can be a worthwhile tradeoff. This makes doc-value-only fields a good fit for fields that are not normally used for filtering, for example gauges or counters on metric data.

Doc-value-only fields can be configured as follows:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": { 
        "type":  "long"
      },
      "session_id": { 
        "type":  "long",
        "index": false
      }
    }
  }
}

The status_code field is a regular long field.

The session_id field has index disabled. Because doc values are enabled by default, it is a doc-value-only long field.
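To give a rough intuition for why queries on a doc-value-only field are slower, the following Python sketch compares a range query answered from a sorted structure with one answered by scanning a column of values. It is a simplified model under stated assumptions: the `session_ids` data and both function names are hypothetical, and the real index structures are more sophisticated than a sorted list.

```python
import bisect

# Hypothetical session_id values; the list position is the doc ID.
session_ids = [42, 7, 19, 42, 3]

# Stand-in for an index structure: (value, doc ID) pairs kept sorted by value,
# so the matching range can be located with binary search.
sorted_pairs = sorted((value, doc_id) for doc_id, value in enumerate(session_ids))
sorted_values = [value for value, _ in sorted_pairs]


def range_query_indexed(low, high):
    """Find matching docs by binary search over the sorted values."""
    start = bisect.bisect_left(sorted_values, low)
    end = bisect.bisect_right(sorted_values, high)
    return sorted(doc_id for _, doc_id in sorted_pairs[start:end])


def range_query_doc_values(low, high):
    """Find matching docs by checking every document's value in the column."""
    return [doc_id for doc_id, value in enumerate(session_ids) if low <= value <= high]
```

Both functions return the same documents, but the scan touches every value, whereas the sorted structure only touches the matching range. The scan needs no extra structure on disk, which is the tradeoff described above.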

Disabling doc values

All fields that support doc values have them enabled by default. If you are sure that you don’t need to sort or aggregate on a field, or access the field value from a script, you can disable doc values to save disk space:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": { 
        "type":       "keyword"
      },
      "session_id": { 
        "type":       "keyword",
        "doc_values": false
      }
    }
  }
}

The status_code field has doc_values enabled by default.

The session_id field has doc_values disabled, but can still be queried.

You cannot disable doc values for wildcard fields.