Indices, documents, and fields

The index is the fundamental unit of storage in Elasticsearch: a logical namespace for storing data that shares similar characteristics. Once Elasticsearch is deployed, you get started by creating an index to store your data.

A closely related concept is a data stream. This index abstraction is optimized for append-only time-series data, and is made up of hidden, auto-generated backing indices. If you’re working with time-series data, we recommend the Elastic Observability solution.

Some key facts about indices:

  • An index is a collection of documents
  • An index has a unique name
  • An index can also be referred to by an alias
  • An index has a mapping that defines the schema of its documents

Documents and fields

Elasticsearch serializes and stores data in the form of JSON documents. A document is a set of fields, which are key-value pairs that contain your data. Each document has a unique ID, which you can create or have Elasticsearch auto-generate.
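The two ways of assigning an ID can be sketched in Python. This is a minimal illustration, not a client implementation: the helper name index_request and the ID user-1 are hypothetical, and the sketch only builds the HTTP method, path, and JSON body without sending anything. It assumes the document index API, where PUT /<index>/_doc/<id> supplies your own ID and POST /<index>/_doc asks Elasticsearch to generate one.

```python
import json
from urllib.parse import quote


def index_request(index, document, doc_id=None):
    """Build (method, path, body) for indexing one document.

    Hypothetical helper for illustration only; it does not contact a cluster.
    With doc_id, the request targets PUT /<index>/_doc/<id>.
    Without it, POST /<index>/_doc lets Elasticsearch generate the ID.
    """
    body = json.dumps(document)  # documents are serialized as JSON
    if doc_id is None:
        return "POST", f"/{index}/_doc", body
    # Percent-encode the ID so characters such as "/" stay inside one path segment.
    return "PUT", f"/{index}/_doc/{quote(str(doc_id), safe='')}", body


doc = {"email": "john@smith.com", "first_name": "John", "last_name": "Smith"}

# Auto-generated ID
print(index_request("my-first-elasticsearch-index", doc)[:2])
# Caller-supplied ID ("user-1" is a made-up example value)
print(index_request("my-first-elasticsearch-index", doc, doc_id="user-1")[:2])
```

In practice you would hand the resulting method, path, and body to an HTTP client or use an official Elasticsearch client library instead.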

A simple Elasticsearch document, shown here as it is returned when you retrieve it by ID, might look like this:

{
  "_index": "my-first-elasticsearch-index",
  "_id": "DyFpo5EBxE8fzbb95DOa",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "email": "john@smith.com",
    "first_name": "John",
    "last_name": "Smith",
    "info": {
      "bio": "Eco-warrior and defender of the weak",
      "age": 25,
      "interests": [
        "dolphins",
        "whales"
      ]
    },
    "join_date": "2024/05/01"
  }
}

Data and metadata

An indexed document contains data and metadata. In Elasticsearch, metadata fields are prefixed with an underscore.

The most important metadata fields are:

  • _source: Contains the original JSON document.
  • _index: The name of the index where the document is stored.
  • _id: The document’s ID. IDs must be unique per index.
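As a rough sketch of the data and metadata split, the Python snippet below separates the underscore-prefixed fields from the contents of _source. It works on an abbreviated copy of the sample document above; the helper name split_document is hypothetical and not part of any Elasticsearch client.

```python
import json

# Abbreviated copy of the sample document shown earlier.
response_text = """
{
  "_index": "my-first-elasticsearch-index",
  "_id": "DyFpo5EBxE8fzbb95DOa",
  "_version": 1,
  "found": true,
  "_source": {
    "email": "john@smith.com",
    "first_name": "John",
    "info": {"age": 25, "interests": ["dolphins", "whales"]}
  }
}
"""


def split_document(doc):
    """Return (metadata, data) for a document dict.

    Metadata is every top-level key that starts with an underscore,
    except _source, whose value is returned as the data.
    Keys without the prefix, such as "found", are left out of both.
    """
    metadata = {
        key: value
        for key, value in doc.items()
        if key.startswith("_") and key != "_source"
    }
    data = doc.get("_source", {})
    return metadata, data


doc = json.loads(response_text)
metadata, data = split_document(doc)

print(sorted(metadata))           # names of the metadata fields
print(data["info"]["interests"])  # a nested value from the original JSON
```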

Mappings and data types

Each index has a mapping, or schema, that describes how the fields in your documents are indexed. A mapping defines the data type of each field, how the field should be indexed, and how it should be stored. When adding documents to Elasticsearch, you have two options for mappings:

  • Dynamic mapping: Let Elasticsearch automatically detect the data types and create the mappings for you. This is great for getting started quickly, but can lead to unexpected results for complex data.
  • Explicit mapping: Define the mappings up front by specifying the data type of each field. This is recommended for production use cases, because you have much more control over how your data is indexed.

You can use a combination of dynamic and explicit mapping on the same index. This is useful when you have a mix of known and unknown fields in your data.
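A combined mapping might look like the following Python sketch, which builds the body of a create-index request without sending it. The field types chosen here (keyword, text, date with the yyyy/MM/dd format) are assumptions made for the sample document, not recommendations from this page, and the helper dynamically_mapped_fields is hypothetical. Leaving "dynamic" set to true means that fields missing from "properties" are still mapped automatically.

```python
import json

# Body for a create-index request (PUT /<index>); nothing is sent here.
index_body = {
    "mappings": {
        # Fields not listed under "properties" are mapped dynamically.
        "dynamic": True,
        # Explicit mappings for the fields known up front (assumed types).
        "properties": {
            "email": {"type": "keyword"},
            "first_name": {"type": "text"},
            "last_name": {"type": "text"},
            "join_date": {"type": "date", "format": "yyyy/MM/dd"},
        },
    }
}

sample_document = {
    "email": "john@smith.com",
    "first_name": "John",
    "last_name": "Smith",
    "info": {"bio": "Eco-warrior and defender of the weak", "age": 25},
    "join_date": "2024/05/01",
}


def dynamically_mapped_fields(body, document):
    """List top-level document fields that have no explicit mapping.

    Hypothetical helper: it only compares key names and does not
    reproduce Elasticsearch's type detection.
    """
    explicit = body["mappings"]["properties"]
    return sorted(name for name in document if name not in explicit)


print(json.dumps(index_body, indent=2))
print(dynamically_mapped_fields(index_body, sample_document))
```

Here only info falls outside the explicit mapping, so it is the one field Elasticsearch would have to map on its own when the document is indexed.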