Mapping

Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. For instance, use mappings to define:

  • which string fields should be treated as full text fields.
  • which fields contain numbers, dates, or geolocations.
  • whether the values of all fields in the document should be indexed into the catch-all _all field.
  • the format of date values.
  • custom rules to control the mapping for dynamically added fields.

Mapping Types

Each index has one or more mapping types, which are used to divide the documents in an index into logical groups. User documents might be stored in a user type, and blog posts in a blogpost type.

Each mapping type has:

Meta-fields
Meta-fields are used to customize how the metadata associated with a document is treated. Examples of meta-fields include the document’s _index, _type, _id, and _source fields.
Fields or properties
Each mapping type contains a list of fields or properties pertinent to that type. A user type might contain title, name, and age fields, while a blogpost type might contain title, body, user_id and created fields. Fields with the same name in different mapping types in the same index must have the same mapping.

Field datatypes

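Each field has a data type, for example a simple type such as text, keyword, integer, or date, or a type that reflects the hierarchical nature of JSON, such as object or nested.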

It is often useful to index the same field in different ways for different purposes. For instance, a string field could be indexed as a text field for full-text search, and as a keyword field for sorting or aggregations. Alternatively, you could index a string field with the standard analyzer, the english analyzer, and the french analyzer.

This is the purpose of multi-fields. Most datatypes support multi-fields via the fields parameter.
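As a rough illustration of the fields parameter, the sketch below builds a mapping body as a Python dictionary that mirrors the JSON you would send. The city field and its raw and english sub-field names are hypothetical, chosen only for this example, and sub_field_paths is a small helper written here, not part of any Elasticsearch client.

```python
import json

# Hypothetical "city" field: indexed as full text, with two extra
# sub-fields declared through the "fields" parameter.
mapping = {
    "properties": {
        "city": {
            "type": "text",  # main field: full-text search
            "fields": {
                # city.raw: exact value, usable for sorting and aggregations
                "raw": {"type": "keyword"},
                # city.english: same text, analyzed with the english analyzer
                "english": {"type": "text", "analyzer": "english"},
            },
        }
    }
}


def sub_field_paths(properties):
    """Return the full paths of all multi-fields declared via "fields"."""
    paths = []
    for name, field in properties.items():
        for sub_name in field.get("fields", {}):
            paths.append(f"{name}.{sub_name}")
    return sorted(paths)


print(json.dumps(mapping, indent=2))
print(sub_field_paths(mapping["properties"]))
```

Queries would then address the sub-fields by their full path, for example city.raw for sorting and city.english for English-language search.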

Settings to prevent mappings explosion

The following settings allow you to limit the number of field mappings that can be created manually or dynamically, in order to prevent bad documents from causing a mapping explosion:

index.mapping.total_fields.limit
The maximum number of fields in an index. The default value is 1000.
index.mapping.depth.limit
The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc. The default is 20.
index.mapping.nested_fields.limit
The maximum number of nested fields in an index. The default is 50. Indexing one document that contains 100 nested objects actually indexes 101 documents, because each nested object is indexed as a separate hidden document.
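The sketch below shows one way these limits might be set explicitly in an index settings body, using the default values quoted above. It also includes a small helper, mapping_depth, that follows the depth rule as described here (root-level fields count as depth 1, and each level of inner object mapping adds 1). The helper is an illustration written for this page and is not how Elasticsearch itself computes the value.

```python
# Settings body that spells out the three limits
# (the values shown are the defaults documented above).
settings_body = {
    "settings": {
        "index.mapping.total_fields.limit": 1000,
        "index.mapping.depth.limit": 20,
        "index.mapping.nested_fields.limit": 50,
    }
}


def mapping_depth(properties):
    """Depth of a "properties" block: 1 for root-level fields,
    plus 1 for every level of inner object mapping below it."""
    deepest_child = 0
    for field in properties.values():
        inner = field.get("properties")
        if inner:
            deepest_child = max(deepest_child, mapping_depth(inner))
    return 1 + deepest_child


flat = {"title": {"type": "text"}, "age": {"type": "integer"}}
one_object = {"name": {"properties": {"first": {"type": "text"}}}}

print(mapping_depth(flat))
print(mapping_depth(one_object))
```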

Dynamic mapping

Fields and mapping types do not need to be defined before being used. Thanks to dynamic mapping, new mapping types and new field names will be added automatically, just by indexing a document. New fields can be added both to the top-level mapping type, and to inner object and nested fields.

The dynamic mapping rules can be configured to customise the mapping that is used for new types and new fields.
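One common way to customise these rules is a dynamic template. The sketch below, written as a Python dictionary that mirrors the JSON request body, maps every newly detected string field in the user type to keyword. The template name strings_as_keywords is a hypothetical label chosen for this example.

```python
import json

# Body for an index-creation request. Any new string field added
# dynamically to the "user" type is mapped as a keyword field.
create_index_body = {
    "mappings": {
        "user": {
            "dynamic_templates": [
                {
                    "strings_as_keywords": {  # hypothetical template name
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword"},
                    }
                }
            ]
        }
    }
}

print(json.dumps(create_index_body, indent=2))
```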

Explicit mappings

You know more about your data than Elasticsearch can guess, so while dynamic mapping can be useful to get started, at some point you will want to specify your own explicit mappings.

You can create mapping types and field mappings when you create an index, and you can add mapping types and fields to an existing index with the PUT mapping API.
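As a minimal sketch of the second case, the helper below assembles a PUT mapping request that adds a field to an existing mapping type. It only builds the method, path, and body; sending the request is left to whichever HTTP client you use. The put_mapping_request helper and the email field are hypothetical, and the path assumes the typed form of the PUT mapping API (index/_mapping/type) that matches the mapping types described on this page.

```python
import json


def put_mapping_request(index, mapping_type, properties):
    """Build (method, path, body) for adding fields to an existing type."""
    path = f"/{index}/_mapping/{mapping_type}"
    body = {"properties": properties}
    return "PUT", path, body


# Add a hypothetical "email" field to the existing "user" type.
method, path, body = put_mapping_request(
    "my_index", "user", {"email": {"type": "keyword"}}
)

print(method, path)
print(json.dumps(body, indent=2))
```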

Updating existing mappings

Other than where documented, existing type and field mappings cannot be updated. Changing the mapping would mean invalidating already indexed documents. Instead, you should create a new index with the correct mappings and reindex your data into that index.
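The create-and-reindex workflow might look like the sketch below, which prepares the two request bodies as Python dictionaries. The target index name my_index_v2 and the changed field are assumptions made for this example; the copy step uses the source and dest form of the reindex API.

```python
import json

# Step 1: body for creating the new index with the corrected mapping.
# Here the hypothetical change is mapping "title" as keyword instead of text.
new_index_body = {
    "mappings": {
        "user": {
            "properties": {
                "title": {"type": "keyword"},
            }
        }
    }
}


def reindex_body(source_index, dest_index):
    """Body for POST _reindex, copying documents from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


# Step 2: copy the existing documents into the new index.
copy_body = reindex_body("my_index", "my_index_v2")

print("PUT /my_index_v2", json.dumps(new_index_body))
print("POST /_reindex", json.dumps(copy_body))
```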

Fields are shared across mapping types

Mapping types are used to group fields, but the fields in each mapping type are not independent of each other. Fields with the same name, in the same index, but in different mapping types map to the same field internally and must therefore have the same mapping.

If a title field exists in both the user and blogpost mapping types, the title fields must have exactly the same mapping in each type. The only exceptions to this rule are the copy_to, dynamic, enabled, ignore_above, include_in_all, and properties parameters, which may have different settings per field.

Usually, fields with the same name also contain the same type of data, so having the same mapping is not a problem. When conflicts do arise, these can be solved by choosing more descriptive names, such as user_title and blog_title.
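To make the rule concrete, the sketch below checks a mappings dictionary for same-named fields whose definitions differ across types, ignoring the parameters listed above as exceptions. It is a simplified client-side check written for this page: it compares only top-level fields, compares the mappings exactly as written (so it does not account for default values), and is not the validation Elasticsearch itself performs.

```python
# Parameters that may differ between same-named fields in different types.
EXEMPT = {"copy_to", "dynamic", "enabled", "ignore_above",
          "include_in_all", "properties"}


def comparable(field_mapping):
    """Drop the exempt parameters so only the shared part is compared."""
    return {k: v for k, v in field_mapping.items() if k not in EXEMPT}


def find_conflicts(mappings):
    """Return sorted names of top-level fields that differ across types."""
    seen = {}          # field name -> comparable mapping from the first type
    conflicts = set()
    for type_mapping in mappings.values():
        for name, field in type_mapping.get("properties", {}).items():
            core = comparable(field)
            if name in seen and seen[name] != core:
                conflicts.add(name)
            seen.setdefault(name, core)
    return sorted(conflicts)


conflicting = {
    "user":     {"properties": {"title": {"type": "text"}}},
    "blogpost": {"properties": {"title": {"type": "keyword"}}},
}
renamed = {
    "user":     {"properties": {"user_title": {"type": "text"}}},
    "blogpost": {"properties": {"blog_title": {"type": "keyword"}}},
}

print(find_conflicts(conflicting))
print(find_conflicts(renamed))
```

The second call illustrates the renaming approach: once the fields have distinct names, they no longer share an internal field and are free to use different mappings.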

Example mapping

A mapping for the example described above could be specified when creating the index, as follows:

PUT my_index 
{
  "mappings": {
    "user": { 
      "_all":       { "enabled": false  }, 
      "properties": { 
        "title":    { "type": "text"  }, 
        "name":     { "type": "text"  }, 
        "age":      { "type": "integer" }  
      }
    },
    "blogpost": { 
      "_all":       { "enabled": false  }, 
      "properties": { 
        "title":    { "type": "text"  }, 
        "body":     { "type": "text"  }, 
        "user_id":  {
          "type":   "keyword" 
        },
        "created":  {
          "type":   "date", 
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}

The request above does the following:

  • Creates an index called my_index.
  • Adds mapping types called user and blogpost.
  • Disables the _all meta field for both mapping types.
  • Specifies fields or properties in each mapping type.
  • Specifies the data type and mapping for each field.
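The created field above accepts two formats, separated by ||. As a small illustration, the sketch below builds two hypothetical blogpost documents that express the same instant, once as an ISO 8601 date-time string (for strict_date_optional_time) and once as milliseconds since the epoch (for epoch_millis). The document contents are invented for this example.

```python
import json
from datetime import datetime, timezone

# One instant, expressed in both formats accepted by the "created" field.
moment = datetime(2015, 1, 1, 12, 10, 30, tzinfo=timezone.utc)

iso_value = moment.strftime("%Y-%m-%dT%H:%M:%SZ")   # strict_date_optional_time
millis_value = int(moment.timestamp() * 1000)       # epoch_millis

# Hypothetical blogpost documents; either "created" value fits the mapping.
doc_iso = {"title": "First post", "user_id": "u1", "created": iso_value}
doc_millis = {"title": "First post", "user_id": "u1", "created": millis_value}

print(json.dumps(doc_iso))
print(json.dumps(doc_millis))
```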