Store

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

The store module allows you to control how index data is stored.

The index can either be stored in-memory (no persistence) or on-disk (the default). In-memory indices provide better performance at the cost of limiting the index size to the amount of available physical memory.

When using a local gateway (the default), file system storage without any in-memory storage is required to maintain index consistency, because the local gateway reconstructs its state from the local index state of each node.

Note that when using the memory storage type (see below), Elasticsearch stores the index in memory outside of the JVM heap space, so there is no need for an extra-large JVM heap (with all its consequences) just to hold the index in memory.

Store Level Throttling

Lucene, the IR library that Elasticsearch uses under the covers, works by creating immutable segments (apart from deletes) and constantly merging them (the merge policy settings control how those merges happen). Merging happens asynchronously, so it does not block indexing or search. The problem, especially on systems with limited I/O, is that the merge process can be expensive and slow down search and indexing simply because the machine is now handling more I/O.

The store module allows throttling to be configured for merges (or for all writes), either at the node level or at the index level. Node level throttling usually makes more sense, because all shards on a node compete for the same disk I/O. Node level throttling is controlled by setting indices.store.throttle.type to merge (the default), all, or none, and by setting indices.store.throttle.max_bytes_per_sec to a value such as 5mb. The default is type merge with a limit of 20mb. The node level settings can be changed dynamically using the cluster update settings API.
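
As a sketch (the 5mb limit is just an example value), the node level throttle could be adjusted dynamically through the cluster update settings API; a transient setting like this is reset on a full cluster restart:

curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient": {
        "indices.store.throttle.type": "merge",
        "indices.store.throttle.max_bytes_per_sec": "5mb"
    }
}'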

If a specific index needs its own configuration, regardless of the node level settings, it can be configured using index.store.throttle.type and index.store.throttle.max_bytes_per_sec. The default type is node, meaning the index is subject to the node level settings and participates in the node-wide throttling. Both settings can be changed dynamically using the index update settings API.
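
As a sketch (my_index and the 10mb limit are placeholders), the per-index throttle could be set dynamically through the index update settings API:

curl -XPUT localhost:9200/my_index/_settings -d '{
    "index.store.throttle.type": "merge",
    "index.store.throttle.max_bytes_per_sec": "10mb"
}'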

File system storage types

File system based storage is the default. Several implementations, or storage types, are available, and the best one for the operating environment is chosen automatically: mmapfs on 64-bit Windows, simplefs on 32-bit Windows, and default (a hybrid of niofs and mmapfs) everywhere else.

This can be overridden for all indices by adding the following to the config/elasticsearch.yml file:

index.store.type: niofs

It can also be set on a per-index basis at index creation time:

curl -XPUT localhost:9200/my_index -d '{
    "settings": {
        "index.store.type": "niofs"
    }
}'

The following sections list all the supported storage types.

Simple FS

The simplefs type is a straightforward implementation of file system storage (it maps to Lucene SimpleFSDirectory) using a random access file. This implementation has poor concurrent performance (multiple threads will bottleneck on it). It is usually better to use niofs when you need index persistence.

NIO FS

The niofs type stores the shard index on the file system (it maps to Lucene NIOFSDirectory) using NIO. It allows multiple threads to read from the same file concurrently. It is not recommended on Windows because of a bug in the Sun Java implementation.

MMap FS

The mmapfs type stores the shard index on the file system (it maps to Lucene MMapDirectory) by mapping files into memory (mmap). Memory mapping uses up a portion of the virtual memory address space in your process equal to the size of the file being mapped, so before using this storage type make sure you have plenty of virtual address space. See Virtual memory.
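
On Linux, the number of memory-mapped areas a process may use is also limited by the vm.max_map_count kernel setting. As a sketch, it can be raised like this (262144 is the value commonly suggested in the Elasticsearch documentation; add the same line to /etc/sysctl.conf to make it permanent):

sysctl -w vm.max_map_count=262144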

Hybrid MMap / NIO FS

The default type stores the shard index on the file system, choosing per file type between memory mapping (mmap) and Java NIO. Currently only the Lucene term dictionary and doc values files are memory mapped, to reduce the impact on the operating system; all other files are opened using Lucene NIOFSDirectory. The address space considerations (see Virtual memory) may still apply if your term dictionaries are large.

Memory

The memory type stores the index in main memory, using Lucene’s RamIndexStore.
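
As with the file system types, the memory type can be selected per index at creation time. A minimal sketch (my_index is a placeholder):

curl -XPUT localhost:9200/my_index -d '{
    "settings": {
        "index.store.type": "memory"
    }
}'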