Shared file system repository
This repository type is only available if you run Elasticsearch on your own hardware. If you use Elasticsearch Service, see Elasticsearch Service repository types.
Use a shared file system repository to store snapshots on a shared file system.
To register a shared file system repository, first mount the file system to the same location on all master and data nodes. Then add the file system’s path or parent directory to the path.repo setting in elasticsearch.yml for each master and data node. For running clusters, this requires a rolling restart of each node.
Supported path.repo values vary by platform:
Linux and macOS installations support Unix-style paths:
path:
  repo:
    - /mount/backups
    - /mount/long_term_backups
After restarting each node, use Kibana or the create snapshot repository API to register the repository. When registering the repository, specify the file system’s path:
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_fs_backup_location"
  }
}
If you specify a relative path, Elasticsearch resolves the path using the first value in the path.repo setting.
response = client.snapshot.create_repository(
  repository: 'my_fs_backup',
  body: {
    type: 'fs',
    settings: {
      location: 'my_fs_backup_location'
    }
  }
)
puts response
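The first value in the path.repo setting shown above is /mount/backups, so the relative path my_fs_backup_location resolves to /mount/backups/my_fs_backup_location.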
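The resolution rule for relative paths can be illustrated with a small Ruby sketch. The helper name `resolve_location` is hypothetical and is not part of Elasticsearch or its client; it only mimics the documented behavior (a relative `location` is joined onto the first `path.repo` entry) using the `path.repo` values from the Linux example.

```ruby
require 'pathname'

# Hypothetical helper: mimics the documented rule that a relative
# `location` is resolved against the first `path.repo` entry.
def resolve_location(location, path_repo)
  base = Pathname.new(path_repo.first)
  # Pathname#join returns `location` itself when it is already absolute.
  base.join(location).cleanpath.to_s
end

# Values taken from the path.repo example for Linux and macOS.
path_repo = ['/mount/backups', '/mount/long_term_backups']

puts resolve_location('my_fs_backup_location', path_repo)
puts resolve_location('/mount/long_term_backups/archive', path_repo)
```

This sketch only computes the path. It does not check that the result lies inside one of the `path.repo` entries.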
Clusters should only register a particular snapshot repository bucket once. If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. On other clusters, register the repository as read-only.
This prevents multiple clusters from writing to the repository at the same time and corrupting the repository’s contents. It also prevents Elasticsearch from caching the repository’s contents, which means that changes made by other clusters will become visible straight away.
To register a file system repository as read-only using the create snapshot repository API, set the readonly parameter to true. Alternatively, you can register a URL repository for the file system.
response = client.snapshot.create_repository(
  repository: 'my_fs_backup',
  body: {
    type: 'fs',
    settings: {
      location: 'my_fs_backup_location',
      readonly: true
    }
  }
)
puts response
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "my_fs_backup_location",
    "readonly": true
  }
}
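When several clusters share one repository, it can help to generate the registration bodies in one place so that only a single cluster ends up with write access. The following is a minimal sketch under that assumption. The helper `registration_bodies` and the cluster names are hypothetical. The code only builds the request bodies and does not contact any cluster.

```ruby
require 'json'

# Hypothetical helper: builds one registration body per cluster so that
# only `writer` has write access; every other cluster gets readonly: true.
def registration_bodies(clusters, writer:, location:)
  raise ArgumentError, "unknown writer: #{writer}" unless clusters.include?(writer)

  clusters.to_h do |name|
    settings = { location: location }
    settings[:readonly] = true unless name == writer
    [name, { type: 'fs', settings: settings }]
  end
end

bodies = registration_bodies(
  %w[cluster_a cluster_b cluster_c], # hypothetical cluster names
  writer: 'cluster_a',
  location: 'my_fs_backup_location'
)

bodies.each { |name, body| puts "#{name}: #{JSON.generate(body)}" }
```

Each body could then be passed as the `body:` argument of `client.snapshot.create_repository`, using a client connected to the corresponding cluster.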
Windows installations support both DOS and Microsoft UNC paths. Escape any backslashes in the paths. For UNC paths, provide the server and share name as a prefix.
After restarting each node, use Kibana or the create snapshot repository API to register the repository. When registering the repository, specify the file system’s path:
response = client.snapshot.create_repository(
  repository: 'my_fs_backup',
  body: {
    type: 'fs',
    settings: {
      location: 'E:\\Mount\\Backups\\My_fs_backup_location'
    }
  }
)
puts response
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "E:\\Mount\\Backups\\My_fs_backup_location"
  }
}
If you specify a relative path, Elasticsearch resolves the path using the first value in the path.repo setting.
response = client.snapshot.create_repository(
  repository: 'my_fs_backup',
  body: {
    type: 'fs',
    settings: {
      location: 'My_fs_backup_location'
    }
  }
)
puts response
Repository settings
chunk_size
(Optional, byte value) Maximum size of files in snapshots. In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to null (unlimited file size).

compress
(Optional, Boolean) If true, metadata files, such as index mappings and settings, are compressed in snapshots. Data files are not compressed. Defaults to true.

location
(Required, string) Location of the shared filesystem used to store and retrieve snapshots. This location must be registered in the path.repo setting on all master and data nodes in the cluster.

max_number_of_snapshots
(Optional, integer) Maximum number of snapshots the repository can contain. Defaults to Integer.MAX_VALUE, which is 2^31-1 or 2147483647.

max_restore_bytes_per_sec
(Optional, byte value) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through recovery settings.

max_snapshot_bytes_per_sec
(Optional, byte value) Maximum snapshot creation rate per node. Defaults to 40mb per second. Note that if the recovery settings for managed services are set, then it defaults to unlimited, and the rate is additionally throttled through recovery settings.

readonly
(Optional, Boolean) If true, the repository is read-only. The cluster can retrieve and restore snapshots from the repository but not write to the repository or create snapshots in it.
Only a cluster with write access can create snapshots in the repository. All other clusters connected to the repository should have the readonly parameter set to true.
If false, the cluster can write to the repository and create snapshots in it. Defaults to false.
If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository.
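As an illustration of how several of these settings combine in one registration request, the sketch below builds a request body in Ruby and prints it as JSON. The setting names come from the list above. The values (`1gb`, `100mb`, and so on) are illustrative assumptions, not recommendations.

```ruby
require 'json'

# Setting names come from the list above; the values are illustrative only.
body = {
  type: 'fs',
  settings: {
    location: '/mount/backups/my_fs_backup_location',
    compress: true,
    chunk_size: '1gb',
    max_snapshot_bytes_per_sec: '40mb',
    max_restore_bytes_per_sec: '100mb'
  }
}

puts JSON.pretty_generate(body)
```

The resulting hash has the same shape as the `body:` argument in the Ruby examples on this page, so it could be passed to `client.snapshot.create_repository(repository: 'my_fs_backup', body: body)`.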
Troubleshooting a shared file system repository
Elasticsearch interacts with a shared file system repository using the file system abstraction in your operating system. This means that every Elasticsearch node must be able to perform operations within the repository path such as creating, opening, and renaming files, and creating and listing directories. Operations performed by one node must also be visible to other nodes as soon as they complete.
Check for common misconfigurations using the Verify snapshot repository API and the Repository analysis API. When the repository is properly configured, these APIs will complete successfully. If the verify repository or repository analysis APIs report a problem then you will be able to reproduce this problem outside Elasticsearch by performing similar operations on the file system directly.
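One way to reproduce such operations outside Elasticsearch is a small script run on a node against the mounted repository path. The following Ruby sketch is an assumption about what a useful probe might look like, not the checks Elasticsearch itself performs. The helper name `probe_repository_path` is hypothetical. It creates a directory, writes, renames, reads, and lists a file, then cleans up.

```ruby
require 'fileutils'
require 'securerandom'
require 'tmpdir'

# Hypothetical probe: creates a directory, then creates, renames, opens,
# and lists a file inside `repo_path`. Any failure surfaces as the
# operating system error (for example Errno::EACCES).
def probe_repository_path(repo_path)
  dir = File.join(repo_path, "probe-#{SecureRandom.hex(4)}")
  tmp = File.join(dir, 'data.tmp')
  final = File.join(dir, 'data.bin')
  steps = []

  Dir.mkdir(dir)
  steps << :create_directory
  File.write(tmp, 'probe')
  steps << :create_file
  File.rename(tmp, final)
  steps << :rename_file
  raise 'unexpected contents' unless File.read(final) == 'probe'
  steps << :open_file
  raise 'file missing from listing' unless Dir.children(dir) == ['data.bin']
  steps << :list_directory
  steps
ensure
  # Remove the probe directory whether or not the checks passed.
  FileUtils.rm_rf(dir) if dir
end

# Demonstration against a scratch directory; on a real node, pass the
# mounted repository path instead.
Dir.mktmpdir do |scratch|
  puts probe_repository_path(scratch).inspect
end
```

This probe runs on a single node, so it says nothing about whether one node's writes are visible to other nodes. The repository analysis API remains the way to check that.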
If the verify repository or repository analysis APIs fail with an error indicating insufficient permissions then adjust the configuration of the repository within your operating system to give Elasticsearch an appropriate level of access. To reproduce such problems directly, perform the same operations as Elasticsearch in the same security context as the one in which Elasticsearch is running. For example, on Linux, use a command such as su to switch to the user that Elasticsearch runs as.
If the verify repository or repository analysis APIs fail with an error indicating that operations on one node are not immediately visible on another node then adjust the configuration of the repository within your operating system to address this problem. If your repository cannot be configured with strong enough visibility guarantees then it is not suitable for use as an Elasticsearch snapshot repository.
The verify repository and repository analysis APIs will also fail if the operating system returns any other kind of I/O error when accessing the repository. If this happens, address the cause of the I/O error reported by the operating system.
Many NFS implementations match accounts across nodes using their numeric user IDs (UIDs) and group IDs (GIDs) rather than their names. It is possible for Elasticsearch to run under an account with the same name (often elasticsearch) on each node, but for these accounts to have different numeric user or group IDs. If your shared file system uses NFS then ensure that every node is running with the same numeric UID and GID, or else update your NFS configuration to account for the variance in numeric IDs across nodes.
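To compare numeric IDs across nodes, you could print them on each node while running as the Elasticsearch user and then compare the results. The Ruby sketch below shows one possible approach. Both helper names are hypothetical, and the sample node data is made up for illustration.

```ruby
require 'etc'

# Numeric and named identity of the current process. Run this as the same
# user that Elasticsearch runs as.
def current_identity
  {
    uid: Process.uid,
    gid: Process.gid,
    user: Etc.getpwuid(Process.uid)&.name
  }
end

# Hypothetical helper: given { node_name => [uid, gid] } collected from each
# node, returns the nodes whose IDs differ from the most common pair.
def mismatched_nodes(ids_by_node)
  return [] if ids_by_node.empty?

  expected = ids_by_node.values.tally.max_by { |_, count| count }.first
  ids_by_node.reject { |_, ids| ids == expected }.keys
end

puts current_identity.inspect

# Illustrative values only, not taken from a real cluster.
sample = {
  'node-1' => [1000, 1000],
  'node-2' => [1000, 1000],
  'node-3' => [997, 995]
}
puts mismatched_nodes(sample).inspect
```

In this made-up sample, `node-3` is reported because its UID and GID differ from the other two nodes, even though the account name could be identical everywhere.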
Linearizable register implementation
The linearizable register implementation for shared filesystem repositories is based around file locking. To perform a compare-and-exchange operation on a register, Elasticsearch first locks the underlying file and then writes the updated contents under the same lock. This ensures that the file has not changed in the meantime.
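The general technique of a lock-protected compare-and-exchange can be sketched in Ruby as follows. This is not Elasticsearch's implementation, which is written in Java. The function name `compare_and_exchange` and the use of `File#flock` are assumptions made for illustration, and whether such locks are honored across nodes depends on the shared file system.

```ruby
require 'tmpdir'

# Illustrative compare-and-exchange on a file-backed register.
# Returns the value found in the register; the exchange happened only if
# that value equals `expected`.
def compare_and_exchange(path, expected, updated)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    file.flock(File::LOCK_EX) # exclusive lock, released when the file closes
    current = file.read
    if current == expected
      # Overwrite the contents while still holding the lock.
      file.rewind
      file.truncate(0)
      file.write(updated)
      file.flush
    end
    current
  end
end

Dir.mktmpdir do |dir|
  register = File.join(dir, 'register')
  compare_and_exchange(register, '', 'v1')                # applied: register was empty
  witness = compare_and_exchange(register, 'stale', 'v2') # not applied: value is 'v1'
  puts "witness=#{witness} contents=#{File.read(register)}"
end
```

Because the read and the write happen under the same exclusive lock, no other cooperating process can change the file between the comparison and the update.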