Register a snapshot repository
This guide shows you how to register a snapshot repository. A snapshot repository is an off-cluster storage location for your snapshots. You must register a repository before you can take or restore snapshots.
In this guide, you’ll learn how to:
- Register a snapshot repository
- Verify that a repository is functional
- Clean up a repository to remove unneeded files
Prerequisites
- To use Kibana’s Snapshot and Restore feature, you must have the following permissions:
  - Cluster privileges: monitor, manage_slm, cluster:admin/snapshot, and cluster:admin/repository
  - Index privilege: all on the monitor index
- To register a snapshot repository, the cluster’s global metadata must be writeable. Ensure there aren’t any cluster blocks that prevent write access.
Considerations
When registering a snapshot repository, keep the following in mind:
- Each snapshot repository is separate and independent. Elasticsearch doesn’t share data between repositories.
- If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. On other clusters, register the repository as read-only.
  This prevents multiple clusters from writing to the repository at the same time and corrupting the repository’s contents. It also prevents Elasticsearch from caching the repository’s contents, which means that changes made by other clusters will become visible straight away.
- Use a different snapshot repository for each major version of Elasticsearch. Mixing snapshots from different major versions can corrupt a repository’s contents.
Manage snapshot repositories
You can register and manage snapshot repositories in two ways:
- Kibana’s Snapshot and Restore feature
- Elasticsearch’s snapshot repository management APIs
To manage repositories in Kibana, go to the main menu and click Stack Management > Snapshot and Restore > Repositories. To register a snapshot repository, click Register repository.
Snapshot repository types
Supported snapshot repository types vary based on your deployment type.
Elasticsearch Service repository types
Elasticsearch Service deployments automatically register the found-snapshots repository. Elasticsearch Service uses this repository and the cloud-snapshot-policy to take periodic snapshots of your cluster. You can also use the found-snapshots repository for your own SLM policies or to store searchable snapshots.
The found-snapshots repository is specific to each deployment. However, you can restore snapshots from another deployment’s found-snapshots repository if the deployments are under the same account and in the same region. See the Cloud Snapshot and restore documentation to learn more.
Elasticsearch Service deployments also support the following repository types:
Self-managed repository types
If you run Elasticsearch on your own hardware, you can use the following built-in snapshot repository types:
Other repository types are available through official plugins:
You can also use alternative implementations of these repository types, such as MinIO, as long as they’re compatible. To verify a repository’s compatibility, see Verify a repository.
Shared file system repository
This repository type is only available if you run Elasticsearch on your own hardware. If you use Elasticsearch Service, see Elasticsearch Service repository types.
Use a shared file system repository to store snapshots on a shared file system.
To register a shared file system repository, first mount the file system to the same location on all master and data nodes. Then add the file system’s path or parent directory to the path.repo setting in elasticsearch.yml for each master and data node. For running clusters, this requires a rolling restart of each node.
By default, a network file system (NFS) uses user IDs (UIDs) and group IDs (GIDs) to match accounts across nodes. If your shared file system is an NFS and your nodes don’t use the same UIDs and GIDs, update your NFS configuration to account for this.
Supported path.repo values vary by platform:
Linux and macOS installations support Unix-style paths:
path:
  repo:
    - /mount/backups
    - /mount/long_term_backups
After restarting each node, use Kibana or the create snapshot repository API to register the repository. When registering the repository, specify the file system’s path:
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_fs_backup_location"
  }
}
If you specify a relative path, Elasticsearch resolves the path using the first value in the path.repo setting.
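The resolution rule for relative paths can be illustrated with a small Python sketch. This only models the rule as described here and is not Elasticsearch's own implementation; the function name resolve_location is hypothetical, and the path.repo values are the ones from the Linux and macOS example.

```python
from pathlib import PurePosixPath


def resolve_location(location, path_repo):
    """Model how a repository location is resolved against path.repo.

    Absolute locations are kept as-is; relative locations are joined
    onto the first entry of path.repo.
    """
    candidate = PurePosixPath(location)
    if candidate.is_absolute():
        return str(candidate)
    return str(PurePosixPath(path_repo[0]) / candidate)


path_repo = ["/mount/backups", "/mount/long_term_backups"]
print(resolve_location("my_fs_backup_location", path_repo))
```

With these values, the relative location my_fs_backup_location resolves under /mount/backups, not under /mount/long_term_backups.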
If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. On other clusters, register the repository as read-only.
This prevents multiple clusters from writing to the repository at the same time and corrupting the repository’s contents. It also prevents Elasticsearch from caching the repository’s contents, which means that changes made by other clusters will become visible straight away.
To register a file system repository as read-only using the create snapshot repository API, set the readonly parameter to true. Alternatively, you can register a URL repository for the file system.
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "my_fs_backup_location",
    "readonly": true
  }
}
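For scripted setups, the same registration requests can be built programmatically. The following is a minimal Python sketch that uses only the standard library. The helper name build_fs_repository_request and the base URL http://localhost:9200 are assumptions for illustration; a real cluster will usually also need authentication and TLS settings, which are not shown.

```python
import json
import urllib.request


def build_fs_repository_request(base_url, name, location, readonly=False):
    """Build (but do not send) a create snapshot repository request
    for a shared file system repository."""
    settings = {"location": location}
    if readonly:
        # Only one cluster should have write access; others register read-only.
        settings["readonly"] = True
    body = json.dumps({"type": "fs", "settings": settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local, unsecured cluster address; adjust for your environment.
writer = build_fs_repository_request(
    "http://localhost:9200", "my_fs_backup", "/mount/backups/my_fs_backup_location"
)
reader = build_fs_repository_request(
    "http://localhost:9200", "my_fs_backup", "my_fs_backup_location", readonly=True
)
print(writer.get_method(), writer.full_url)
print(reader.data.decode("utf-8"))
```

To send a request against a reachable cluster, pass it to urllib.request.urlopen. Use the writable variant on exactly one cluster and the read-only variant on every other cluster that shares the repository.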
Windows installations support both DOS and Microsoft UNC paths. Escape any backslashes in the paths. For UNC paths, provide the server and share name as a prefix.
After restarting each node, use Kibana or the create snapshot repository API to register the repository. When registering the repository, specify the file system’s path:
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "E:\\Mount\\Backups\\My_fs_backup_location"
  }
}
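If you generate request bodies from a script, a JSON serializer handles the backslash escaping for you. The sketch below uses Python's standard json module. The function name fs_repository_body is hypothetical, and the UNC server and share names are made up for illustration.

```python
import json

# Windows paths as they exist on disk, written as raw strings so Python
# itself does not interpret the backslashes.
dos_location = r"E:\Mount\Backups\My_fs_backup_location"
# Hypothetical UNC path: the server and share name come first.
unc_location = r"\\MY_SERVER\Mount\Long_term_backups"


def fs_repository_body(location):
    """Serialize a shared file system repository body.

    json.dumps doubles every backslash, which is the escaping the
    request body requires.
    """
    return json.dumps({"type": "fs", "settings": {"location": location}}, indent=2)


dos_body = fs_repository_body(dos_location)
unc_body = fs_repository_body(unc_location)
print(dos_body)
print(unc_body)
```

Each single backslash in the on-disk path appears as a doubled backslash in the serialized body, so the leading double backslash of a UNC path becomes four backslashes.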
If you specify a relative path, Elasticsearch resolves the path using the first value in the path.repo setting.
To register a file system repository as read-only using the create snapshot repository API, set the readonly parameter to true. Alternatively, you can register a URL repository for the file system.
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "my_fs_backup_location",
    "readonly": true
  }
}
Read-only URL repository
This repository type is only available if you run Elasticsearch on your own hardware. If you use Elasticsearch Service, see Elasticsearch Service repository types.
You can use a URL repository to give a cluster read-only access to a shared file system. Since URL repositories are always read-only, they’re a safer and more convenient alternative to registering a read-only shared filesystem repository.
Use Kibana or the create snapshot repository API to register a URL repository.
PUT _snapshot/my_read_only_url_repository
{
  "type": "url",
  "settings": {
    "url": "file:/mount/backups/my_fs_backup_location"
  }
}
Source-only repository
You can use a source-only repository to take minimal, source-only snapshots that use up to 50% less disk space than regular snapshots.
Unlike other repository types, a source-only repository doesn’t directly store snapshots. It delegates storage to another registered snapshot repository.
When you take a snapshot using a source-only repository, Elasticsearch creates a source-only snapshot in the delegated storage repository. This snapshot only contains stored fields and metadata. It doesn’t include index or doc values structures and isn’t immediately searchable when restored. To search the restored data, you first have to reindex it into a new data stream or index.
Source-only snapshots are only supported if the _source field is enabled and no source-filtering is applied.
When you restore a source-only snapshot:
- The restored index is read-only and can only serve match_all search or scroll requests to enable reindexing.
- Queries other than match_all and _get requests are not supported.
- The mapping of the restored index is empty, but the original mapping is available from the types top level meta element.
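The reindexing step needed to make restored source-only data searchable can be sketched as a request to the reindex API. This is a minimal Python sketch: the helper name build_reindex_request and both index names are hypothetical. Reindexing does not copy mappings, so the destination index would normally be created with the mappings you want beforehand.

```python
import json


def build_reindex_request(source_index, dest_index):
    """Return (method, path, body) for reindexing a restored
    source-only index into a new, fully indexed destination."""
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return "POST", "/_reindex", json.dumps(body, indent=2)


# Hypothetical index names for illustration only.
method, path, body = build_reindex_request("restored-source-only-index", "searchable-index")
print(method, path)
print(body)
```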
Before registering a source-only repository, use Kibana or the create snapshot repository API to register a snapshot repository of another type to use for storage. Then register the source-only repository and specify the delegated storage repository in the request.
PUT _snapshot/my_src_only_repository
{
  "type": "source",
  "settings": {
    "delegate_type": "fs",
    "location": "my_backup_location"
  }
}
Verify a repository
When you register a snapshot repository, Elasticsearch automatically verifies that the repository is available and functional on all master and data nodes.
To disable this verification, set the create snapshot repository API's verify query parameter to false. You can’t disable repository verification in Kibana.
PUT _snapshot/my_unverified_backup?verify=false
{
  "type": "fs",
  "settings": {
    "location": "my_unverified_backup_location"
  }
}
If needed, you can run the repository verification check manually. To verify a repository in Kibana, go to the Repositories list page and click the name of a repository. Then click Verify repository. You can also use the verify snapshot repository API.
POST _snapshot/my_unverified_backup/_verify
If successful, the request returns a list of nodes used to verify the repository. If verification fails, the request returns an error.
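A script that calls the verify API can extract the node list from the response. The Python sketch below assumes the response is a JSON object with a top-level nodes map from node ID to an object containing a name field. The sample payload, the node IDs and names, and the function name verified_node_names are all made up for illustration.

```python
import json


def verified_node_names(response_text):
    """Return the sorted names of the nodes that verified the repository.

    Assumes a body shaped like {"nodes": {"<node id>": {"name": "<node name>"}}}.
    Falls back to the node ID if a node entry has no name.
    """
    payload = json.loads(response_text)
    nodes = payload.get("nodes", {})
    return sorted(node.get("name", node_id) for node_id, node in nodes.items())


# Hypothetical response body for illustration only.
sample_response = '{"nodes": {"node-id-1": {"name": "node-a"}, "node-id-2": {"name": "node-b"}}}'
names = verified_node_names(sample_response)
print(names)
```

Comparing this list against the master and data nodes you expect is a quick way to spot a node that cannot reach the repository.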
You can test a repository more thoroughly using the repository analysis API.
Clean up a repository
Over time, repositories can accumulate data that is not referenced by any existing snapshot. This is a result of the data safety guarantees that the snapshot functionality provides in failure scenarios during snapshot creation, and of the decentralized nature of the snapshot creation process. This unreferenced data does not negatively impact the performance or safety of a snapshot repository, but it leads to higher than necessary storage use. To remove this unreferenced data, you can run a cleanup operation on the repository. This triggers a complete accounting of the repository’s contents and deletes any unreferenced data.
To run the repository cleanup operation in Kibana, go to the Repositories list page and click the name of a repository. Then click Clean up repository.
You can also use the clean up snapshot repository API.
POST _snapshot/my_repository/_cleanup
The API returns:
{
  "results": {
    "deleted_bytes": 20,
    "deleted_blobs": 5
  }
}
Depending on the repository implementation, the number of bytes freed and the number of blobs removed are either approximate or exact. Any non-zero value for the number of blobs removed implies that unreferenced blobs were found and subsequently cleaned up.
Most of the cleanup operations executed by this endpoint run automatically when you delete any snapshot from a repository. If you regularly delete snapshots, you will in most cases see little or no space savings from this functionality, and you should invoke it less frequently.
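If you run cleanups from a script, the response fields shown above can be turned into a short report. This is a minimal Python sketch. The function name summarize_cleanup is hypothetical, and the byte count should be read as approximate because its precision depends on the repository implementation.

```python
import json


def summarize_cleanup(response_text):
    """Summarize a repository cleanup response in one line."""
    results = json.loads(response_text)["results"]
    deleted_blobs = results["deleted_blobs"]
    deleted_bytes = results["deleted_bytes"]
    if deleted_blobs == 0:
        return "no unreferenced blobs found"
    return f"removed {deleted_blobs} unreferenced blobs (about {deleted_bytes} bytes)"


# Response body taken from the example above.
cleanup_response = '{"results": {"deleted_bytes": 20, "deleted_blobs": 5}}'
summary = summarize_cleanup(cleanup_response)
print(summary)
```

A run of "no unreferenced blobs found" results suggests that routine snapshot deletion is already doing the work and that the cleanup can be scheduled less often.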
Back up a repository
You may wish to make an independent backup of your repository, for instance so that you have an archive copy of its contents that you can use to recreate the repository in its current state at a later date.
You must ensure that Elasticsearch does not write to the repository while you are taking the backup of its contents. You can do this by unregistering it, or registering it with readonly: true, on all your clusters. If Elasticsearch writes any data to the repository during the backup then the contents of the backup may not be consistent and it may not be possible to recover any data from it in future.
Alternatively, if your repository supports it, you may take an atomic snapshot of the underlying filesystem and then take a backup of this filesystem snapshot. It is very important that the filesystem snapshot is taken atomically.
You cannot use filesystem snapshots of individual nodes as a backup mechanism. You must use the Elasticsearch snapshot and restore feature to copy the cluster contents to a separate repository. Then, if desired, you can take a filesystem snapshot of this repository.
When restoring a repository from a backup, you must not register the repository with Elasticsearch until the repository contents are fully restored. If you alter the contents of a repository while it is registered with Elasticsearch then the repository may become unreadable or may silently lose some of its contents.