Create automated snapshots

To set up automated snapshots for Elasticsearch on Kubernetes you have to:

Register the snapshot repository with the Elasticsearch API.
Set up a Snapshot Lifecycle Management (SLM) policy through the API or the Kibana UI.

Support for S3, GCS and Azure repositories is bundled in Elasticsearch by default from version 8.0. On older versions of Elasticsearch, or if another snapshot repository plugin should be used, you have to Install a snapshot repository plugin.

For more information on Elasticsearch snapshots, check Snapshot and Restore in the Elasticsearch documentation.

What follows is a non-exhaustive list of configuration examples. The first example might be worth reading even if you are targeting a Cloud provider other than GCP as it covers adding snapshot repository credentials to the Elasticsearch keystore and illustrates the basic workflow of setting up a snapshot repository:

Basic snapshot repository setup using GCS as an example

The next two examples cover approaches that use Cloud-provider specific means to leverage Kubernetes service accounts to avoid having to configure snapshot repository credentials in Elasticsearch:

Use GKE Workload Identity
Use AWS IAM roles for service accounts (IRSA)

The final example illustrates how to configure secure and trusted communication when you

Use S3-compatible services

Basic snapshot repository setup using GCS as an example

The Elasticsearch GCS repository plugin requires a JSON file that contains service account credentials. These need to be added as secure settings to the Elasticsearch keystore. For more details, check Google Cloud Storage Repository.

Using ECK, you can automatically inject secure settings into a cluster node by providing them through a secret in the Elasticsearch Spec.

Create a file containing the GCS credentials. For this example, name it gcs.client.default.credentials_file. The file name is important as it is reflected in the secure setting.

{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "service-account-for-your-repository@your-project-id.iam.gserviceaccount.com",
  "client_id": "...",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/your-bucket@your-project-id.iam.gserviceaccount.com"
}

Create a Kubernetes secret from that file:

kubectl create secret generic gcs-credentials --from-file=gcs.client.default.credentials_file

Edit the secureSettings section of the Elasticsearch resource:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 8.17.2
  # Inject secure settings into Elasticsearch nodes from a k8s secret reference
  secureSettings:
  - secretName: gcs-credentials

If you haven’t followed these instructions and named your GCS credentials file differently, you can still map it to the expected name now. Check Secure Settings for details.
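As a minimal sketch of such a mapping, the following script writes a manifest that uses the `entries` field of `secureSettings` to project a differently named secret key onto the expected setting name. The key name `my-gcs-key.json`, the output file name, and the single-node `nodeSets` entry are assumptions made for illustration; check Secure Settings for the authoritative field reference.

```shell
#!/bin/sh
set -eu

# Hypothetical names: adjust to the key actually stored in your secret.
SECRET_KEY="my-gcs-key.json"
MANIFEST="elasticsearch-gcs-mapped.yaml"

# Write an Elasticsearch manifest that projects SECRET_KEY onto the
# keystore setting name expected by the GCS repository client "default".
cat > "$MANIFEST" <<EOF
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 8.17.2
  secureSettings:
  - secretName: gcs-credentials
    entries:
    - key: ${SECRET_KEY}
      path: gcs.client.default.credentials_file
  # Assumed node set, included only so the manifest is complete.
  nodeSets:
  - name: default
    count: 1
EOF

echo "Wrote $MANIFEST; review it, then run: kubectl apply -f $MANIFEST"
```

The script only generates the file. Applying it is left as a separate, deliberate step.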

Apply the modifications:
```
kubectl apply -f elasticsearch.yml
```

GCS credentials are automatically propagated into each Elasticsearch node’s keystore. It can take up to a few minutes, depending on the number of secrets in the keystore. You don’t have to restart the nodes.

Register the repository in Elasticsearch

Create the GCS snapshot repository in Elasticsearch. You can either use the Snapshot and Restore UI in Kibana version 7.4.0 or higher, or follow the procedure described in Snapshot and Restore:
```
PUT /_snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "client": "default"
  }
}
```
Take a snapshot with the following HTTP request:
```
PUT /_snapshot/my_gcs_repository/test-snapshot
```
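To automate snapshots rather than take them by hand, you can define a Snapshot Lifecycle Management policy against the repository. The following is a sketch, not a recommended configuration: the policy ID `nightly-snapshots`, the schedule, and the retention values are hypothetical, and `ES_URL` and `ES_PASSWORD` are environment variables assumed for this example. The script prints the policy body and only contacts a cluster when `ES_URL` is set. With the self-signed certificates that ECK generates by default, you would likely also need to pass the cluster CA to curl with `--cacert`.

```shell
#!/bin/sh
set -eu

# Hypothetical values: policy ID, schedule and retention are examples only.
POLICY_ID="nightly-snapshots"
REPOSITORY="my_gcs_repository"

# Print the SLM policy body: a daily snapshot at 01:30 of all indices,
# expired after 30 days while keeping between 5 and 50 snapshots.
slm_policy_body() {
  cat <<EOF
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "${REPOSITORY}",
  "config": {
    "indices": ["*"],
    "include_global_state": true
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
EOF
}

slm_policy_body

# Only contact a cluster when ES_URL is set, for example
# ES_URL=https://localhost:9200 after a kubectl port-forward.
if [ -n "${ES_URL:-}" ]; then
  slm_policy_body | curl --fail --silent --show-error \
    --user "elastic:${ES_PASSWORD:?set ES_PASSWORD}" \
    --header 'Content-Type: application/json' \
    --request PUT "${ES_URL}/_slm/policy/${POLICY_ID}" \
    --data-binary @-
fi
```

The same body can be sent from the Kibana Dev Tools console as `PUT /_slm/policy/nightly-snapshots`.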

Use GKE Workload Identity

GKE Workload Identity allows a Kubernetes service account to impersonate a Google Cloud IAM service account and therefore to configure a snapshot repository in Elasticsearch without storing Google Cloud credentials in Elasticsearch itself. This feature requires your Kubernetes cluster to run on GKE and your Elasticsearch cluster to run at least version 7.13, or at least version 8.1 when using searchable snapshots.

Follow the instructions in the GKE documentation to configure workload identity, specifically:

Create or update your Kubernetes cluster with --workload-pool=PROJECT_ID.svc.id.goog enabled, where PROJECT_ID is your Google project ID
Create a namespace and a Kubernetes service account (test-gcs and gcs-sa in this example)
Create the bucket and the Google service account (gcp-sa in this example), and set the relevant permissions through the Google Cloud console or the gcloud CLI. Note that both Google and Kubernetes have the concept of a service account; this step refers to the Google one.

Allow the Kubernetes service account to impersonate the Google service account:

gcloud iam service-accounts add-iam-policy-binding gcp-sa@PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[test-gcs/gcs-sa]"

Add the iam.gke.io/gcp-service-account annotation on the Kubernetes service account

kubectl annotate serviceaccount gcs-sa \
    --namespace test-gcs \
    iam.gke.io/gcp-service-account=gcp-sa@PROJECT_ID.iam.gserviceaccount.com

Create an Elasticsearch cluster, referencing the Kubernetes service account

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-gcs-sample
  namespace: test-gcs
spec:
  version: 8.17.2
  nodeSets:
  - name: default
    podTemplate:
      spec:
        automountServiceAccountToken: true
        serviceAccountName: gcs-sa
    count: 3

Create the snapshot repository as described in Register the repository in Elasticsearch

Use AWS IAM roles for service accounts (IRSA)

The AWS IAM roles for service accounts feature allows you to give Elasticsearch restricted access to an S3 bucket without having to expose and store AWS credentials directly in Elasticsearch. This requires you to run the ECK operator on Amazon’s EKS offering and an Elasticsearch cluster running at least version 8.1.

Follow the AWS documentation to set this feature up. Specifically you need to:

Define an IAM policy file, called iam-policy.json in this example, giving access to an S3 bucket called my_bucket

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucketMultipartUploads",
                "s3:ListBucketVersions",
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::my_bucket"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:DeleteObject",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": "arn:aws:s3:::my_bucket/*"
        }
    ]
}

Create the policy using AWS CLI tooling, using the name eck-snapshots in this example

aws iam create-policy \
    --policy-name eck-snapshots \
    --policy-document file://iam-policy.json

Use eksctl to create an IAM role and create and annotate a Kubernetes service account with it. The service account is called aws-sa in the default namespace in this example.
```
eksctl create iamserviceaccount \
  --name aws-sa \
  --namespace default \
  --cluster YOUR_CLUSTER \
  --attach-policy-arn arn:aws:iam::YOUR_IAM_ARN:policy/eck-snapshots \
  --approve
```
Replace YOUR_CLUSTER with your actual EKS cluster name.

Replace the value of --attach-policy-arn with the actual ARN of the policy you just created.
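If you are unsure what the policy ARN looks like, the following sketch assembles it from an AWS account ID and the policy name. It assumes the standard `aws` partition and a policy created without a path; the account ID `123456789012` is a placeholder, and the helper name `policy_arn` is introduced here for illustration only.

```shell
#!/bin/sh
set -eu

# Build the ARN of a customer-managed IAM policy from an account ID and name.
policy_arn() {
  printf 'arn:aws:iam::%s:policy/%s\n' "$1" "$2"
}

# 123456789012 is a placeholder account ID used for illustration only.
policy_arn 123456789012 eck-snapshots

# With configured AWS credentials, the account ID can be looked up instead:
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
#   policy_arn "$ACCOUNT_ID" eck-snapshots
```

Alternatively, the ARN is part of the output of `aws iam create-policy` and can be selected with `--query Policy.Arn --output text`.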

Create an Elasticsearch cluster referencing the service account

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es
spec:
  version: 8.17.2
  nodeSets:
  - name: default
    count: 3
    podTemplate:
      spec:
        serviceAccountName: aws-sa
        containers:
        - name: elasticsearch
          env:
          - name: AWS_WEB_IDENTITY_TOKEN_FILE
            value: "/usr/share/elasticsearch/config/repository-s3/aws-web-identity-token-file"
          - name: AWS_ROLE_ARN
            value: "arn:aws:iam::YOUR_ROLE_ARN_HERE"
          volumeMounts:
          - name: aws-iam-token
            mountPath: /usr/share/elasticsearch/config/repository-s3
        volumes:
          - name: aws-iam-token
            projected:
              sources:
              - serviceAccountToken:
                  audience: sts.amazonaws.com
                  expirationSeconds: 86400
                  path: aws-web-identity-token-file

AWS_WEB_IDENTITY_TOKEN_FILE: Elasticsearch expects the service account token to be projected to exactly this path.
AWS_ROLE_ARN: Replace the value with the actual ARN of the IAM role created in the eksctl step above.
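Because the `AWS_ROLE_ARN` placeholder is easy to leave in place by accident, a quick format check before applying the manifest can help. The sketch below assumes the standard `aws` partition; the helper name `is_role_arn` and the example ARN are hypothetical. The commented command shows how the role ARN could be read from the `eks.amazonaws.com/role-arn` annotation that eksctl sets on the service account.

```shell
#!/bin/sh
set -eu

# Return success when the argument looks like an IAM role ARN in the
# standard "aws" partition: arn:aws:iam::<12-digit account>:role/<name>.
is_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws:iam::[0-9]{12}:role/.+$'
}

# Placeholder ARN for illustration only.
EXAMPLE_ARN="arn:aws:iam::123456789012:role/example-eck-snapshots-role"

if is_role_arn "$EXAMPLE_ARN"; then
  echo "looks like a role ARN: $EXAMPLE_ARN"
fi

# The unmodified placeholder from the manifest is rejected.
if ! is_role_arn "arn:aws:iam::YOUR_ROLE_ARN_HERE"; then
  echo "placeholder still needs to be replaced"
fi

# Against a real cluster, the role is recorded on the service account:
#   kubectl get serviceaccount aws-sa --namespace default \
#     -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
```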

Create the snapshot repository as described in Register the repository in Elasticsearch but of type s3
```
PUT /_snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my_bucket"
  }
}
```

Use S3-compatible services

The following example assumes that you have deployed and configured an S3-compatible object store like MinIO that can be reached from the Kubernetes cluster, and that you have created a bucket in that service, called es-repo in this example. The example also assumes an Elasticsearch cluster named es is deployed within the cluster. Note that the steps describing how to customize the JVM trust store are only necessary if your S3-compatible service is using TLS certificates that are not issued by a well-known certificate authority.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es
spec:
  version: 8.17.2
  nodeSets:
  - name: mixed
    count: 3

Extract the cacerts JVM trust store from one of the running Elasticsearch nodes.
```
kubectl cp es-es-mixed-0:/usr/share/elasticsearch/jdk/lib/security/cacerts cacerts
```
You can skip this step if you want to create a new trust store that does not contain any well-known CAs that Elasticsearch trusts by default. Be aware that this limits Elasticsearch’s ability to communicate with TLS secured endpoints to those for which you add CA certificates in the next steps.

Obtain the CA certificate used to sign the certificate of your S3-compatible service. This example assumes it is called tls.crt.

Add the certificate to the JVM trust store extracted in the first step:
```
keytool -importcert -keystore cacerts -storepass changeit -file tls.crt -alias my-custom-s3-svc
```
You need a Java Runtime Environment, which includes keytool, installed locally for this step. changeit is the default trust store password used by the JVM, but it can be changed with keytool as well.

Create a Kubernetes secret with the amended trust store

kubectl create secret generic custom-truststore --from-file=cacerts

Create a Kubernetes secret with the credentials for your object store bucket

kubectl create secret generic snapshot-settings \
   --from-literal=s3.client.default.access_key=$YOUR_ACCESS_KEY \
   --from-literal=s3.client.default.secret_key=$YOUR_SECRET_ACCESS_KEY

Update your Elasticsearch cluster to use the trust store and credentials from the Kubernetes secrets

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es
spec:
  version: 8.17.2
  secureSettings:
  - secretName: snapshot-settings
  nodeSets:
  - name: mixed
    count: 3
    podTemplate:
      spec:
        volumes:
        - name: custom-truststore
          secret:
            secretName: custom-truststore
        containers:
        - name: elasticsearch
          volumeMounts:
          - name: custom-truststore
            mountPath: /usr/share/elasticsearch/config/custom-truststore
          env:
          - name: ES_JAVA_OPTS
            value: "-Djavax.net.ssl.trustStore=/usr/share/elasticsearch/config/custom-truststore/cacerts -Djavax.net.ssl.trustStorePassword=changeit"

Create the snapshot repository

POST _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "es-repo",
    "path_style_access": true,
    "endpoint": "https://mys3service.default.svc.cluster.local/"
  }
}

path_style_access: Whether you need to enable this setting depends on your choice of S3-compatible storage service and how it is deployed. If the service is exposed through a standard Kubernetes service, you likely need it.
endpoint: Replace the value with the actual endpoint of your S3-compatible service.
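After registering the repository, you may want to confirm that the nodes can actually reach the bucket through the custom trust store. One way is the repository verification API, sketched below. `ES_URL` and `ES_PASSWORD` are environment variables assumed for this example, `https://localhost:9200` is only a display default, and the helper name `verify_url` is hypothetical. The script prints the request and only contacts a cluster when `ES_URL` is set.

```shell
#!/bin/sh
set -eu

# Build the URL of the repository verification API.
# $1: Elasticsearch base URL, $2: repository name.
verify_url() {
  printf '%s/_snapshot/%s/_verify\n' "${1%/}" "$2"
}

# Show the request that would be sent for the repository created above.
echo "POST $(verify_url "${ES_URL:-https://localhost:9200}" my_s3_repository)"

# Only contact a cluster when ES_URL is set explicitly.
if [ -n "${ES_URL:-}" ]; then
  curl --fail --silent --show-error \
    --user "elastic:${ES_PASSWORD:?set ES_PASSWORD}" \
    --request POST "$(verify_url "$ES_URL" my_s3_repository)"
fi
```

A successful response lists the nodes that were able to access the repository.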

Install a snapshot repository plugin

If you are running a version of Elasticsearch before 8.0 or you need a snapshot repository plugin that is not already pre-installed you have to install the plugin yourself. To install the snapshot repository plugin, you can either use a custom image or add your own init container which installs the plugin when the Pod is created.

To use your own custom image with all necessary plugins pre-installed, use an Elasticsearch resource like the following:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 8.17.2
  image: your/custom/image:tag
  nodeSets:
  - name: default
    count: 1

Alternatively, install the plugin when the Pod is created by using an init container:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 8.17.2
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        initContainers:
        - name: install-plugins
          command:
          - sh
          - -c
          - |
            bin/elasticsearch-plugin install --batch repository-gcs

Assuming you stored this in a file called elasticsearch.yaml you can in both cases create the Elasticsearch cluster with:

kubectl apply -f elasticsearch.yaml
