Effective Elasticsearch Plugin Management with Docker
If you're running Elasticsearch within Docker containers, there are some important operational considerations to bear in mind. This is especially true for stateful services and daemons, where data must persist beyond the lifetime of an ephemeral container.
Using Elasticsearch plugins within containers is an example of this, both in terms of installing them in a repeatable, trackable manner and managing plugin configuration and data.
In this post, we'll explore some of the options to achieve sane plugin management within the context of Docker and Elasticsearch.
Note: In this guide we will reference the Elasticsearch image found on the Docker Hub. The development and production of this Docker image are not affiliated with Elastic, Inc.
A Docker Persistence Primer
Much like an uncommitted change in a version control system, a filesystem change in a running container exists only as a difference against the underlying image. Explicit steps must be taken if data and changes need to persist beyond the lifetime of an impermanent container.
Although one could leverage more complex storage schemes to achieve container persistence, the basic Docker mechanism of bind mounts is the most illustrative. For example, passing an option like -v /data:/usr/share/elasticsearch/data to a docker run command keeps your Elasticsearch indices long-term in the host's (not the container's) /data directory.
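For instance, a node whose index data should survive container replacement might be started along these lines (the host path /data and the container name are arbitrary choices for illustration):

$ docker run -d --name es-data \
    -v /data:/usr/share/elasticsearch/data \
    elasticsearch:2

If this container is removed and a new one is started with the same mount, the indices under /data are picked up where the previous container left off.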
If the changes are more permanent, that is, they are expected to remain in place without changing over time, codifying them in a Dockerfile may make more sense. Both approaches are useful in different cases.
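As a minimal sketch of the Dockerfile route (the elasticsearch.yml here is a stand-in for whatever file you want baked into the image):

FROM elasticsearch:2
COPY elasticsearch.yml /usr/share/elasticsearch/config/elasticsearch.yml

COPY is a standard Dockerfile instruction; with this approach the configuration file becomes part of every image build rather than a runtime mount.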
Managing Basic Plugins
In the following examples, I refer to "basic" plugins as those that do not require licenses or any other special components. Straightforward plugins like cloud-aws just need to place some files on the system to work.
In a case like this, the simplest way to manage the plugin's presence is to extend the image with a Dockerfile. For example, consider the following Dockerfile:
FROM elasticsearch:2
RUN /usr/share/elasticsearch/bin/plugin install --batch cloud-aws
This Dockerfile starts with the Elasticsearch image provided by the maintainers at the Docker Hub and runs a simple plugin install command. The image can then be built and referenced by a tag:
$ docker build -t elasticsearch-aws .
When run in the same directory as the Dockerfile, this builds an image named elasticsearch-aws, which can now be referenced in future Docker commands to start new containers with behavior inherited FROM the original.
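For example, starting a container from the derived image works just as it does with the stock image:

$ docker run -d -p 9200:9200 elasticsearch-aws

Because the cloud-aws plugin is already present in the image, no installation step needs to run at container start.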
More Complex Plugins
Some plugins may require the presence of additional files (such as certificates when using Shield) for certain features. The aforementioned technique of building a custom image can handle the installation of these plugins, but managing configuration is a task better left to a different approach. This helps keep images generic for deployment re-use and maintains tighter control over secrets.
Note: Some commercial plugins require the presence of a license. In the following examples, we simply rely on the temporary trial license present by default. When deploying in production, license management is performed through the Elasticsearch REST API, which stores the license in Elasticsearch's data path. As long as your data is persisted appropriately through a volume mount or otherwise, your license will be saved within your cluster.
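As a rough sketch of what that looks like in practice (license.json here is a placeholder for a license file obtained from Elastic, and the credentials are the example user defined later in this post), the 2.x license API accepts a PUT:

$ curl -XPUT -u example:password 'http://localhost:9200/_license?acknowledge=true' \
    -d @license.json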
Example: Shield
As outlined in the Shield installation documentation, installing the license and shield plugins is a prerequisite, which we can achieve by using the previous strategy to build a derived image:
FROM elasticsearch:2
RUN /usr/share/elasticsearch/bin/plugin install --batch license
RUN /usr/share/elasticsearch/bin/plugin install --batch shield
Then build the image to use for future steps:
$ docker build -t elasticsearch-shield .
At this point, if we volume mount a config directory into the container, Shield will pick up our settings. As an example configuration, consider the following directory structure:
$ tree config
config
├── elasticsearch.yml
├── logging.yml
├── scripts
└── shield
    ├── roles.yml
    ├── users
    └── users_roles

2 directories, 4 files
The logging.yml file contains a default logging configuration. In elasticsearch.yml, binding to the wildcard address 0.0.0.0 ensures we can reach the container:
$ cat elasticsearch.yml
network.host: 0.0.0.0
For the Shield configuration, we have defined a single role, admin, along with a user called "example" with a password of "password", and assigned that user to the admin role (you'll obviously want a more secure configuration than this!):
$ cat shield/roles.yml
admin:
  cluster: all
  indices:
    '*':
      privileges: all

$ cat shield/users
example:$2a$10$ppZqjFEXgVE3yT/yQPsp4etGMdF4.RFCS9OOGwZGAp0l3lPh4/ALC

$ cat shield/users_roles
admin:example
Note: In this example we are using a password hash generated with the esusers Shield utility.
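If you'd rather not manage the hash files by hand, one possible approach (a sketch, assuming the elasticsearch-shield image built above) is to run the esusers tool in a one-off container against the mounted configuration directory:

$ docker run --rm -v "$PWD/config":/usr/share/elasticsearch/config \
    elasticsearch-shield bin/shield/esusers useradd example -p password -r admin

This writes the hashed entries into config/shield/users and config/shield/users_roles on the host.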
We then start the container, passing in the volume for our configuration and exposing the REST port:
$ docker run -d -p 9200:9200 -v "$PWD/config":/usr/share/elasticsearch/config elasticsearch-shield
Elasticsearch should deny unauthenticated requests and permit access with the credentials defined earlier (in this example it is assumed that Docker is exposing ports on localhost). Note that SSL has not been enabled yet, so these requests travel over plain HTTP:
$ curl -I -XGET http://localhost:9200/_cluster/health
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="shield"
Content-Type: application/json; charset=UTF-8
Content-Length: 389

$ curl -I -XGET -u example:password http://localhost:9200/_cluster/health
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 389
Adding SSL/TLS
Like mounting Shield configuration files, SSL and TLS certificates can be managed similarly. Most of the steps outlined in the Shield SSL/TLS guide can be followed, bearing in mind that the CONFIG_DIR directory is the path that we will be mounting into the container at runtime. The full extent of CA and certificate management is outside the scope of this tutorial, so we will assume here that you are using a correctly configured Java keystore file, referred to here as node01.jks.
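If you need a throwaway keystore for local experimentation, one could be generated with the JDK's keytool; a minimal sketch in which the alias, passwords, and distinguished name are all placeholder values:

$ keytool -genkeypair -alias node01 -keystore config/node01.jks \
    -keyalg RSA -keysize 2048 -validity 365 \
    -storepass password -keypass password \
    -dname "CN=localhost"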
Exposing the keystore file is simply a matter of including it within the configuration directory that is mounted into the container.
$ tree config
config
├── elasticsearch.yml
├── logging.yml
├── node01.jks
├── scripts
└── shield
    ├── roles.yml
    ├── users
    └── users_roles

$ file config/node01.jks
config/node01.jks: Java KeyStore
Following the Shield user guide, we enable both transport and HTTP encryption:
$ cat config/elasticsearch.yml
network.host: 0.0.0.0
shield.ssl.keystore.path: /usr/share/elasticsearch/config/node01.jks
shield.ssl.keystore.password: password
shield.transport.ssl: true
shield.http.ssl: true
Note: In production, you may want tighter control over your keystore; in this example, we have only locked it with a generic password.
With the keystore file in place and SSL enabled, we can start the node and issue requests over HTTPS (in this case, passing -k to curl to bypass verification of the self-signed certificate):
$ docker run -d -p 9200:9200 -v "$PWD/config":/usr/share/elasticsearch/config elasticsearch-shield
87e51d000cc11d63fbedb8a61d58ab1723f4a598b13614272a3b9d7f36a7b223

$ curl -I -XGET -k -u example:password https://localhost:9200/_cluster/health
HTTP/1.1 200 OK
Content-Type: text/plain; charset=UTF-8
Content-Length: 0
When it comes time to distribute keystore files across the cluster, they could potentially be managed by a configuration management module or a similar approach.
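As a rough illustration of that idea (the hostnames and destination path here are hypothetical), even a simple shell loop can stand in for a full configuration management run:

$ for node in es-node01 es-node02 es-node03; do
>   scp "keystores/${node}.jks" "${node}:/srv/elasticsearch/config/"
> done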
Note: Pay close attention to the values you pass to the -ext option when generating certificates for nodes that will run within Docker. DNS names and IP addresses should correctly reflect the hostname or IP address that nodes will use to communicate with one another.
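For example, a certificate for a node reachable as es-node01 at 172.17.0.2 (both placeholder values) might be generated with a subject alternative name along these lines:

$ keytool -genkeypair -alias node01 -keystore node01.jks \
    -keyalg RSA -storepass password -keypass password \
    -dname "CN=es-node01" \
    -ext "SAN=dns:es-node01,ip:172.17.0.2"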
Summary
Although we've given a few concrete examples in this blog post, every deployment is different, and you should tailor your setup according to whatever promotes reliability, repeatability, and security in your environment. Generally speaking, following these guidelines should aid in a good plugin management scheme:
- Define your most generic steps in Docker images. By installing plugins at image build time and running containers from those images, you avoid re-running the same installation commands over and over.
- Maintain persistent data within volume mounts. Storing your index data and plugin configuration separately from the container image ensures that your data is controlled and containers remain ephemeral.
- Test! Before deploying any production infrastructure, ensure that Elasticsearch behaves as expected within Docker, especially with regard to network communication (including unicast and IP address binding behavior), JVM resource allocation, and plugin functionality.