Pod PreStop hook

edit

When an Elasticsearch Pod is terminated, its Endpoint is removed from the Service and the Elasticsearch process is terminated. As these two operations happen in parallel, a race condition exists. If the Elasticsearch process is already shut down, but the Endpoint is still a part of the Service, any new connection might fail. For more information, check Termination of pods.

Moreover, kube-proxy resynchronizes its rules every 30 seconds by default. During that time window of 30 seconds, the terminating Pod IP may still be used when targeting the service. Please note the resync operation itself may take some time, especially if kube-proxy is configured to use iptables with a lot of services and rules to apply.

To address this issue and minimize unavailability, ECK relies on a PreStop lifecycle hook. It waits for an additional PRE_STOP_ADDITIONAL_WAIT_SECONDS (defaulting to 50). The additional wait time is used to:

  1. Give time to in-flight requests to be completed.
  2. Give clients time to use the terminating Pod IP resolved just before DNS record was updated.
  3. Give kube-proxy time to refresh ipvs or iptables rules on all nodes, depending on its sync period setting.

The exact behavior is configurable using an environment variable, for example:

spec:
  version: 8.16.1
  nodeSets:
    - name: default
      count: 1
      podTemplate:
        spec:
          containers:
          - name: elasticsearch
            env:
            - name: PRE_STOP_ADDITIONAL_WAIT_SECONDS
              value: "5"