Using ES|QL across clusters

Cross-cluster search for ES|QL is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

With ES|QL, you can execute a single query across multiple clusters.

Prerequisites

  • Cross-cluster search requires remote clusters. To set up remote clusters on Elasticsearch Service, see configure remote clusters on Elasticsearch Service. If you run Elasticsearch on your own hardware, see Remote clusters.

    To ensure your remote cluster configuration supports cross-cluster search, see Supported cross-cluster search configurations.

  • For full cross-cluster search capabilities, the local and remote clusters must be on the same subscription level.
  • The local coordinating node must have the remote_cluster_client node role.
  • If you use sniff mode, the local coordinating node must be able to connect to seed and gateway nodes on the remote cluster.

    We recommend using gateway nodes capable of serving as coordinating nodes. The seed nodes can be a subset of these gateway nodes.

  • If you use proxy mode, the local coordinating node must be able to connect to the configured proxy_address. The proxy at this address must be able to route connections to gateway and coordinating nodes on the remote cluster.
  • Cross-cluster search requires different security privileges on the local cluster and remote cluster. See Configure privileges for cross-cluster search and Remote clusters.

Remote cluster setup

The following cluster update settings API request adds three remote clusters: cluster_one, cluster_two, and cluster_three.

response = client.cluster.put_settings(
  body: {
    persistent: {
      cluster: {
        remote: {
          cluster_one: {
            seeds: [
              '35.238.149.1:9300'
            ],
            skip_unavailable: true
          },
          cluster_two: {
            seeds: [
              '35.238.149.2:9300'
            ],
            skip_unavailable: false
          },
          cluster_three: {
            seeds: [
              '35.238.149.3:9300'
            ]
          }
        }
      }
    }
  }
)
puts response

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": {
          "seeds": [
            "35.238.149.1:9300"
          ],
          "skip_unavailable": true
        },
        "cluster_two": {
          "seeds": [
            "35.238.149.2:9300"
          ],
          "skip_unavailable": false
        },
        "cluster_three": {
          "seeds": [
            "35.238.149.3:9300"
          ]
        }
      }
    }
  }
}

Since skip_unavailable was not set on cluster_three, it uses the default of false. See the Optional remote clusters section for details.
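To make the default explicit, the following minimal Ruby sketch reports the effective skip_unavailable value for each remote cluster in a settings hash shaped like the request body above. The helper name effective_skip_unavailable and the constant are hypothetical and are not part of the Elasticsearch Ruby client. The default of false is taken from the note above.

```ruby
# Hypothetical helper, not part of the Elasticsearch Ruby client.
# Default value as stated in the text above.
DEFAULT_SKIP_UNAVAILABLE = false

# Returns a hash of remote cluster name => effective skip_unavailable value
# for a settings hash shaped like the cluster update settings request body.
def effective_skip_unavailable(settings)
  remotes = settings.dig(:persistent, :cluster, :remote) || {}
  remotes.to_h do |name, config|
    [name, config.fetch(:skip_unavailable, DEFAULT_SKIP_UNAVAILABLE)]
  end
end

settings = {
  persistent: {
    cluster: {
      remote: {
        cluster_one:   { seeds: ['35.238.149.1:9300'], skip_unavailable: true },
        cluster_two:   { seeds: ['35.238.149.2:9300'], skip_unavailable: false },
        cluster_three: { seeds: ['35.238.149.3:9300'] } # no explicit value
      }
    }
  }
}

effective_skip_unavailable(settings).each do |name, value|
  puts "#{name}: #{value}"
end
```

The sketch only inspects the hash you pass in. It does not read the live cluster settings.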

Query across multiple clusters

In the FROM command, specify data streams and indices on remote clusters using the format <remote_cluster_name>:<target>. For instance, the following ES|QL request queries the my-index-000001 index on a single remote cluster named cluster_one:

FROM cluster_one:my-index-000001
| LIMIT 10

Similarly, this ES|QL request queries the my-index-000001 index from three clusters:

  • The local ("querying") cluster
  • Two remote clusters, cluster_one and cluster_two

FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| LIMIT 10

Likewise, this ES|QL request queries the my-index-000001 index from all remote clusters (cluster_one, cluster_two, and cluster_three):

FROM *:my-index-000001
| LIMIT 10
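If you build queries programmatically, the <remote_cluster_name>:<target> format can be assembled with a small helper. The following is a minimal Ruby sketch. The method names qualified_target and from_clause are hypothetical, and the sketch only builds the query string. It does not send it to Elasticsearch.

```ruby
# Hypothetical helpers for building the FROM command of a cross-cluster query.

# Returns "<remote_cluster_name>:<target>", or just "<target>" for the
# local (querying) cluster when cluster is nil.
def qualified_target(target, cluster = nil)
  cluster ? "#{cluster}:#{target}" : target
end

# Builds a FROM command for one target on several clusters.
# Use nil for the local cluster and "*" for all remote clusters.
def from_clause(target, clusters)
  "FROM " + clusters.map { |c| qualified_target(target, c) }.join(",")
end

query = [
  from_clause("my-index-000001", [nil, "cluster_one", "cluster_two"]),
  "| LIMIT 10"
].join("\n")
puts query
```

Passing ["*"] as the cluster list produces the FROM *:my-index-000001 form shown above.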

Enrich across clusters

Enrich in ES|QL across clusters operates similarly to local enrich. If the enrich policy and its enrich indices are consistent across all clusters, simply write the enrich command as you would without remote clusters. In this default mode, ES|QL can execute the enrich command on either the querying cluster or the fulfilling clusters, aiming to minimize computation or inter-cluster data transfer. Ensuring that the policy exists with consistent data on both the querying cluster and the fulfilling clusters is critical for ES|QL to produce a consistent query result.

In the following example, the enrich with hosts policy can be executed on either the querying cluster or the remote cluster cluster_one.

FROM my-index-000001,cluster_one:my-index-000001
| ENRICH hosts ON ip
| LIMIT 10

When an ES|QL query targets only remote clusters, the enrich can still be executed on the querying cluster. The following query therefore requires the hosts enrich policy to exist on the querying cluster as well.

FROM cluster_one:my-index-000001,cluster_two:my-index-000001
| LIMIT 10
| ENRICH hosts ON ip

Enrich with coordinator mode

ES|QL provides the enrich _coordinator mode to force ES|QL to execute the enrich command on the querying cluster. Use this mode when the enrich policy is not available on the remote clusters, or when maintaining consistency of enrich indices across clusters is challenging.

FROM my-index-000001,cluster_one:my-index-000001
| ENRICH _coordinator:hosts ON ip
| SORT host_name
| LIMIT 10

Enrich with the _coordinator mode usually increases inter-cluster data transfer and workload on the querying cluster.

Enrich with remote mode

ES|QL also provides the enrich _remote mode to force ES|QL to execute the enrich command independently on each fulfilling cluster where the target indices reside. This mode is useful for managing different enrich data on each cluster, for example when each regional cluster holds detailed information about its own hosts and its target (main) indices contain log events from those hosts.

In the following example, the hosts enrich policy must exist on all fulfilling clusters: the querying cluster (because local indices are included) and the remote clusters cluster_one and cluster_two.

FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH _remote:hosts ON ip
| SORT host_name
| LIMIT 10
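The policy requirements of the three enrich modes described so far can be summarized in a short Ruby sketch. All names here (LOCAL, fulfilling_clusters, clusters_requiring_policy) are hypothetical. The sketch assumes every remote target is written as <cluster>:<target>, and it does not expand wildcard cluster patterns such as cluster* into concrete cluster names.

```ruby
# Hypothetical helper: list the clusters on which an enrich policy must exist.
LOCAL = "(querying cluster)"

# Extracts the cluster part of each FROM target. Targets without a
# "<cluster>:" prefix belong to the querying cluster.
def fulfilling_clusters(targets)
  targets.map { |t| t.include?(":") ? t.split(":", 2).first : LOCAL }.uniq
end

# mode is :any (default mode), :coordinator, or :remote.
def clusters_requiring_policy(targets, mode = :any)
  case mode
  when :coordinator then [LOCAL]                       # always the querying cluster
  when :remote      then fulfilling_clusters(targets)  # each cluster holding targets
  when :any         then ([LOCAL] + fulfilling_clusters(targets)).uniq
  else raise ArgumentError, "unknown enrich mode: #{mode}"
  end
end

targets = ["my-index-000001", "cluster_one:my-index-000001", "cluster_two:my-index-000001"]
puts clusters_requiring_policy(targets, :remote).join(", ")
```

In the default mode the querying cluster is always included, which matches the earlier example where a query against remote clusters only still needs the policy locally.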

A _remote enrich cannot be executed after a stats command. The following example would result in an error:

FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| STATS COUNT(*) BY ip
| ENRICH _remote:hosts ON ip
| SORT host_name
| LIMIT 10

Multiple enrich commands

You can include multiple enrich commands in the same query with different modes. ES|QL will attempt to execute them accordingly. For example, this query performs two enriches, first with the hosts policy on any cluster and then with the vendors policy on the querying cluster.

FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH hosts ON ip
| ENRICH _coordinator:vendors ON os
| LIMIT 10

A _remote enrich command cannot be executed after a _coordinator enrich command. The following example would result in an error:

FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH _coordinator:hosts ON ip
| ENRICH _remote:vendors ON os
| LIMIT 10
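The two ordering rules for _remote enrich (not after a stats command, not after a _coordinator enrich) can be checked before a query is sent. The following Ruby sketch is a hypothetical pre-flight check, not ES|QL's own validation. It uses simple text matching on pipe-separated commands, so it would misread a query whose string literals contain a | character.

```ruby
# Hypothetical pre-flight check for the _remote enrich ordering rules.
# This is simple text matching, not a real ES|QL parser.
def remote_enrich_ordering_errors(query)
  errors = []
  seen_stats = false
  seen_coordinator_enrich = false

  query.split("|").map(&:strip).each do |command|
    if command.match?(/\ASTATS\b/i)
      seen_stats = true
    elsif command.match?(/\AENRICH\s+_coordinator:/i)
      seen_coordinator_enrich = true
    elsif command.match?(/\AENRICH\s+_remote:/i)
      errors << "_remote enrich after STATS: #{command}" if seen_stats
      errors << "_remote enrich after _coordinator enrich: #{command}" if seen_coordinator_enrich
    end
  end
  errors
end

query = <<~ESQL
  FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
  | ENRICH _coordinator:hosts ON ip
  | ENRICH _remote:vendors ON os
  | LIMIT 10
ESQL
puts remote_enrich_ordering_errors(query)
```

An empty result means only that neither rule was triggered. The query can still fail for other reasons.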

Excluding clusters or indices from an ES|QL query

To exclude an entire cluster, prefix the cluster alias with a minus sign in the FROM command, for example: -my_cluster:*:

FROM my-index-000001,cluster*:my-index-000001,-cluster_three:*
| LIMIT 10

To exclude a specific remote index, prefix the index with a minus sign in the FROM command, such as my_cluster:-my_index:

FROM my-index-000001,cluster*:my-index-*,cluster_three:-my-index-000001
| LIMIT 10
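The two exclusion forms differ only in where the minus sign goes, which is easy to get wrong when building queries in code. The following minimal Ruby sketch shows both. The method names exclude_cluster and exclude_index are hypothetical, and the sketch only builds strings.

```ruby
# Hypothetical helpers for the two exclusion forms described above.

# Excludes an entire cluster: the minus sign goes before the cluster alias.
def exclude_cluster(cluster_alias)
  "-#{cluster_alias}:*"
end

# Excludes one index on a cluster: the minus sign goes before the index.
def exclude_index(cluster_alias, index)
  "#{cluster_alias}:-#{index}"
end

targets = [
  "my-index-000001",
  "cluster*:my-index-000001",
  exclude_cluster("cluster_three")
]
puts "FROM #{targets.join(',')}\n| LIMIT 10"
```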

Optional remote clusters

Cross-cluster search for ES|QL currently does not respect the skip_unavailable setting. As a result, if a remote cluster specified in the request is unavailable or has failed, ES|QL cross-cluster queries fail regardless of the setting.

We are actively working to align the behavior of cross-cluster search for ES|QL with other cross-cluster search APIs. This includes providing detailed execution information for each cluster in the response, such as execution time, selected target indices, and shards.
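Until then, one possible client-side workaround is to drop targets on disconnected remote clusters before building the query. The following Ruby sketch is an assumption-laden illustration, not a documented ES|QL feature. The method name connected_targets is hypothetical. The sketch assumes you pass in a hash shaped like the remote cluster info API response, in which each cluster entry has a connected field. It handles neither wildcard cluster patterns nor exclusions.

```ruby
# Hypothetical client-side workaround: keep only FROM targets whose remote
# cluster is reported as connected. remote_info is assumed to be shaped like
# the remote cluster info API response: { "<alias>" => { "connected" => true } }.
def connected_targets(targets, remote_info)
  targets.select do |target|
    next true unless target.include?(":") # local target, always kept
    cluster = target.split(":", 2).first
    remote_info.dig(cluster, "connected") == true
  end
end

remote_info = {
  "cluster_one" => { "connected" => true },
  "cluster_two" => { "connected" => false }
}
targets = ["my-index-000001", "cluster_one:my-index-000001", "cluster_two:my-index-000001"]
puts "FROM #{connected_targets(targets, remote_info).join(',')}\n| LIMIT 10"
```

Dropping a cluster this way silently changes the result set, and a cluster reported as connected can still fail during the query, so treat this as a stopgap.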

Query across clusters during an upgrade

You can still search a remote cluster while performing a rolling upgrade on the local cluster. However, both the "upgrade from" and "upgrade to" versions of the local coordinating node must be compatible with the remote cluster’s gateway node.

Running multiple versions of Elasticsearch in the same cluster beyond the duration of an upgrade is not supported.

For more information about upgrades, see Upgrading Elasticsearch.