IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Configuration Properties

Once the plugin is installed, define the configuration for the `hdfs` repository through the REST API:

PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}

The following settings are supported:

`uri`	The URI of the HDFS filesystem, for example `hdfs://<host>:<port>/`. (Required)
`path`	The path within the filesystem where data is stored and loaded, for example `path/to/file`. (Required)
`load_defaults`	Whether to load the default Hadoop configuration. (Enabled by default)
`conf.<key>`	Inlined configuration parameter to add to the Hadoop configuration. (Optional) The plugin recognizes only client-oriented properties from the Hadoop core and hdfs configuration files.
`compress`	Whether to compress the metadata. (Enabled by default)
`max_restore_bytes_per_sec`	Throttles the per-node restore rate. Defaults to unlimited. Note that restores are also throttled through recovery settings.
`max_snapshot_bytes_per_sec`	Throttles the per-node snapshot rate. Defaults to `40mb` per second. Note that if the recovery settings for managed services are set, then it defaults to unlimited, and the rate is additionally throttled through recovery settings.
`readonly`	Makes the repository read-only. Defaults to `false`.
`chunk_size`	Overrides the chunk size. (Disabled by default)
`security.principal`	Kerberos principal to use when connecting to a secured HDFS cluster. If you are using a service principal for your Elasticsearch node, you may use the `_HOST` pattern in the principal name, and the plugin will replace the pattern with the hostname of the node at runtime (see Creating the Secure Repository).
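
To show how these settings combine into a single request body, here is a minimal sketch in Python. The helper name `build_hdfs_repository_body` and the principal value `elasticsearch/_HOST@REALM` are hypothetical, chosen only for illustration; the sketch only assembles and prints the JSON body and does not contact a cluster.

```python
import json

# Settings marked "(Required)" in the list above.
REQUIRED = ("uri", "path")


def build_hdfs_repository_body(settings):
    """Return a body for PUT _snapshot/<name>, rejecting missing required settings.

    Hypothetical helper for illustration; it is not part of the plugin.
    """
    missing = [key for key in REQUIRED if not settings.get(key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return {"type": "hdfs", "settings": dict(settings)}


body = build_hdfs_repository_body({
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true",
    "readonly": True,
    # Assumed example principal; _HOST is replaced by the plugin at runtime.
    "security.principal": "elasticsearch/_HOST@REALM",
})
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the `PUT _snapshot/my_hdfs_repository` request shown earlier.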

A Note on HDFS Availability

When you initialize a repository, its settings are persisted in the cluster state. When a node comes online, it attempts to initialize every repository for which it has settings. If your cluster has an HDFS repository configured, all nodes in the cluster must be able to reach HDFS when they start. A node that cannot reach HDFS fails to initialize the repository at startup, and the repository is unusable. If this happens, remove and re-add the repository, or restart the offending node.
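
One way to catch this before starting a node is a quick reachability check against the host and port in the repository `uri`. The following Python sketch is an illustration under stated assumptions, not part of the plugin: the function names are hypothetical, the fallback port `8020` is an assumed default for when the URI omits one, and a successful TCP connection shows only that the port is open, not that HDFS authentication or the repository path will work.

```python
import socket
from urllib.parse import urlsplit


def namenode_address(uri, default_port=8020):
    """Extract (host, port) from an hdfs:// URI.

    default_port is an assumption used only when the URI has no port.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs" or not parts.hostname:
        raise ValueError("expected a URI of the form hdfs://<host>:<port>/")
    return parts.hostname, parts.port or default_port


def hdfs_reachable(uri, timeout=3.0):
    """Return True if a TCP connection to the URI's host and port succeeds."""
    host, port = namenode_address(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, calling `hdfs_reachable("hdfs://namenode:8020/")` on each node before startup gives an early warning when the address in the repository settings cannot be reached from that node.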
