Requirements

Before using Elasticsearch on YARN, please review the requirements below. Ignoring them can lead to abnormal behavior, errors, and ultimately a poor experience and data loss.

When checking the version of an artifact, make sure to verify it on every node in the cluster.

YARN

A YARN environment running on Hadoop 2.4 (or higher) is recommended. This can be easily checked by verifying the Hadoop version installed on the target nodes:

$ hadoop version

Hadoop 2.4.1
Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1604318
Compiled by jenkins on 2014-06-21T05:43Z
Compiled with protoc 2.5.0
From source with checksum bb7ac0a3c73dc131f4844b873c74b630
This command was run using /opt/share/hadoop/common/hadoop-common-2.4.1.jar
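
The check above can also be scripted. The following is a minimal sketch, assuming the first line of the `hadoop version` output has the form `Hadoop <major>.<minor>...` as in the sample above. The function name `hadoop_version_ok` is a hypothetical helper for illustration and is not part of Hadoop.

```shell
#!/bin/sh
# hadoop_version_ok: hypothetical helper, not shipped with Hadoop.
# Takes the text printed by `hadoop version` as its only argument.
# Returns 0 if the reported version is 2.4 or higher, 1 if it is lower,
# and 2 if no numeric version could be parsed.
hadoop_version_ok() {
    # The first line looks like "Hadoop 2.4.1"; keep the second field.
    version=$(printf '%s\n' "$1" | head -n 1 | awk '{print $2}')
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # text up to the next dot

    # Refuse anything that is not a plain number.
    case "$major" in ''|*[!0-9]*) return 2 ;; esac
    case "$minor" in ''|*[!0-9]*) return 2 ;; esac

    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 4 ]; }
}

# Example use on a node where the hadoop command is on the PATH:
#   hadoop_version_ok "$(hadoop version)" || echo "Hadoop 2.4 or higher not found"
```

Since every node in the cluster should be verified, a check like this would need to be run on each node in turn rather than only on the machine used for submission.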

For Hadoop distributions, check the underlying YARN/Hadoop version and make sure it is compatible with 2.4.

As a guide, the table below lists the Hadoop-based distributions that include YARN against which this version has been tested at various points in time:

Distribution       Release

Apache Hadoop      2.7.x
Apache Hadoop      2.6.x
Apache Hadoop      2.5.x
Apache Hadoop      2.4.x
Amazon EMR         4.2.x
Amazon EMR         4.1.x
Amazon EMR         4.0.x
Amazon EMR         3.8.x
Amazon EMR         3.7.x
Amazon EMR         3.6.x
Amazon EMR         3.5.x
Amazon EMR         3.4.x
Amazon EMR         3.3.x
Amazon EMR         3.2.x
Amazon EMR         3.1.x
Cloudera CDH       5.5.x
Cloudera CDH       5.4.x
Cloudera CDH       5.3.x
Cloudera CDH       5.2.x
Cloudera CDH       5.1.x
Cloudera CDH       5.0.x
Hortonworks HDP    2.3.x
Hortonworks HDP    2.2.x
Hortonworks HDP    2.1.x
MapR               5.x
MapR               4.1.x
MapR               4.0.x

Elasticsearch

Elasticsearch on YARN has the same Elasticsearch requirements as elasticsearch-hadoop. In other words, using the latest stable Elasticsearch release is highly recommended for both stability and performance reasons.