Preface

Elasticsearch for Apache Hadoop is an ‘umbrella’ project consisting of three related yet independent sub-projects, each with its own dedicated section in the documentation:

Elasticsearch on YARN
run Elasticsearch on top of YARN; see the Elasticsearch on YARN section
repository-hdfs
use HDFS as a repository back-end, that is, as storage for snapshot/restore operations from and to Elasticsearch. For more information, refer to its home page
elasticsearch-hadoop proper
interact with Elasticsearch from within a Hadoop environment. If you are using Map/Reduce, Cascading, Hive, Pig, Apache Spark or Apache Storm, this project is for you.

Thus, while all three projects fall under the Hadoop umbrella, each covers a different aspect of it, so be sure to read the appropriate documentation.