Installation

elasticsearch-hadoop binaries can be obtained either by downloading them from the elastic.co site as a ZIP (containing project jars, sources and documentation) or by using any Maven-compatible tool with the following dependency:

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop</artifactId>
  <version>2.0.3</version>
</dependency>

The jar above contains all the features of elasticsearch-hadoop and does not require any other dependencies at runtime; in other words it can be used as is.

The elasticsearch-hadoop binary works unchanged in both Hadoop 1.x and Hadoop 2.x (also known as YARN) environments.

Minimalistic binaries

In addition to the uber jar, elasticsearch-hadoop provides a minimalistic jar for each integration, intended for those who use just one module; in all other situations the uber jar is recommended. These jars are smaller, and each uses a dedicated pom that covers only the dependencies it needs. They are available under the same groupId, using an artifactId with the pattern elasticsearch-hadoop-{integration}:

Map/Reduce.

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop-mr</artifactId> 
  <version>2.0.3</version>
</dependency>

The artifactId above selects the mr (Map/Reduce) artifact.

Apache Hive.

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop-hive</artifactId> 
  <version>2.0.3</version>
</dependency>

The artifactId above selects the hive artifact.

Apache Pig.

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop-pig</artifactId> 
  <version>2.0.3</version>
</dependency>

The artifactId above selects the pig artifact.

Cascading.

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop-cascading</artifactId> 
  <version>2.0.3</version>
</dependency>

The artifactId above selects the cascading artifact.

Releases are available in the central Maven repository.

Development Builds

Development builds (also called nightly or snapshot builds) are published daily to the sonatype-oss repository (see below). Make sure to use snapshot versioning:

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop</artifactId>
  <version>2.0.4.BUILD-SNAPSHOT</version> 
</dependency>

Note the BUILD-SNAPSHOT suffix in the version, which indicates a development build.

The dedicated snapshots repository must also be enabled:

<repositories>
  <repository>
    <id>sonatype-oss</id>
    <url>http://oss.sonatype.org/content/repositories/snapshots</url> 
    <snapshots><enabled>true</enabled></snapshots> 
  </repository>
</repositories>

The url element adds the snapshot repository.

The snapshots element enables snapshot resolution on that repository; without it, Maven will not find the development builds.
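
Putting the two fragments together, a complete pom.xml for a project that tracks development builds might look like the sketch below. The project coordinates (com.example, es-hadoop-demo, 0.1.0-SNAPSHOT) are hypothetical placeholders chosen for illustration; the dependency and repository entries are the ones shown above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical project coordinates, for illustration only -->
  <groupId>com.example</groupId>
  <artifactId>es-hadoop-demo</artifactId>
  <version>0.1.0-SNAPSHOT</version>

  <dependencies>
    <!-- Development build of the uber jar; note the BUILD-SNAPSHOT suffix -->
    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch-hadoop</artifactId>
      <version>2.0.4.BUILD-SNAPSHOT</version>
    </dependency>
  </dependencies>

  <repositories>
    <!-- Snapshot repository; snapshots must be enabled for resolution to work -->
    <repository>
      <id>sonatype-oss</id>
      <url>http://oss.sonatype.org/content/repositories/snapshots</url>
      <snapshots><enabled>true</enabled></snapshots>
    </repository>
  </repositories>
</project>
```

For a release build, the repositories block is unnecessary, since releases are resolved from the central Maven repository; only the version would change to a release number such as 2.0.3.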