WARNING: Version 2.0 has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Reference
This part of the documentation explains the core functionality of elasticsearch-hadoop, starting with the configuration options and architecture and gradually moving through the major features. We recommend reading the entire documentation, even superficially, when trying out elasticsearch-hadoop for the first time; those in a rush, however, can jump directly to the desired section:
- Architecture: overview of the elasticsearch-hadoop architecture and how it maps on top of Map/Reduce.
- Configuration: explores the various configuration switches in elasticsearch-hadoop.
- Map/Reduce integration: describes how to use elasticsearch-hadoop in vanilla Map/Reduce environments; typically useful for those interested in loading data from and saving data to Elasticsearch with little, if any, ETL (extract-transform-load).
- Cascading support: describes how to use Cascading with elasticsearch-hadoop.
- Apache Hive integration: Hive users should refer to this section.
- Apache Pig support: how-to on using Elasticsearch in Pig scripts through elasticsearch-hadoop.
- Apache Spark support: describes how to use Apache Spark with Elasticsearch through elasticsearch-hadoop.
- Mapping and Types: deep dive into the strategies employed by elasticsearch-hadoop for type conversion and mapping to and from Elasticsearch.
- Hadoop Metrics: describes the elasticsearch-hadoop metrics.
- Troubleshooting: tips on troubleshooting and getting help.