Elasticsearch for Apache Hadoop 2.0.3
See issues on GitHub
Release Notes
Bug fixes
- JSON value extraction fails with mixed nested objects #455
- Enforce time zone for index formatting when none is specified #435
- Failed unit test DateIndexFormatterTest #433
- HeartBeat vs. mapreduce.task.timeout doesn't consider "0 == infinite" case #426
- Elasticsearch and Hive integration on YARN #393
- Duplicate documents returned on alias scan #363
- Dynamic es.resource.write fails to find nested field #362
- Elasticsearch Hive integration issues #359
- When using nested objects with MR the returned array does not use the correct type #342
- java.util.Date cannot be cast to org.apache.hadoop.io.Writable #340
- java.lang.UnsupportedOperationException caused by org.elasticsearch.hadoop.mr.EsInputFormat? #338
- Group by error #331
- Fix writable serialization of date type with long values #320
- JSON serialization error #311
- Missing commons-cli dependency? #288
- Document Count in ES Different from Number of Entries Pushed #283
- Support external versioning of documents #343
- Can one increase the number of partitions and hence Spark nodes used? #339
- Fix nested type serialization #327
- java.io.NotSerializableException: org.apache.spark.SparkContext #298
- TaskAttemptId string is not properly formed #346
- Spark SQL can't INSERT into an ES table #330
- Hive column comments causing Hive query to fail #322
Docs
- Incorrect parameter names es.update.params and es.update.params.json in config examples #430
- Update SOCKS proxy configuration in configuration.adoc #419
- Update Spark doc for new API InputFormat #401 (issue: #390)
- Correcting Scala code #389
- Better document the date formatting feature for dynamic writing #360
- Update index.adoc #318
- It is not clear if the plugin must be installed on every node #306
- Documentation on using raw JSON in Pig #299
- Improve Pig file size to increase parallelism according to shard size #294