Elasticsearch for Apache Hadoop version 6.0.0-beta1

Tested against Elasticsearch 6.0.0-beta1, this first beta of ES-Hadoop for 6.0.0 includes much-needed fixes and features that keep it compatible with the significant changes landing in Elasticsearch 6.0.

Please note that this is a beta release and that we do not recommend running this in production! See our Breaking Changes in 6.0 page for more information on what you might need to modify.

Breaking Changes

  • Elasticsearch on YARN integration has been removed #1001 #1027
  • Using Hadoop versions 1.x is deprecated as of 5.5.0 #1001
  • ES-Hadoop will support Scala 2.11 by default in 6.0.0 #1001

New Features

Serialization
  • Support Elasticsearch "join" types #1012
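
As a rough illustration of what writing to a join-enabled index can look like, the sketch below builds the connector settings as a plain Python dictionary. It assumes the `es.mapping.join` setting names the document field that holds the join data; check the configuration reference for your version before relying on it. The helper `join_write_settings`, the `qa/doc` resource, and the `relation` field are hypothetical names chosen for this example.

```python
def join_write_settings(resource, id_field, join_field):
    """Build ES-Hadoop write settings for an index that uses a join field.

    The setting names are assumptions based on the ES-Hadoop configuration
    reference; verify them against the version you run.
    """
    return {
        "es.resource": resource,        # target index/type (hypothetical)
        "es.mapping.id": id_field,      # document field used as the _id
        "es.mapping.join": join_field,  # document field holding the join data
    }


# A parent document carries only its relation name.
question = {"id": "1", "text": "How do I index child docs?", "relation": "question"}

# A child document carries its relation name and the id of its parent.
answer = {
    "id": "2",
    "text": "Point the connector at the join field.",
    "relation": {"name": "answer", "parent": "1"},
}

settings = join_write_settings("qa/doc", "id", "relation")
```

The resulting dictionary would then be handed to whichever integration performs the write, for example as the configuration of a Spark job.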

Enhancements

Spark
Scripting
  • Allow users to use file based scripts during updates #918
  • Support script id/file/inline options #538
REST
  • Remove uses of features deprecated in 5.0 #881
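
The scripting items above concern how an update script is referenced. A minimal sketch of selecting one of the three options follows, assuming the settings are named `es.update.script.inline`, `es.update.script.file`, and `es.update.script.stored` (with "stored" corresponding to the script id option); confirm the names in the configuration reference for your version. The helper `update_script_settings` and all index, field, and parameter names are hypothetical.

```python
# Assumed mapping from script kind to ES-Hadoop setting name.
_SCRIPT_KEYS = {
    "inline": "es.update.script.inline",
    "file": "es.update.script.file",
    "stored": "es.update.script.stored",
}


def update_script_settings(resource, id_field, kind, script, lang=None, params=None):
    """Build settings for a scripted update using exactly one script option."""
    if kind not in _SCRIPT_KEYS:
        raise ValueError("kind must be one of: " + ", ".join(sorted(_SCRIPT_KEYS)))
    settings = {
        "es.resource": resource,
        "es.write.operation": "update",  # updates need a document id
        "es.mapping.id": id_field,
        _SCRIPT_KEYS[kind]: script,
    }
    if lang is not None:
        settings["es.update.script.lang"] = lang
    if params is not None:
        # Assumed format: comma-separated "scriptParam:documentField" pairs.
        settings["es.update.script.params"] = params
    return settings


inline = update_script_settings(
    "counters/doc",
    "id",
    "inline",
    "ctx._source.counter += params.increment",
    lang="painless",
    params="increment:delta",
)
```

Keeping the three options mutually exclusive in the helper avoids sending the connector conflicting script references.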

Bug Fixes

Serialization
  • Support nested collections of Java Bean classes in Spark #1021
  • Not able to use kryoserializer for writing data into Elasticsearch #1019
  • JacksonJsonGenerator.getParentPath will always be empty #1006
Pig
  • Using es.mapping.exclude/include and still getting StrictDynamicMappingException on excluded fields #1015
Spark
  • Spark JavaRDD give partial documents #946
REST
  • Remove unmatched format specifier from RestService #1029

Documentation

  • Unclear Docs (about multi-resource writes) #990