Elasticsearch 5.0.0-alpha3 released
Today we are excited to announce the release of Elasticsearch 5.0.0-alpha3 based on Lucene 6.0.0. This is the third in a series of pre-5.0.0 releases designed to let you test your application with the features and changes coming in 5.0.0, and to give us feedback about any problems that you encounter.
Open a bug report today and become an Elastic Pioneer.
IMPORTANT: This is an alpha release and is intended for testing purposes only. Indices created in this version will not be compatible with Elasticsearch 5.0.0 GA. Upgrading 5.0.0-alpha3 to any other version is not supported.
DO NOT DEPLOY IN PRODUCTION
Elasticsearch 5.0.0-alpha3 is close to being feature complete, and builds on the work released in 5.0.0-alpha2. There are many small changes which you can read about in the release notes above, but some of the more interesting ones are mentioned below.
Also take a look at the release announcements for Elasticsearch 5.0.0-alpha2 and Elasticsearch 5.0.0-alpha1 to read about features like:
- Ingest Node
- Painless Scripting
- Instant Aggregations
- Text/Keyword fields replacing String
- Completion Suggester v2
- Settings Validation
- Safety in Production
- Resiliency Improvements
- Percolate Query
- Deleted Index Tombstones
The Elasticsearch Migration Helper is a site plugin designed to help you to prepare for your migration from Elasticsearch 2.3.x to Elasticsearch 5.0. It comes with three tools:
- Cluster Checkup
  - Runs a series of checks on your cluster, nodes, and indices and alerts you to any known problems that need to be rectified before upgrading.
- Reindex Helper
  - Indices created before v2.0.0 need to be reindexed before they can be used in Elasticsearch 5.x. The reindex helper upgrades old indices at the click of a button.
- Deprecation Logging
  - Elasticsearch comes with a deprecation logger which will log a message whenever deprecated functionality is used. This tool enables or disables deprecation logging on your cluster.
Instructions for installing the Elasticsearch Migration Helper.
Command line settings and system property settings have been refactored to reduce ambiguity. This is a small change which will impact every Elasticsearch installation, but has an easy migration path. Command line settings are now passed with the -E parameter as follows:
./bin/elasticsearch -Epath.data=/path/to/data
System properties such as JVM heap size settings can be specified in the config/jvm.options file or via the ES_JAVA_OPTS environment variable:
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch
System properties can no longer be passed with the -D command line syntax.
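As a rough illustration of these conventions, the following Python sketch assembles a launch command that passes settings with -E and JVM options through ES_JAVA_OPTS. The helper name build_launch and the es_home parameter are hypothetical; only the -E syntax and the ES_JAVA_OPTS variable come from this announcement.

```python
import os


def build_launch(settings, jvm_opts, es_home="."):
    """Return (argv, env) for starting a node with the 5.0 conventions.

    build_launch and es_home are illustrative names, not part of Elasticsearch.
    """
    # Each setting becomes one -Ekey=value argument.
    argv = [os.path.join(es_home, "bin", "elasticsearch")]
    argv += ["-E{}={}".format(key, value) for key, value in sorted(settings.items())]

    # JVM options travel in ES_JAVA_OPTS rather than as -D arguments.
    env = dict(os.environ)
    env["ES_JAVA_OPTS"] = " ".join(jvm_opts)
    return argv, env


argv, env = build_launch({"path.data": "/path/to/data"}, ["-Xms2g", "-Xmx2g"])
print(argv)
print(env["ES_JAVA_OPTS"])
```

The resulting pair could then be handed to subprocess.Popen(argv, env=env) from a wrapper script.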
A number of system checks are performed during node startup. Any failing checks will produce a warning when in development mode (bound to localhost) or a hard exception when in production mode (bound to a network interface other than localhost). These failing checks should be fixed before going into production, for the good of your cluster. Sometimes, however, you need to bind to a public interface while still in development, so we have added an escape hatch setting which allows you to downgrade these exceptions to warnings. Just set bootstrap.ignore_system_bootstrap_checks to true.
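The decision described above can be pictured with a small Python sketch. It is a toy model, not Elasticsearch's implementation: the names enforce_checks and BootstrapCheckError are hypothetical, and treating 127.0.0.1 and ::1 as equivalent to localhost is an assumption made for the example.

```python
import logging


class BootstrapCheckError(RuntimeError):
    """Raised when a failing check must stop node startup (hypothetical name)."""


def enforce_checks(failures, bind_host, ignore_checks=False,
                   log=logging.getLogger("bootstrap")):
    """Warn in development mode, raise in production mode.

    failures: descriptions of the checks that failed.
    bind_host: address the node binds to.
    ignore_checks: stands in for bootstrap.ignore_system_bootstrap_checks.
    Returns True if startup may proceed.
    """
    if not failures:
        return True
    # Assumption: loopback addresses count as "bound to localhost".
    development = bind_host in ("localhost", "127.0.0.1", "::1")
    if development or ignore_checks:
        for failure in failures:
            log.warning("bootstrap check failed: %s", failure)
        return True
    raise BootstrapCheckError("; ".join(failures))
```

With a failing check, enforce_checks(["example check"], "localhost") only logs a warning, while the same call with a non-loopback bind_host raises unless ignore_checks is set.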
A change to reduce contention in the lock that prevents concurrent updates to the same document provides a 15-20% performance boost when indexing small documents. Another change, reducing the locking required while fsyncing the transaction log, provides another performance boost, especially when running on machines with lots of CPUs but with spinning disks.
The new enabled-by-default scripting language, Painless, has received a huge number of improvements:
- Dynamic scripting is now almost as fast as the much uglier static scripting.
- Single quotes can be used to make scripts easier to read.
- Scripts now look more like scripts in other languages, without having to use the special input variable.
- Exceptions provide context and more useful info for debugging problems.
- The syntax has been greatly expanded and supports more Java functions, as well as supporting geo-point fields.
Many more improvements will be made to Painless before GA.
In Elasticsearch 2.0 we removed support for dots in field names because of the ambiguity between dotted fields and object fields. In 5.0, we have resolved this ambiguity and treat dots in field names as though they were an object. In other words, the following two documents are treated as though they were the same:
PUT my_index/my_type/1
{
  "aaa.bbb.ccc": "some_val",
  "aaa.ddd": "other_val"
}

PUT my_index/my_type/2
{
  "aaa": {
    "bbb": {
      "ccc": "some_val"
    },
    "ddd": "other_val"
  }
}
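To make the equivalence concrete, here is a minimal Python sketch that expands dotted keys into nested objects and confirms that both documents end up with the same shape. The function names are hypothetical and this only illustrates the idea; it says nothing about how Elasticsearch implements it internally, and it does not handle arrays of objects.

```python
def _merge(target, source):
    """Recursively merge source into target, rejecting conflicting fields."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge(target[key], value)
        elif key in target:
            raise ValueError("conflicting definitions for field %r" % key)
        else:
            target[key] = value


def expand_dots(doc):
    """Return a copy of doc in which every dotted key is a nested object."""
    result = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            value = expand_dots(value)
        # Wrap the value in one dict per path segment, innermost first.
        for part in reversed(key.split(".")):
            value = {part: value}
        _merge(result, value)
    return result


dotted = {"aaa.bbb.ccc": "some_val", "aaa.ddd": "other_val"}
nested = {"aaa": {"bbb": {"ccc": "some_val"}, "ddd": "other_val"}}
print(expand_dots(dotted) == expand_dots(nested))
```

A field that appears both as a plain value and as an object (for example "aaa" and "aaa.bbb" in the same document) raises ValueError in this sketch.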
Figuring out why a shard is unassigned can be tricky. You might need to look at the cluster-reroute API, the shard-stores API, the recovery API, index and cluster settings, and the logs. Now, we have a single API which pulls all of this information together: the Cluster Allocation Explain API.
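As a sketch of how a client might ask about one specific shard, the following Python builds (but does not send) a request to the allocation explain endpoint. The endpoint path and the index, shard, and primary body fields follow the API's reference documentation as we understand it; the helper name, the base URL, and the index name are assumptions for illustration.

```python
import json
import urllib.request


def allocation_explain_request(index, shard, primary,
                               base_url="http://localhost:9200"):
    """Build a request asking why one shard copy is (un)assigned.

    base_url is an assumed local node address.
    """
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        base_url + "/_cluster/allocation/explain",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


request = allocation_explain_request("my_index", 0, True)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a running node, urllib.request.urlopen(request) would send it and return the JSON explanation.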
Sometimes shards fail to be allocated because of some structural problem, like a missing stopwords or synonyms file. Previously, Elasticsearch would keep trying to allocate the shard forever, resulting in a torrent of log messages which would only stop once the disk filled up. Now we try to allocate a shard a maximum of 5 times before giving up. Once the problem has been diagnosed and fixed, shard allocation can be triggered with the Cluster Reroute API.
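The bounded-retry behaviour can be pictured with a toy Python model. It is only a sketch of the idea, not Elasticsearch's allocation code: the class and method names are hypothetical, and manual_retry stands in for the operator triggering allocation again once the problem is fixed.

```python
MAX_RETRIES = 5  # limit stated in this announcement


class ShardAllocator:
    """Toy model of bounded allocation retries (hypothetical names)."""

    def __init__(self, allocate, max_retries=MAX_RETRIES):
        self.allocate = allocate          # callable returning True on success
        self.max_retries = max_retries
        self.failed_attempts = 0

    def try_allocate(self):
        """Attempt allocation unless the retry budget is already spent."""
        if self.failed_attempts >= self.max_retries:
            return False  # given up; wait for a manual retry
        if self.allocate():
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        return False

    def manual_retry(self):
        """Reset the budget after the underlying problem has been fixed."""
        self.failed_attempts = 0
        return self.try_allocate()


attempts = []


def broken_allocation():
    attempts.append(1)  # record that an attempt was made
    return False


allocator = ShardAllocator(broken_allocation)
for _ in range(20):
    allocator.try_allocate()
print(len(attempts))
```

However many times the loop asks, the failing allocation is only attempted until the budget runs out, so the log no longer grows without bound.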
- Delete-by-query has been moved back to core, and is now implemented on top of the Reindex infrastructure.
- Snapshot/Restore can now use the Google Cloud Storage repository plugin.
- HTTP compression is enabled by default, although clients still have to specify that they want to use compression.
- The nodes-stats API once again returns I/O statistics on Linux.
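Since clients still have to opt in to HTTP compression, here is a minimal Python sketch of what that opt-in could look like with the standard library. The helper names are hypothetical and the node address is an assumed local one; urllib does not decompress responses on its own, so the body is decoded explicitly.

```python
import gzip
import urllib.request


def compressed_get(url):
    """Build a GET request that advertises gzip support to the server."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def read_body(raw, content_encoding):
    """Return the response body, decompressing only if gzip was used."""
    if content_encoding == "gzip":
        return gzip.decompress(raw)
    return raw


# Assumed local node address, for illustration only.
request = compressed_get("http://localhost:9200/_nodes/stats")
print(request.header_items())
```

Against a running node, the response from urllib.request.urlopen(request) could be decoded with read_body(response.read(), response.headers.get("Content-Encoding")).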
Please download Elasticsearch 5.0.0-alpha3, try it out, and let us know what you think on Twitter (@elastic) or in our forum. You can report any problems on the GitHub issues page.