Elasticsearch version 6.1.0

Breaking Changes

Network
  • Allow only a fixed-size receive predictor #26165 (issue: #23185)
REST
Scroll
  • Fail queries with scroll that explicitly set request_cache #27342
Search
  • Add a limit to from + size in top_hits and inner hits. #26492 (issue: #11511)
Security
  • The certgen command now returns validation errors when it encounters problems reading from an input file (with the -in command option). Previously these errors might have been ignored or caused the command to abort with unclear messages. For more information, see elasticsearch-certgen.
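
For the Search change above (a limit on from + size in top_hits and inner hits), a client can check paging values before sending a request. This is a minimal sketch: the helper name is hypothetical, and the limit of 100 is an assumption for illustration rather than a value stated in these notes.

```python
def exceeds_inner_window(from_, size, max_window=100):
    """Return True when from + size is larger than the allowed window.

    max_window=100 is an assumed limit used only for this example.
    """
    return from_ + size > max_window


# A top_hits aggregation body whose paging values we want to validate.
top_hits = {"top_hits": {"from": 90, "size": 20}}
params = top_hits["top_hits"]
too_large = exceeds_inner_window(params["from"], params["size"])
```

Requests for which the check returns True would need a smaller from or size, or a different paging approach.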

Breaking Java Changes

Aggregations
  • Moves deferring code into its own subclass #26421
Core
  • Unify Settings xcontent reading and writing #26739
Settings
  • Return List instead of an array from settings #26903
  • Remove Settings#put(Map<String,String>) #26785

Deprecations

Aggregations
  • Deprecate global_ordinals_hash and global_ordinals_low_cardinality #26173 (issue: #26014)
Allocation
  • Add deprecation warning for negative index.unassigned.node_left.delayed_timeout #26832 (issue: #26828)
Analysis
Geo
  • [GEO] 6x Deprecate ShapeBuilders and decouple geojson parse logic #27345
Mapping
  • Deprecate the index_options parameter for numeric fields #26672 (issue: #21475)
Plugin Repository Azure
  • Azure repository: Move to named configurations as we do for S3 repository and secure settings #23405 (issues: #22762, #22763)
Search
  • doc: deprecate _primary and _replica shard option #26792 (issue: #26335)

New Features

Aggregations
  • Aggregations: bucket_sort pipeline aggregation #27152 (issue: #14928)
  • Add composite aggregator #26800
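
As a rough illustration of the two aggregations listed above, the following sketch builds request bodies as plain Python dictionaries. The index fields (product, date, price), the aggregation names, and the helper functions are hypothetical; only the composite and bucket_sort aggregation types come from these notes.

```python
import json


def composite_page(after_key=None, page_size=2):
    """Build a composite aggregation body.

    Passing the key of the last bucket from the previous response as
    `after_key` asks for the next page of buckets.
    """
    composite = {
        "size": page_size,
        "sources": [{"product": {"terms": {"field": "product"}}}],
    }
    if after_key is not None:
        composite["after"] = after_key
    return {"size": 0, "aggs": {"my_buckets": {"composite": composite}}}


def top_months_by_sales(n=3):
    """Build a date_histogram whose buckets are sorted and truncated by bucket_sort."""
    return {
        "size": 0,
        "aggs": {
            "sales_per_month": {
                "date_histogram": {"field": "date", "interval": "month"},
                "aggs": {
                    "total_sales": {"sum": {"field": "price"}},
                    "sales_bucket_sort": {
                        "bucket_sort": {
                            "sort": [{"total_sales": {"order": "desc"}}],
                            "size": n,
                        }
                    },
                },
            }
        },
    }


first_page = composite_page()
next_page = composite_page(after_key={"product": "mad max"})
print(json.dumps(next_page, indent=2))
```

Either dictionary could be sent as the JSON body of a search request.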
Analysis
  • Added Bengali Analyzer to Elasticsearch with respect to the Lucene update #26527
Ingest
Java High Level REST Client
  • Added Delete Index support to high-level REST client #27019 (issue: #25847)
Machine Learning
  • Added the ability to create job forecasts. This feature enables you to use historical behavior to predict the future behavior of your time series. You can create forecasts in Kibana or by using the forecast jobs API.

    You cannot create forecasts for jobs that were created in previous versions; this functionality is available only for jobs created in 6.1 or later.

  • Added overall buckets, which summarize bucket results for multiple jobs. For more information, see the get overall buckets API.
  • Added job groups, which you can use to manage or retrieve information from multiple jobs at once. Also updated many machine learning APIs to support groups and wildcard expressions in the job identifier.
Nested Docs
  • Multi-level Nested Sort with Filters #26395
Query DSL
  • Add terms_set query #27145 (issue: #26915)
  • Introduce sorted_after query for sorted index #26377
  • Add support for auto_generate_synonyms_phrase_query in match_query, multi_match_query, query_string and simple_query_string #26097
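
A terms_set query matches documents that contain at least a per-document minimum number of the supplied terms. The sketch below builds such a query body; the field names codes and required_matches and the helper function are hypothetical.

```python
def terms_set_query(field, terms, min_match_field):
    """Build a terms_set query body.

    `min_match_field` names a numeric field in each document that holds
    the number of terms that must match for that document.
    """
    return {
        "query": {
            "terms_set": {
                field: {
                    "terms": list(terms),
                    "minimum_should_match_field": min_match_field,
                }
            }
        }
    }


body = terms_set_query("codes", ["abc", "def", "ghi"], "required_matches")
```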
Search
Similarities
  • Add a scripted similarity. #25831
Suggesters
  • Expose duplicate removal in the completion suggester #26496 (issue: #23364)
  • Support must and should for context query in context suggester #26407 (issues: #24421, #24565)

Enhancements

Aggregations
  • Allow aggregation sorting via nested aggregation #26683 (issue: #16838)
Allocation
  • Tie-break shard path decision based on total number of shards on path #27039 (issue: #26654)
  • Balance shards for an index more evenly across multiple data paths #26654 (issue: #16763)
  • Expand "NO" decision message in NodeVersionAllocationDecider #26542 (issue: #10403)
  • _reroute’s retry_failed flag should reset failure counter #25888 (issue: #25291)
Analysis
  • Add configurable max_token_length parameter to whitespace tokenizer #26749 (issue: #26643)
CRUD
  • Add wait_for_active_shards parameter to index open command #26682 (issue: #20937)
Core
  • Fix classes that can exit #27518
  • Replace empty index block checks with global block checks in template delete/put actions #27050 (issue: #10530)
  • Allow Uid#decodeId to decode from a byte array slice #26987 (issue: #26931)
  • Use separate searchers for "search visibility" vs "move indexing buffer to disk" #26972 (issues: #15768, #26802, #26912, #3593)
  • Add ability to split shards #26931
  • Make circuit breaker mutations debuggable #26067 (issue: #25891)
Dates
Discovery
  • Stop responding to ping requests before master abdication #27329 (issue: #27328)
Engine
  • Ensure external refreshes will also refresh internal searcher to minimize segment creation #27253 (issue: #26972)
  • Move IndexShard#getWritingBytes() under InternalEngine #27209 (issue: #26972)
  • Refactor internal engine #27082
Geo
  • Add ignore_malformed to geo_shape fields #24654 (issue: #23747)
Ingest
  • add json-processor support for non-map json types #27335 (issue: #25972)
  • Introduce templating support to timezone/locale in DateProcessor #27089 (issue: #24024)
  • Add support for parsing inline script (#23824) #26846 (issue: #23824)
  • Consolidate locale parsing. #26400
  • Accept ingest simulate params as ints or strings #23885 (issue: #23823)
Internal
  • Avoid uid creation in ParsedDocument #27241
  • Upgrade to Lucene 7.1.0 snapshot version #26864 (issue: #26527)
  • Remove _index fielddata hack if cluster alias is present #26082 (issue: #25885)
Java High Level REST Client
  • Adjust RestHighLevelClient method modifiers #27238
  • Decouple BulkProcessor from ThreadPool #26727 (issue: #26028)
Logging
  • Add more information on failed_to_convert exception (#21946) #27034 (issue: #21946)
  • Improve shard-failed log messages. #26866
Machine Learning
  • Improved the way machine learning jobs are allocated to nodes, such that it is primarily determined by the estimated memory requirement of the job. If there is insufficient information about the job’s memory requirements, the allocation decision is based on job counts per node.
  • Increased the default value of the xpack.ml.max_open_jobs setting from 10 to 20. The allocation of jobs to nodes now considers memory usage as well as job counts, so it’s reasonable to permit more small jobs on a single node. For more information, see Machine Learning Settings.
  • Decreased the default model_memory_limit property value to 1 GB for new jobs. If you want to create a job that analyzes high cardinality fields, you can increase this property value. For more information, see Analysis Limits.
  • Improved analytics related to decay rates when predictions are very accurate.
  • Improved analytics related to detecting non-negative quantities and using this information to constrain analysis, predictions, and confidence intervals.
  • Improved periodic trough or spike detection.
  • Improved the speed of the aggregation of machine learning results.
  • Improved probability calculation performance.
  • Expedited bucket processing time in very large populations by determining when there are nearly duplicate values in a bucket and de-duplicating the samples that are added to the model.
  • Improved handling of periodically missing values.
  • Improved analytics related to diurnal periodicity.
  • Reduced memory usage during population analysis by releasing redundant memory after the bucket results are written.
  • Improved modeling of long periodic components, particularly when there is a long bucket span.
Mapping
  • Allow ip_range to accept CIDR notation #27192 (issue: #26260)
  • Deduplicate _field_names. #26550
  • Throw a better error message for empty field names #26543 (issue: #23348)
  • Stricter validation for min/max values for whole numbers #26137
  • Make FieldMapper.copyTo() always non-null. #25994
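
To see which addresses a CIDR value covers in an ip_range field, the bounds can be computed with Python's standard ipaddress module. This sketch assumes the CIDR block is treated as the inclusive range from its first to its last address; the field name allowed_ips and the helper function are hypothetical.

```python
import ipaddress


def cidr_bounds(cidr):
    """Return the first and last address of a CIDR block as strings."""
    network = ipaddress.ip_network(cidr)
    return str(network[0]), str(network[-1])


# A document that stores the range in CIDR notation instead of gte/lte bounds.
doc = {"allowed_ips": "192.168.0.0/16"}
lower, upper = cidr_bounds(doc["allowed_ips"])
```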
Monitoring
  • Added the new interval_ms field to monitoring documents. This field indicates the current collection interval for Elasticsearch or external monitored systems.
Nested Docs
  • Use the primary_term field to identify parent documents #27469 (issue: #24362)
  • Prohibit using nested_filter, nested_path and new nested Option at the same time in FieldSortBuilder #26490 (issue: #17286)
Network
  • Remove manual tracking of registered channels #27445 (issue: #27260)
  • Remove tcp profile from low level nio channel #27441 (issue: #27260)
  • Decouple ChannelFactory from Tcp classes #27286 (issue: #27260)
Percolator
  • Use Lucene’s CoveringQuery to select percolate candidate matches #27271 (issues: #26081, #26307)
  • Add support to percolate query to percolate multiple documents simultaneously #26418
  • Hint what clauses are important in a conjunction query based on fields #26081
  • Add support for selecting percolator query candidate matches containing range queries #25647 (issue: #21040)
Plugin Discovery EC2
  • update AWS SDK for ECS Task IAM support in discovery-ec2 #26479 (issue: #23039)
Plugin Lang Painless
  • Painless: Only allow Painless type names to be the same as the equivalent Java class. #27264
  • Allow for the Painless Definition to have multiple instances for white-listing #27096
  • Separate Painless Whitelist Loading from the Painless Definition #26540
  • Remove Sort enum from Painless Definition #26179
Plugin Repository Azure
  • Add azure storage endpoint suffix #26432 #26568 (issue: #26432)
  • Support for accessing Azure repositories through a proxy #23518 (issues: #23506, #23517)
Plugin Repository S3
Plugins
  • Plugins: Add versionless alias to all security policy codebase properties #26756 (issue: #26521)
  • Allow plugins to plug rescore implementations #26368 (issue: #26208)
Query DSL
Reindex API
  • Update by Query is modified to accept short script parameter. #26841 (issue: #24898)
  • reindex: automatically choose the number of slices #26030 (issues: #24547, #25582)
Rollover
  • Add size-based condition to the index rollover API #27160 (issue: #27004)
  • Add size-based condition to the index rollover API #27115 (issue: #27004)
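
A rollover request can combine the new size-based condition with the existing age and document-count conditions. The sketch below builds the request body; the helper function and the example values are hypothetical.

```python
def rollover_conditions(max_size=None, max_age=None, max_docs=None):
    """Build a rollover request body, including only the conditions that are set."""
    candidates = {"max_size": max_size, "max_age": max_age, "max_docs": max_docs}
    conditions = {name: value for name, value in candidates.items() if value is not None}
    return {"conditions": conditions}


body = rollover_conditions(max_size="5gb", max_age="7d")
```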
Scripting
  • Script: Convert script query to a dedicated script context #26003
Search
  • Make fields optional in multi_match query and rely on index.query.default_field by default #27380
  • fix unnecessary logger creation #27349
  • ObjectParser : replace IllegalStateException with ParsingException #27302 (issue: #27147)
  • Uses norms for exists query if enabled #27237
  • Cross-cluster search: make remote clusters optional #27182 (issues: #26118, #27161)
  • Enhances exists queries to reduce need for _field_names #26930 (issue: #26770)
  • Change ParentFieldSubFetchPhase to create doc values iterator once per segment #26815
  • Change VersionFetchSubPhase to create doc values iterator once per segment #26809
  • Change ScriptFieldsFetchSubPhase to create search scripts once per segment #26808 (issue: #26775)
  • Make sure SortBuilders rewrite inner nested sorts #26532
  • Extend testing of build method in ScriptSortBuilder #26520 (issues: #17286, #26490)
  • Accept an array of field names and boosts in the index.query.default_field setting #26320 (issue: #25946)
  • Reject IPv6-mapped IPv4 addresses when using the CIDR notation. #26254 (issue: #26078)
  • Rewrite range queries with open bounds to exists query #26160 (issue: #22640)
Security
  • Added the manage_index_templates cluster privilege to the built-in role kibana_system. For more information, see Cluster Privileges and Built-in Roles.
  • Newly created or updated watches execute with the privileges of the user that last modified the watch.
  • Added log messages when a PEM key is found when a PEM certificate was expected (or vice versa) in the xpack.ssl.key or xpack.ssl.certificate settings.
  • Added the new certutil command to simplify the creation of certificates for use with the Elastic stack. For more information, see elasticsearch-certutil.
  • Added automatic detection of support for AES 256 bit TLS ciphers and enabled their use when the JVM supports them.
Sequence IDs
  • Only fsync global checkpoint if needed #27652
  • Log primary-replica resync failures #27421 (issues: #24841, #27418)
  • Lazy initialize checkpoint tracker bit sets #27179 (issue: #10708)
  • Returns the current primary_term for Get/MultiGet requests #27177 (issue: #26493)
Settings
  • Allow affix settings to specify dependencies #27161
  • Represent lists as actual lists inside Settings #26878 (issue: #26723)
  • Remove Settings#getAsMap() #26845
  • Replace group map settings with affix setting #26819
  • Throw exception if setting isn’t recognized #26569 (issue: #25607)
  • Settings: Move keystore creation to plugin installation #26329 (issue: #26309)
Snapshot/Restore
  • Remove XContentType auto detection in BlobStoreRepository #27480
  • Snapshot: Migrate TransportRequestHandler to TransportMasterNodeAction #27165 (issue: #27151)
  • Fix toString of class SnapshotStatus (#26851) #26852 (issue: #26851)
Stats
  • Adds average document size to DocsStats #27117 (issue: #27004)
  • Stats to record how often the ClusterState diff mechanism is used successfully #27107 (issue: #26973)
  • Expose adaptive replica selection stats in /_nodes/stats API #27090
  • Add cgroup memory usage/limit to OS stats on Linux #26166
  • Add segment attributes to the _segments API. #26157 (issue: #26130)
Suggesters
  • Improve error message for parse failures of completion fields #27297
  • Support AND operation for context query in context suggester #24565 (issue: #24421)
Watcher
  • Improved error messages when there are no accounts configured for Watcher.
  • Added thread pool rejection information to execution state, which makes it easier to debug execution failures.
  • Added execution state information to watch status details. It is stored in the status.execution_state field.
  • Enabled the account monitoring url field in the xpack.notification.jira setting to support customized paths. For more information about configuring Jira accounts for use with watches, see Jira Action.
  • Improved handling of exceptions in Watcher to make it easier to debug problems.

Bug Fixes

Aggregations
  • Disable the "low cardinality" optimization of terms aggregations. #27545 (issue: #27543)
  • scripted_metric _agg parameter disappears if params are provided #27159 (issues: #19768, #19863)
Cluster
  • Properly format IndexGraveyard deletion date as date #27362
  • Remove optimisations to reuse objects when applying a new ClusterState #27317
Core
  • Do not set data paths on no local storage required #27587 (issue: #27572)
  • Ensure threadcontext is preserved when refresh listeners are invoked #27565
  • Ensure logging is configured for CLI commands #27523 (issue: #27521)
  • Protect shard splitting from illegal target shards #27468 (issue: #26931)
  • Avoid NPE when getting build information #27442
  • Fix ShardSplittingQuery to respect nested documents. #27398 (issue: #27378)
  • When building Settings do not set SecureSettings if empty #26988 (issue: #316)
Engine
  • Reset LiveVersionMap on sync commit #27534 (issue: #27516)
  • Carry over version map size to prevent excessive resizing #27516 (issue: #20498)
Geo
  • Correct two equality checks on incomparable types #27688
  • [GEO] fix pointsOnly bug for MULTIPOINT #27415
Index Templates
  • Prevent constructing an index template without index patterns #27662
Ingest
  • Add pipeline support for REST API bulk upsert #27075 (issue: #25601)
  • Fixing Grok pattern for Apache 2.4 #26635
Inner Hits
  • Return an empty _source for nested inner hit when filtering on a field that doesn’t exist #27531
Internal
  • When checking if key exists in ThreadContextStruct:putHeaders() method, should put requestHeaders in map first #26068
  • Adding a refresh listener to a recovering shard should be a noop #26055
Java High Level REST Client
  • Register ip_range aggregation with the high level client #26383
  • add top hits as a parsed aggregation to the rest high level client #26370
Machine Learning
  • Improved handling of scenarios where there are insufficient values to interpolate trend components.
  • Improved calculation of confidence intervals.
  • Fixed degrees of freedom calculation that could lead to excessive error logging.
  • Improved trend modeling with long bucket spans.
  • Fixed timing of when model size statistics are written. Previously, if there were multiple partitions, there could be multiple model size stats docs written within the same bucket.
  • Updated the calculation of the model memory to include the memory used by partition, over, by, or influencer fields.
  • Fixed calculation of the frequency property value for datafeeds that use aggregations. The value must be a multiple of the histogram interval. For more information, see Aggregating Data for Faster Performance.
  • Removed unnecessary messages from logs when a job is forcefully closed.
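
The datafeed rule above (the frequency must be a multiple of the histogram interval) amounts to a simple divisibility check. A minimal sketch, with both values expressed in seconds and a hypothetical helper name:

```python
def is_valid_frequency(frequency_seconds, histogram_interval_seconds):
    """Return True when the frequency is a positive multiple of the histogram interval."""
    if frequency_seconds <= 0 or histogram_interval_seconds <= 0:
        return False
    return frequency_seconds % histogram_interval_seconds == 0


ok = is_valid_frequency(600, 300)   # 10 minutes with a 5 minute interval
bad = is_valid_frequency(450, 300)  # 7.5 minutes with a 5 minute interval
```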
Mapping
  • Fix dynamic mapping update generation. #27467
  • Fix merging of _meta field #27352 (issue: #27323)
  • Fixed rounding of bounds in scaled float comparison #27207 (issue: #27189)
Nested Docs
  • Ensure nested documents have consistent version and seq_ids #27455
  • Prevent duplicate fields when mixing parent and root nested includes #27072 (issue: #26990)
Network
  • Throw UOE from compressible bytes stream reset #27564 (issue: #24927)
  • Bubble exceptions when closing compressible streams #27542 (issue: #27540)
  • Do not set SO_LINGER on server channels #26997
  • Do not set SO_LINGER to 0 when not shutting down #26871 (issue: #26764)
  • Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets #26764 (issue: #26701)
Packaging
  • Removes minimum master nodes default number #26803
  • setgid on /etc/elasticsearch on package install #26412 (issue: #26410)
Percolator
  • Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024 #27519 (issue: #1)
Plugin Analysis ICU
  • Catch InvalidPathException in IcuCollationTokenFilterFactory #27202
Plugin Lang Painless
  • Painless: Fix variable scoping issue in lambdas #27571 (issue: #26760)
  • Painless: Fix errors allowing void to be assigned to def. #27460 (issue: #27210)
Plugin Repository GCS
  • Create new handlers for every new request in GoogleCloudStorageService #27339 (issue: #27092)
Recovery
  • Flush old indices on primary promotion and relocation #27580 (issue: #27536)
Reindex API
  • Reindex: Fix headers in reindex action #26937 (issue: #22976)
Scroll
  • Fix scroll query with a sort that is a prefix of the index sort #27498
Search
  • Fix profiling naming issues #27133
  • Fix max score tracking with field collapsing #27122 (issue: #23840)
  • Apply missing request options to the expand phase #27118 (issues: #26649, #27079)
  • Calculate and cache result when advanceExact is called #26920 (issue: #26817)
  • Filter unsupported relation for RangeQueryBuilder #26620 (issue: #26575)
  • Handle leniency for phrase query on a field indexed without positions #26388
Security
  • Fixed REST requests that required a body but did not validate it, resulting in null pointer exceptions.
Sequence IDs
  • Obey translog durability in global checkpoint sync #27641
  • Fix resync request serialization #27418 (issue: #24841)
Settings
  • Allow index settings to be reset by wildcards #27671 (issue: #27537)
Snapshot/Restore
  • Do not swallow exception in ChecksumBlobStoreFormat.writeAtomic() #27597
  • Delete shard store files before restoring a snapshot #27476 (issues: #20220, #26865)
  • Fix snapshot getting stuck in INIT state #27214 (issue: #27180)
  • Fix default value of ignore_unavailable for snapshot REST API (#25359) #27056 (issue: #25359)
  • Do not create directory on readonly repository (#21495) #26909 (issue: #21495)
Stats
  • Include internal refreshes in refresh stats #27615
  • Make Segment statistics aware of segments held by internal readers #27558
  • Ensure doc_stats are changing even if refresh is disabled #27505
Watcher
  • Fixed handling of watcher templates. Missing watcher templates can be added by any node if that node has a higher version than the master node.

Upgrades

Core
  • Upgrade to Jackson 2.8.10 #27230
  • Upgrade to Lucene 7.1 #27225
Plugin Discovery EC2
Plugin Discovery GCE
  • Update Google SDK to version 1.23.0 #27381 (issue: #26636)
Plugin Lang Painless
  • Upgrade Painless from ANTLR 4.5.1-1 to ANTLR 4.5.3. #27153