Analysis changes

Synonym Token Filter

In 6.0, the Synonym Token Filter tokenizes synonyms with whatever tokenizer and token filters appear before it in the analysis chain.

The tokenizer and ignore_case parameters are deprecated and will be ignored when used in new indices. These parameters will continue to function as before when used in indices created in 5.x.
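
As a sketch of what this means in practice, the request below (the index, filter, and analyzer names are illustrative) relies on a lowercase filter placed before the synonym filter to get case-insensitive matching, rather than the deprecated ignore_case parameter:

    PUT /my_index
    {
      "settings": {
        "analysis": {
          "filter": {
            "my_synonyms": {
              "type": "synonym",
              "synonyms": ["laptop, notebook"]
            }
          },
          "analyzer": {
            "my_synonym_analyzer": {
              "tokenizer": "standard",
              "filter": ["lowercase", "my_synonyms"]
            }
          }
        }
      }
    }

With this configuration the synonym entries themselves are processed by the standard tokenizer and the lowercase filter before synonym expansion, which is the 6.0 behaviour described above.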

Limiting the length of an analyzed text during highlighting

Highlighting a text that was indexed without offsets or term vectors requires analyzing that text in memory, in real time, during the search request. For large texts this analysis may take a substantial amount of time and memory. To protect against this, the maximum number of characters to be analyzed will be limited to 1000000 in the next major Elasticsearch version. In this version the limit is not set by default, but a deprecation warning is issued when an analyzed text exceeds 1000000 characters. The limit can be set for a particular index with the index setting index.highlight.max_analyzed_offset.
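
A minimal sketch of setting this limit explicitly at index creation time (the index name and the chosen value are illustrative):

    PUT /my_index
    {
      "settings": {
        "index.highlight.max_analyzed_offset": 1000000
      }
    }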