WARNING: Version 5.1 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Painless Scripting Language
The Painless scripting language is new and is still marked as experimental. The syntax or API may be changed in the future in non-backwards compatible ways if required.
Painless is a simple, secure scripting language available in Elasticsearch by default. It is designed specifically for use with Elasticsearch and can safely be used with inline and stored scripting, which is enabled by default.
The Painless syntax is similar to Groovy.
You can use Painless anywhere a script can be used in Elasticsearch. Simply set the lang parameter to painless.
Painless Features
- Fast performance: several times faster than the alternatives.
- Safety: Fine-grained whitelist with method call/field granularity.
- Optional typing: Variables and parameters can use explicit types or the dynamic def type.
- Syntax: Extends Java’s syntax with a subset of Groovy for ease of use. See the Syntax Overview.
- Optimizations: Designed specifically for Elasticsearch scripting.
Painless Examples
To illustrate how Painless works, let’s load some hockey stats into an Elasticsearch index:
PUT hockey/player/_bulk?refresh
{"index":{"_id":1}}
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1]}
{"index":{"_id":2}}
{"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82]}
{"index":{"_id":3}}
{"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79]}
{"index":{"_id":4}}
{"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82]}
{"index":{"_id":5}}
{"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0]}
{"index":{"_id":6}}
{"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82]}
{"index":{"_id":7}}
{"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34]}
{"index":{"_id":8}}
{"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82]}
{"index":{"_id":39}}
{"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63]}
{"index":{"_id":10}}
{"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82]}
{"index":{"_id":11}}
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82]}
Accessing Doc Values from Painless
Document values can be accessed from a Map named doc.
For example, the following script calculates a player’s total goals. This example uses a strongly typed int and a for loop.
GET hockey/_search
{
  "query": {
    "function_score": {
      "script_score": {
        "script": {
          "lang": "painless",
          "inline": "int total = 0; for (int i = 0; i < doc['goals'].length; ++i) { total += doc['goals'][i]; } return total;"
        }
      }
    }
  }
}
Alternatively, you could do the same thing using a script field instead of a function score:
GET hockey/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "total_goals": {
      "script": {
        "lang": "painless",
        "inline": "int total = 0; for (int i = 0; i < doc['goals'].length; ++i) { total += doc['goals'][i]; } return total;"
      }
    }
  }
}
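Because Painless extends Java’s syntax, the summing loop in the two scripts above can be tried outside Elasticsearch as plain Java. The following is a minimal sketch under that assumption: the class name GoalTotalSketch and the method totalGoals are hypothetical, and an int array stands in for doc['goals'].

```java
// Standalone sketch of the summing loop from the scripts above.
// GoalTotalSketch and totalGoals are hypothetical names used only for illustration;
// the int[] parameter stands in for doc['goals'].
class GoalTotalSketch {
    // Mirrors: int total = 0; for (...) { total += doc['goals'][i]; } return total;
    static int totalGoals(int[] goals) {
        int total = 0;
        for (int i = 0; i < goals.length; ++i) {
            total += goals[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Goals for player 1 in the sample data: [9, 27, 1]
        System.out.println(totalGoals(new int[] {9, 27, 1})); // prints 37
    }
}
```

For player 1 in the sample data, whose goals are [9, 27, 1], the loop yields 37.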
The following example uses a Painless script to sort the players by their combined first and last names. The names are accessed using doc['first.keyword'].value and doc['last.keyword'].value.
GET hockey/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "string",
      "order": "asc",
      "script": {
        "lang": "painless",
        "inline": "doc['first.keyword'].value + ' ' + doc['last.keyword'].value"
      }
    }
  }
}
Updating Fields with Painless
You can also easily update fields. You access the original source for a field as ctx._source.<field-name>.
First, let’s look at the source data for a player by submitting the following request:
GET hockey/_search
{
  "stored_fields": [
    "_id",
    "_source"
  ],
  "query": {
    "term": {
      "_id": 1
    }
  }
}
To change player 1’s last name to hockey, simply set ctx._source.last to the new value:
POST hockey/player/1/_update
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = params.last",
    "params": {
      "last": "hockey"
    }
  }
}
You can also add fields to a document. For example, this script adds a new field that contains the player’s nickname, hockey.
POST hockey/player/1/_update
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = params.last; ctx._source.nick = params.nick",
    "params": {
      "last": "gaudreau",
      "nick": "hockey"
    }
  }
}
Regular expressions
Regexes are disabled by default because they circumvent Painless’s protection against long-running and memory-hungry scripts. To make matters worse, even innocuous-looking regexes can have staggering performance and stack depth behavior. They remain an amazingly powerful tool but are too scary to enable by default. To enable them yourself, set script.painless.regex.enabled: true in elasticsearch.yml. We’d like very much to have a safe alternative implementation that can be enabled by default, so check this space for later developments!
Painless’s native support for regular expressions has the following syntax constructs:
- /pattern/: Pattern literals create patterns. This is the only way to create a pattern in Painless. The pattern inside the slashes is just a Java regular expression. See Pattern flags for more.
- =~: The find operator returns a boolean, true if a subsequence of the text matches, false otherwise.
- ==~: The match operator returns a boolean, true if the text matches, false if it doesn’t.
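Since pattern literals are Java regular expressions, the difference between the find and match operators can be explored in plain Java. The sketch below assumes, based only on the descriptions above, that =~ behaves like Java’s Matcher.find and ==~ like Matcher.matches. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Sketch of the find (=~) and match (==~) semantics using java.util.regex.
// FindVsMatchSketch, find and matches are hypothetical names for illustration.
class FindVsMatchSketch {
    // Assumed equivalent of `text =~ /regex/`: true if any subsequence matches.
    static boolean find(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Assumed equivalent of `text ==~ /regex/`: true only if the whole text matches.
    static boolean matches(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(find("bennett", "b"));                     // true
        System.out.println(matches("bennett", "b"));                  // false
        System.out.println(matches("gaudreau", "[^aeiou].*[aeiou]")); // true
    }
}
```

The same pattern can therefore give different answers depending on the operator: "b" is found inside "bennett", but it does not match the whole name.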
Using the find operator (=~) you can update all hockey players with "b" in their last name:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "if (ctx._source.last =~ /b/) {ctx._source.last += \"matched\"} else {ctx.op = 'noop'}"
  }
}
Using the match operator (==~) you can update all the hockey players whose last names start with a consonant and end with a vowel:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "if (ctx._source.last ==~ /[^aeiou].*[aeiou]/) {ctx._source.last += \"matched\"} else {ctx.op = 'noop'}"
  }
}
You can use Pattern.matcher directly to get a Matcher instance and remove all of the vowels in all of their last names:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = /[aeiou]/.matcher(ctx._source.last).replaceAll('')"
  }
}
Matcher.replaceAll is just a call to Java’s Matcher.replaceAll method, so it supports $1-style group references in replacements:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = /n([aeiou])/.matcher(ctx._source.last).replaceAll('$1')"
  }
}
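To see what the group reference in the script above does to individual names, the same pattern and replacement can be run through java.util.regex directly. This is only an illustrative sketch: the class and method names are hypothetical, and the inputs are last names from the sample data.

```java
import java.util.regex.Pattern;

// Sketch of the $1 group reference using Java's Matcher.replaceAll.
// GroupReferenceSketch and dropNBeforeVowel are hypothetical names for illustration.
class GroupReferenceSketch {
    // Removes every "n" that is directly followed by a vowel; $1 keeps the captured vowel.
    static String dropNBeforeVowel(String lastName) {
        return Pattern.compile("n([aeiou])").matcher(lastName).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(dropNBeforeVowel("monohan")); // moohan
        System.out.println(dropNBeforeVowel("bennett")); // benett
    }
}
```

Only an "n" followed by a vowel is affected, which is why the final "n" in "monohan" and the first "n" in "bennett" survive.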
If you need more control over replacements you can call replaceAll on a CharSequence with a Function<Matcher, String> that builds the replacement. This does not support $1 or \1 to access replacements because you already have a reference to the matcher and can get them with m.group(1).
Calling Matcher.find inside of the function that builds the replacement is rude and will likely break the replacement process.
This will make all of the vowels in the hockey players’ last names upper case:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = ctx._source.last.replaceAll(/[aeiou]/, m -> m.group().toUpperCase(Locale.ROOT))"
  }
}
Or you can use CharSequence.replaceFirst to make the first vowel in their last names upper case:
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = ctx._source.last.replaceFirst(/[aeiou]/, m -> m.group().toUpperCase(Locale.ROOT))"
  }
}
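The function-based replaceAll and replaceFirst used in the scripts above are called on a CharSequence in Painless. As a rough illustration of the idea, the following Java sketch emulates them with Matcher.appendReplacement and Matcher.appendTail. It is an assumption-laden approximation, not how Painless implements these methods, and all class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of function-based replacement built on java.util.regex.
// FunctionReplaceSketch and its methods are hypothetical names for illustration.
class FunctionReplaceSketch {
    // Replaces every match with the string built by fn.
    static String replaceAll(CharSequence text, Pattern pattern, Function<Matcher, String> fn) {
        Matcher m = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps any "$" or "\" in fn's result literal.
            m.appendReplacement(out, Matcher.quoteReplacement(fn.apply(m)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Replaces only the first match with the string built by fn.
    static String replaceFirst(CharSequence text, Pattern pattern, Function<Matcher, String> fn) {
        Matcher m = pattern.matcher(text);
        if (!m.find()) {
            return text.toString();
        }
        StringBuffer out = new StringBuffer();
        m.appendReplacement(out, Matcher.quoteReplacement(fn.apply(m)));
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern vowel = Pattern.compile("[aeiou]");
        Function<Matcher, String> upper = m -> m.group().toUpperCase(Locale.ROOT);
        System.out.println(replaceAll("gaudreau", vowel, upper));   // gAUdrEAU
        System.out.println(replaceFirst("gaudreau", vowel, upper)); // gAudreau
    }
}
```

In this sketch the loop itself drives the matcher with find, so a function that also called find on the matcher would skip matches. That is one way to picture why doing so is discouraged.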
Note: all of the _update_by_query examples above could really do with a query to limit the data that they pull back. While you could use a Script Query, it wouldn’t be as efficient as using any other query because script queries aren’t able to use the inverted index to limit the documents that they have to check.
Painless API
The following Java packages are available for use in the Painless language:
Note that unsafe classes and methods are not included; there is no support for:
- Manipulation of processes and threads
- Input/Output
- Reflection