
IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.


Regexp Query


The regexp query allows you to use regular expression term queries. See Regular expression syntax for details of the supported regular expression language. Here, "term queries" means that Elasticsearch applies the regexp to the terms produced by the tokenizer for that field, not to the original text of the field.
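The term-level behavior can be illustrated with a rough Python sketch. It assumes a hypothetical analyzer that only lowercases and splits on whitespace (real analyzers differ), and it uses Python's `re.fullmatch` as a stand-in for Lucene's engine, which is only reasonable for the simple operators used here.

```python
import re

def analyze(text):
    # Hypothetical stand-in for an analyzer: lowercase, then split on whitespace.
    return text.lower().split()

def regexp_matches_field(pattern, text):
    # The pattern is tested against each term, never against the whole text.
    return any(re.fullmatch(pattern, term) is not None for term in analyze(text))

# The term "sally" matches s.*y, so the field matches.
print(regexp_matches_field("s.*y", "Sally Smith"))
# No single term contains a space, so a pattern spanning two words cannot match.
print(regexp_matches_field("sally smith", "Sally Smith"))
```

The second call fails even though the pattern equals the lowercased field text, because the pattern is only ever compared with individual terms.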

Note: The performance of a regexp query depends heavily on the regular expression chosen. Patterns that match everything, such as .*, are very slow, as are lookaround regular expressions. If possible, use a long literal prefix before the regular expression begins. Wildcard matchers such as .*?+ generally lower performance.

GET /_search
{
    "query": {
        "regexp":{
            "name.first": "s.*y"
        }
    }
}


Boosting is also supported:

GET /_search
{
    "query": {
        "regexp":{
            "name.first":{
                "value":"s.*y",
                "boost":1.2
            }
        }
    }
}


You can also use special flags:

GET /_search
{
    "query": {
        "regexp":{
            "name.first": {
                "value": "s.*y",
                "flags" : "INTERSECTION|COMPLEMENT|EMPTY"
            }
        }
    }
}


Possible flags are ALL (default), ANYSTRING, COMPLEMENT, EMPTY, INTERSECTION, INTERVAL, or NONE. See the Lucene documentation for their meaning.

Regular expressions are dangerous because it is easy to accidentally write an innocuous-looking one that requires an exponential number of internal determinized automaton states (and the corresponding RAM and CPU) for Lucene to execute. Lucene guards against this with the max_determinized_states setting (default 10000). You can raise this limit to allow more complex regular expressions to execute.

GET /_search
{
    "query": {
        "regexp":{
            "name.first": {
                "value": "s.*y",
                "flags" : "INTERSECTION|COMPLEMENT|EMPTY",
                "max_determinized_states": 20000
            }
        }
    }
}


By default, the maximum length of the regex string allowed in a Regexp Query is 1000 characters. You can update the index.max_regex_length index setting to change this limit.
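The request bodies shown above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library. The helper name `regexp_query` and its validation rules are assumptions made for illustration and are not part of any Elasticsearch client. The length check uses the default of 1000 stated above and would need adjusting if index.max_regex_length has been changed.

```python
import json

# Flag names as listed in the text above.
KNOWN_FLAGS = {"ALL", "ANYSTRING", "COMPLEMENT", "EMPTY",
               "INTERSECTION", "INTERVAL", "NONE"}
# Default of index.max_regex_length as stated above (assumed unchanged).
DEFAULT_MAX_REGEX_LENGTH = 1000

def regexp_query(field, value, boost=None, flags=None,
                 max_determinized_states=None):
    """Hypothetical helper: build a search body containing one regexp query."""
    if len(value) > DEFAULT_MAX_REGEX_LENGTH:
        raise ValueError("pattern exceeds the default index.max_regex_length")
    params = {"value": value}
    if boost is not None:
        params["boost"] = boost
    if flags:
        unknown = set(flags) - KNOWN_FLAGS
        if unknown:
            raise ValueError(f"unknown flags: {sorted(unknown)}")
        # Flags are concatenated with "|", as in the examples above.
        params["flags"] = "|".join(flags)
    if max_determinized_states is not None:
        params["max_determinized_states"] = max_determinized_states
    return {"query": {"regexp": {field: params}}}

body = regexp_query(
    "name.first", "s.*y",
    flags=["INTERSECTION", "COMPLEMENT", "EMPTY"],
    max_determinized_states=20000,
)
print(json.dumps(body, indent=4))
```

The printed JSON has the same shape as the last request above and could be sent as the body of GET /_search with any HTTP client.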

Regular expression syntax


Regular expression queries are supported by the regexp and the query_string queries. The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators.

This section does not attempt to teach regular expressions; it only describes the supported operators.

Standard operators


Anchoring

Most regular expression engines allow you to match any part of a string. If you want the regexp pattern to start at the beginning of the string or finish at the end of the string, then you have to anchor it specifically, using ^ to indicate the beginning or $ to indicate the end.

Lucene’s patterns are always anchored. The pattern provided must match the entire string. For string "abcde":

ab.*     # match
abcd     # no match
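The difference between unanchored and anchored matching can be seen in Python, whose `re.search` accepts a match anywhere in the string while `re.fullmatch` requires the whole string to match. This is only an analogy: Python's `re` module is not Lucene's engine, but `re.fullmatch` mirrors the always-anchored behavior described above.

```python
import re

text = "abcde"

# Unanchored search, typical of many engines: a substring match is enough.
substring_hit = re.search("abcd", text) is not None

# Whole-string matching, analogous to Lucene's always-anchored patterns.
whole_abcd = re.fullmatch("abcd", text) is not None
whole_prefix = re.fullmatch("ab.*", text) is not None

print(substring_hit)  # True
print(whole_abcd)     # False: the trailing "e" is not covered by the pattern
print(whole_prefix)   # True: ".*" consumes the rest of the string
```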

Allowed characters

Any Unicode characters may be used in the pattern, but certain characters are reserved and must be escaped. The standard reserved characters are:

. ? + * | { } [ ] ( ) " \

If you enable optional features (see below) then these characters may also be reserved:

# @ & < >  ~

Any reserved character can be escaped with a backslash, for example "\*". This includes the backslash itself: "\\"

Additionally, any characters (except double quotes) are interpreted literally when surrounded by double quotes:

john"@smith.com"
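When a pattern is built from user input, the escaping rule above can be applied mechanically. The sketch below is a hypothetical helper, not part of any Elasticsearch client. It prefixes every reserved character listed above with a backslash and treats the optional-feature characters as reserved by default, on the assumption that the flags parameter is left at ALL.

```python
# Reserved characters as listed in the text above.
STANDARD_RESERVED = set('.?+*|{}[]()"\\')
OPTIONAL_RESERVED = set('#@&<>~')

def escape_regexp(text, optional_operators=True):
    """Hypothetical helper: backslash-escape reserved characters in text."""
    reserved = set(STANDARD_RESERVED)
    if optional_operators:
        reserved |= OPTIONAL_RESERVED
    return "".join("\\" + ch if ch in reserved else ch for ch in text)

print(escape_regexp("john@smith.com"))                            # john\@smith\.com
print(escape_regexp("john@smith.com", optional_operators=False))  # john@smith\.com
```

Remember that a backslash must itself be escaped again when the pattern is embedded in a JSON string.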

Match any character

The period "." can be used to represent any character. For string "abcde":

ab...   # match
a.c.e   # match

One-or-more

The plus sign "+" can be used to repeat the preceding shortest pattern one or more times. For string "aaabbb":

a+b+        # match
aa+bb+      # match
a+.+        # match
aa+bbb+     # match

Zero-or-more

The asterisk "*" can be used to match the preceding shortest pattern zero-or-more times. For string "aaabbb":

a*b*        # match
a*b*c*      # match
.*bbb.*     # match
aaa*bbb*    # match

Zero-or-one

The question mark "?" makes the preceding shortest pattern optional. It matches zero or one times. For string "aaabbb":

aaa?bbb?    # match
aaaa?bbbb?  # match
.....?.?    # match
aa?bb?      # no match

Min-to-max

Curly brackets "{}" can be used to specify a minimum and (optionally) a maximum number of times the preceding shortest pattern can repeat. The allowed forms are:

{5}     # repeat exactly 5 times
{2,5}   # repeat at least twice and at most 5 times
{2,}    # repeat at least twice

For string "aaabbb":

a{3}b{3}        # match
a{2,4}b{2,4}    # match
a{2,}b{2,}      # match
.{3}.{3}        # match
a{4}b{4}        # no match
a{4,6}b{4,6}    # no match
a{4,}b{4,}      # no match

Grouping

Parentheses "()" can be used to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can be a group. For string "ababab":

(ab)+       # match
ab(ab)+     # match
(..)+       # match
(...)+      # match
(ab)*       # match
abab(ab)?   # match
ab(ab)?     # no match
(ab){3}     # match
(ab){1,2}   # no match

Alternation

The pipe symbol "|" acts as an OR operator. The match will succeed if the pattern on either the left-hand side OR the right-hand side matches. The alternation applies to the longest pattern, not the shortest. For string "aabb":

aabb|bbaa   # match
aacc|bb     # no match
aa(cc|bb)   # match
a+|b+       # no match
a+b+|b+a+   # match
a+(b|c)+    # match

Character classes

Ranges of potential characters may be represented as character classes by enclosing them in square brackets "[]". A leading ^ negates the character class. The allowed forms are:

[abc]   # 'a' or 'b' or 'c'
[a-c]   # 'a' or 'b' or 'c'
[-abc]  # '-' or 'a' or 'b' or 'c'
[abc\-] # '-' or 'a' or 'b' or 'c'
[^abc]  # any character except 'a' or 'b' or 'c'
[^a-c]  # any character except 'a' or 'b' or 'c'
[^-abc] # any character except '-' or 'a' or 'b' or 'c'
[^abc\-] # any character except '-' or 'a' or 'b' or 'c'

Note that the dash "-" indicates a range of characters unless it is the first character in the class (after any leading ^) or it is escaped with a backslash.

For string "abcd":

ab[cd]+     # match
[a-d]+      # match
[^a-d]+     # no match
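The standard operators behave the same way in most regular expression engines, so a selection of the examples above can be checked with Python's `re.fullmatch`, which emulates Lucene's whole-string anchoring. This is a sanity check under that assumption and not a test of Lucene itself. The function name `lucene_like_match` is made up for this sketch.

```python
import re

# (pattern, input string, expected result) taken from the examples above.
EXAMPLES = [
    ("a+b+",         "aaabbb", True),
    ("a*b*c*",       "aaabbb", True),
    ("aa?bb?",       "aaabbb", False),
    ("a{2,4}b{2,4}", "aaabbb", True),
    ("a{4,}b{4,}",   "aaabbb", False),
    ("ab(ab)?",      "ababab", False),
    ("(ab){3}",      "ababab", True),
    ("a+|b+",        "aabb",   False),
    ("aa(cc|bb)",    "aabb",   True),
    ("[^a-d]+",      "abcd",   False),
]

def lucene_like_match(pattern, text):
    # fullmatch requires the pattern to cover the entire string.
    return re.fullmatch(pattern, text) is not None

for pattern, text, expected in EXAMPLES:
    assert lucene_like_match(pattern, text) == expected, (pattern, text)
print("all examples behave as documented")
```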

Optional operators


These operators are available by default as the flags parameter defaults to ALL. Different flag combinations (concatenated with "|") can be used to enable/disable specific operators:

{
    "regexp": {
        "username": {
            "value": "john~athon<1-5>",
            "flags": "COMPLEMENT|INTERVAL"
        }
    }
}

Complement

The complement is probably the most useful option. The shortest pattern that follows a tilde "~" is negated. For instance, "ab~cd" means:

Starts with a
Followed by b
Followed by a string of any length that is anything but c
Ends with d

For the string "abcdef":

ab~df     # match
ab~cf     # match
ab~cdef   # no match
a~(cb)def # match
a~(bc)def # no match

Enabled with the COMPLEMENT or ALL flags.
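Python's `re` module has no complement operator, but the semantics can be sketched for the simple shape used in the examples above: a literal prefix, a negated sub-pattern, and a literal suffix. The function `matches_with_complement` is a hypothetical emulation for illustration only. It relies on the prefix and suffix being plain literals so that the middle part of the string is unambiguous.

```python
import re

def matches_with_complement(text, prefix, negated, suffix):
    """Emulate prefix~(negated)suffix for a literal prefix and suffix."""
    if len(text) < len(prefix) + len(suffix):
        return False
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return False
    # Whatever sits between prefix and suffix must NOT match the negated pattern.
    middle = text[len(prefix):len(text) - len(suffix)]
    return re.fullmatch(negated, middle) is None

text = "abcdef"
print(matches_with_complement(text, "ab", "d", "f"))     # ab~df     -> True
print(matches_with_complement(text, "ab", "c", "def"))   # ab~cdef   -> False
print(matches_with_complement(text, "a", "cb", "def"))   # a~(cb)def -> True
print(matches_with_complement(text, "a", "bc", "def"))   # a~(bc)def -> False
```

In the `ab~cdef` case the middle part is exactly "c", which is the one string the complement rejects, so the whole pattern fails.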

Interval

The interval option enables the use of numeric ranges, enclosed by angle brackets "<>". For string: "foo80":

foo<1-100>     # match
foo<01-100>    # match
foo<001-100>   # no match

Enabled with the INTERVAL or ALL flags.
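One reading that is consistent with the three examples above is that when both bounds are written with the same number of digits, the matched number must have exactly that many digits (padded with leading zeros), and otherwise any number of digits is accepted. The Python sketch below encodes that reading. It is an assumption drawn from the examples rather than a statement of Lucene's specification, and `matches_interval` is a hypothetical name.

```python
import re

def matches_interval(text, prefix, low, high):
    """Emulate prefix<low-high>; low and high are the bounds as written."""
    if not text.startswith(prefix):
        return False
    digits = text[len(prefix):]
    if re.fullmatch(r"[0-9]+", digits) is None:
        return False
    # Assumed rule: equal-width bounds require a number of the same width.
    if len(low) == len(high) and len(digits) != len(low):
        return False
    return int(low) <= int(digits) <= int(high)

print(matches_interval("foo80", "foo", "1", "100"))    # foo<1-100>   -> True
print(matches_interval("foo80", "foo", "01", "100"))   # foo<01-100>  -> True
print(matches_interval("foo80", "foo", "001", "100"))  # foo<001-100> -> False
```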

Intersection

The ampersand "&" joins two patterns in a way that both of them have to match. For string "aaabbb":

aaa.+&.+bbb     # match
aaa&bbb         # no match

Needing this feature is usually a sign that the regular expression should be rewritten.

Enabled with the INTERSECTION or ALL flags.
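Intersection can be emulated in Python by requiring two whole-string matches against the same input. This is a sketch of the semantics only, using `re.fullmatch` as a stand-in for Lucene's anchored matching. The function name is hypothetical.

```python
import re

def matches_intersection(text, left, right):
    # Both sub-patterns must match the entire string.
    return (re.fullmatch(left, text) is not None
            and re.fullmatch(right, text) is not None)

print(matches_intersection("aaabbb", "aaa.+", ".+bbb"))  # True
print(matches_intersection("aaabbb", "aaa", "bbb"))      # False
```

The second call fails because neither "aaa" nor "bbb" covers the full string on its own.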

Any string

The at sign "@" matches any string in its entirety. This could be combined with the intersection and complement above to express “everything except”. For instance:

@&~(foo.+)      # anything except string beginning with "foo"

Enabled with the ANYSTRING or ALL flags.
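Because "@" accepts every string, the combination `@&~(foo.+)` reduces to "does not match foo.+". A minimal Python emulation, again using `re.fullmatch` as a stand-in for Lucene's anchored matching and a hypothetical function name, looks like this:

```python
import re

def anything_except(text, excluded_pattern):
    # "@" matches any string, so only the complement can reject the input.
    return re.fullmatch(excluded_pattern, text) is None

print(anything_except("foobar", "foo.+"))  # False: begins with "foo" plus more
print(anything_except("barfoo", "foo.+"))  # True
print(anything_except("foo", "foo.+"))     # True: ".+" needs at least one more character
```

The last case shows that the bare string "foo" is still accepted, since `foo.+` requires at least one character after "foo".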
