IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

« rewrite parameter EQL search »

› ›

Regular expression syntax

edit

IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Regular expression syntax

edit

A regular expression is a way to match patterns in data using placeholder characters, called operators.

Elasticsearch supports regular expressions in the following queries:

Elasticsearch uses Apache Lucene's regular expression engine to parse these queries.

Reserved characters

edit

Lucene’s regular expression engine supports all Unicode characters. However, the following characters are reserved as operators:

. ? + * | { } [ ] ( ) " \

Depending on the optional operators enabled, the following characters may also be reserved:

# @ & < >  ~

To use one of these characters literally, escape it with a preceding backslash or surround it with double quotes. For example:

\@                  # renders as a literal '@'
\\                  # renders as a literal '\'
"john@smith.com"    # renders as 'john@smith.com'

The backslash is an escape character in both JSON strings and regular expressions. You need to escape both backslashes in a query, unless you use a language client, which takes care of this. For example, the string a\b needs to be indexed as "a\\b":

resp = client.index(
    index="my-index-000001",
    id="1",
    document={
        "my_field": "a\\b"
    },
)
print(resp)

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    my_field: 'a\\b'
  }
)
puts response

const response = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    my_field: "a\\b",
  },
});
console.log(response);

PUT my-index-000001/_doc/1
{
  "my_field": "a\\b"
}

Copy as curl Try in Elastic

This document matches the following regexp query:

resp = client.search(
    index="my-index-000001",
    query={
        "regexp": {
            "my_field.keyword": "a\\\\.*"
        }
    },
)
print(resp)

response = client.search(
  index: 'my-index-000001',
  body: {
    query: {
      regexp: {
        'my_field.keyword' => 'a\\\\.*'
      }
    }
  }
)
puts response

const response = await client.search({
  index: "my-index-000001",
  query: {
    regexp: {
      "my_field.keyword": "a\\\\.*",
    },
  },
});
console.log(response);

GET my-index-000001/_search
{
  "query": {
    "regexp": {
      "my_field.keyword": "a\\\\.*"
    }
  }
}

Copy as curl Try in Elastic

Standard operators

edit

Lucene’s regular expression engine does not use the Perl Compatible Regular Expressions (PCRE) library, but it does support the following standard operators.

.

Matches any character. For example:

ab.     # matches 'aba', 'abb', 'abz', etc.

?

Repeat the preceding character zero or one times. Often used to make the preceding character optional. For example:

abc?     # matches 'ab' and 'abc'

+

Repeat the preceding character one or more times. For example:

ab+     # matches 'ab', 'abb', 'abbb', etc.

*

Repeat the preceding character zero or more times. For example:

ab*     # matches 'a', 'ab', 'abb', 'abbb', etc.

{}

Minimum and maximum number of times the preceding character can repeat. For example:

a{2}    # matches 'aa'
a{2,4}  # matches 'aa', 'aaa', and 'aaaa'
a{2,}   # matches 'a` repeated two or more times

|

OR operator. The match will succeed if the longest pattern on either the left side OR the right side matches. For example:

abc|xyz  # matches 'abc' and 'xyz'

( … )

Forms a group. You can use a group to treat part of the expression as a single character. For example:

abc(def)?  # matches 'abc' and 'abcdef' but not 'abcd'

[ … ]

Match one of the characters in the brackets. For example:

[abc]   # matches 'a', 'b', 'c'

Inside the brackets, - indicates a range unless - is the first character or escaped. For example:

[a-c]   # matches 'a', 'b', or 'c'
[-abc]  # '-' is first character. Matches '-', 'a', 'b', or 'c'
[abc\-] # Escapes '-'. Matches 'a', 'b', 'c', or '-'

A ^ before a character in the brackets negates the character or range. For example:

[^abc]      # matches any character except 'a', 'b', or 'c'
[^a-c]      # matches any character except 'a', 'b', or 'c'
[^-abc]     # matches any character except '-', 'a', 'b', or 'c'
[^abc\-]    # matches any character except 'a', 'b', 'c', or '-'

Optional operators

edit

You can use the flags parameter to enable more optional operators for Lucene’s regular expression engine.

To enable multiple operators, use a | separator. For example, a flags value of COMPLEMENT|INTERVAL enables the COMPLEMENT and INTERVAL operators.

Valid values

edit

ALL (Default)

Enables all optional operators.

"" (empty string)

Alias for the ALL value.

COMPLEMENT

Enables the ~ operator. You can use ~ to negate the shortest following pattern. For example:

a~bc   # matches 'adc' and 'aec' but not 'abc'

EMPTY

Enables the # (empty language) operator. The # operator doesn’t match any string, not even an empty string.

If you create regular expressions by programmatically combining values, you can pass # to specify "no string." This lets you avoid accidentally matching empty strings or other unwanted strings. For example:

#|abc  # matches 'abc' but nothing else, not even an empty string

INTERVAL

Enables the <> operators. You can use <> to match a numeric range. For example:

foo<1-100>      # matches 'foo1', 'foo2' ... 'foo99', 'foo100'
foo<01-100>     # matches 'foo01', 'foo02' ... 'foo99', 'foo100'

INTERSECTION

Enables the & operator, which acts as an AND operator. The match will succeed if patterns on both the left side AND the right side matches. For example:

aaa.+&.+bbb  # matches 'aaabbb'

ANYSTRING

Enables the @ operator. You can use @ to match any entire string.

You can combine the @ operator with & and ~ operators to create an "everything except" logic. For example:

@&~(abc.+)  # matches everything except terms beginning with 'abc'

NONE

Disables all optional operators.

Unsupported operators

edit

Lucene’s regular expression engine does not support anchor operators, such as ^ (beginning of line) or $ (end of line). To match a term, the regular expression must match the entire string.

« rewrite parameter EQL search »

Was this helpful?

Feedback

The Search AI Company

ELK Stack

Elastic Cloud

Generative AI

Search

Security

Observability

By solution

Industries

Customer spotlight

Research

Build

Learn

Connect

Regular expression syntax

Regular expression syntax

Reserved characters

Standard operators

Optional operators

Valid values

Unsupported operators

Follow us

About us

Join us

Partners

Trust & Security

Investor relations

Excellence Awards

About us

Join us

Partners

Trust & Security

Investor relations

Excellence Awards