Regexp Filter

edit

The regexp filter is similar to the regexp query, except that it is cacheable and can speedup performance in case you are reusing this filter in your queries.

See Regular expression syntax for details of the supported regular expression language.

{
    "filtered": {
        "query": {
            "match_all": {}
        },
        "filter": {
            "regexp":{
                "name.first" : "s.*y"
            }
        }
    }
}

You can also select the cache name and use the same regexp flags in the filter as in the query.

Regular expressions are dangerous because it’s easy to accidentally create an innocuous looking one that requires an exponential number of internal determinized automaton states (and corresponding RAM and CPU) for Lucene to execute. Lucene prevents these using the max_determinized_states setting (defaults to 10000). You can raise this limit to allow more complex regular expressions to execute.

You have to enable caching explicitly in order to have the regexp filter cached.

{
    "filtered": {
        "query": {
            "match_all": {}
        },
        "filter": {
            "regexp":{
                "name.first" : {
                    "value" : "s.*y",
                    "flags" : "INTERSECTION|COMPLEMENT|EMPTY",
		    "max_determinized_states": 20000
                },
                "_name":"test",
                "_cache" : true,
                "_cache_key" : "key"
            }
        }
    }
}