EQL search

Event Query Language (EQL) is a query language for event-based time series data, such as logs, metrics, and traces.

Advantages of EQL

  • EQL lets you express relationships between events.
    Many query languages allow you to match single events. EQL lets you match a sequence of events across different event categories and time spans.
  • EQL has a low learning curve.
    EQL syntax looks like other common query languages, such as SQL. EQL lets you write and read queries intuitively, which makes for quick, iterative searching.
  • EQL is designed for security use cases.
    While you can use it for any event-based data, we created EQL for threat hunting. EQL not only supports indicator of compromise (IOC) searches but can describe activity that goes beyond IOCs.

Required fields

To run an EQL search, the searched data stream or index must contain a timestamp and event category field. By default, EQL uses the @timestamp and event.category fields from the Elastic Common Schema (ECS). To use a different timestamp or event category field, see Specify a timestamp or event category field.

While no schema is required to use EQL, we recommend using the ECS. EQL searches are designed to work with core ECS fields by default.

Run an EQL search

Use the EQL search API to run a basic EQL query.

GET /my-data-stream/_eql/search
{
  "query": """
    process where process.name == "regsvr32.exe"
  """
}

By default, basic EQL queries return the 10 most recent matching events in the hits.events property. These hits are sorted by timestamp, converted to milliseconds since the Unix epoch, in ascending order.

{
  "is_partial": false,
  "is_running": false,
  "took": 60,
  "timed_out": false,
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "events": [
      {
        "_index": ".ds-my-data-stream-2099.12.07-000001",
        "_id": "OQmfCaduce8zoHT93o4H",
        "_source": {
          "@timestamp": "2099-12-07T11:07:09.000Z",
          "event": {
            "category": "process",
            "id": "aR3NWVOs",
            "sequence": 4
          },
          "process": {
            "pid": 2012,
            "name": "regsvr32.exe",
            "command_line": "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll",
            "executable": "C:\\Windows\\System32\\regsvr32.exe"
          }
        }
      },
      {
        "_index": ".ds-my-data-stream-2099.12.07-000001",
        "_id": "xLkCaj4EujzdNSxfYLbO",
        "_source": {
          "@timestamp": "2099-12-07T11:07:10.000Z",
          "event": {
            "category": "process",
            "id": "GTSmSqgz0U",
            "sequence": 6,
            "type": "termination"
          },
          "process": {
            "pid": 2012,
            "name": "regsvr32.exe",
            "executable": "C:\\Windows\\System32\\regsvr32.exe"
          }
        }
      }
    ]
  }
}
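
As a minimal sketch of consuming this response, the following Python snippet pulls the timestamp and process name out of each entry in hits.events. The helper name summarize_events is hypothetical, and sample_response is a trimmed copy of the response above rather than live output.

```python
def summarize_events(response):
    """Return (timestamp, process name) pairs from a basic EQL response."""
    summaries = []
    for hit in response.get("hits", {}).get("events", []):
        source = hit.get("_source", {})
        summaries.append(
            (source.get("@timestamp"), source.get("process", {}).get("name"))
        )
    return summaries


# Trimmed copy of the sample response shown above (not live output).
sample_response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "events": [
            {
                "_id": "OQmfCaduce8zoHT93o4H",
                "_source": {
                    "@timestamp": "2099-12-07T11:07:09.000Z",
                    "process": {"pid": 2012, "name": "regsvr32.exe"},
                },
            },
            {
                "_id": "xLkCaj4EujzdNSxfYLbO",
                "_source": {
                    "@timestamp": "2099-12-07T11:07:10.000Z",
                    "process": {"pid": 2012, "name": "regsvr32.exe"},
                },
            },
        ],
    }
}

for timestamp, name in summarize_events(sample_response):
    print(timestamp, name)
```

The helper uses .get() throughout so that a response without hits yields an empty list instead of raising.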

Use the size parameter to get a smaller or larger set of hits:

GET /my-data-stream/_eql/search
{
  "query": """
    process where process.name == "regsvr32.exe"
  """,
  "size": 50
}
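
Outside the Console, the triple-quoted query becomes an ordinary JSON string. The sketch below builds, but does not send, an equivalent request using only the Python standard library. The helper name build_eql_request and the base URL http://localhost:9200 are assumptions for illustration, authentication is omitted, and the sketch uses POST, which the EQL search API accepts alongside GET.

```python
import json
import urllib.request


def build_eql_request(base_url, target, query, size=None):
    """Build (but do not send) an EQL search request."""
    body = {"query": query}
    if size is not None:
        body["size"] = size  # only include size when the caller sets it
    return urllib.request.Request(
        url=f"{base_url}/{target}/_eql/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_eql_request(
    "http://localhost:9200",  # assumed local cluster address
    "my-data-stream",
    'process where process.name == "regsvr32.exe"',
    size=50,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it. That step is left out here because it needs a running cluster.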

Search for a sequence of events

Use EQL’s sequence syntax to search for a series of ordered events. List the event items in ascending chronological order, with the most recent event listed last:

GET /my-data-stream/_eql/search
{
  "query": """
    sequence
      [ process where process.name == "regsvr32.exe" ]
      [ file where stringContains(file.name, "scrobj.dll") ]
  """
}

The response’s hits.sequences property contains the 10 most recent matching sequences.

{
  ...
  "hits": {
    "total": ...,
    "sequences": [
      {
        "events": [
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "OQmfCaduce8zoHT93o4H",
            "_source": {
              "@timestamp": "2099-12-07T11:07:09.000Z",
              "event": {
                "category": "process",
                "id": "aR3NWVOs",
                "sequence": 4
              },
              "process": {
                "pid": 2012,
                "name": "regsvr32.exe",
                "command_line": "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll",
                "executable": "C:\\Windows\\System32\\regsvr32.exe"
              }
            }
          },
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "yDwnGIJouOYGBzP0ZE9n",
            "_source": {
              "@timestamp": "2099-12-07T11:07:10.000Z",
              "event": {
                "category": "file",
                "id": "tZ1NWVOs",
                "sequence": 5
              },
              "process": {
                "pid": 2012,
                "name": "regsvr32.exe",
                "executable": "C:\\Windows\\System32\\regsvr32.exe"
              },
              "file": {
                "path": "C:\\Windows\\System32\\scrobj.dll",
                "name": "scrobj.dll"
              }
            }
          }
        ]
      }
    ]
  }
}

Use with maxspan to constrain matching sequences to a timespan:

GET /my-data-stream/_eql/search
{
  "query": """
    sequence with maxspan=1h
      [ process where process.name == "regsvr32.exe" ]
      [ file where stringContains(file.name, "scrobj.dll") ]
  """
}
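
To make the maxspan constraint concrete, the following Python sketch measures the time between the first and last event of a returned sequence and compares it with a one-hour span. The helper names are hypothetical, the sample events are trimmed from the sequence response shown earlier, and treating the bound as inclusive is an assumption of this sketch rather than a statement about how Elasticsearch evaluates it.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def sequence_span(events):
    """Return the time between the earliest and latest event in a sequence."""
    times = [
        datetime.strptime(event["_source"]["@timestamp"], TIMESTAMP_FORMAT)
        for event in events
    ]
    return max(times) - min(times)


def within_maxspan(events, maxspan):
    """Check a sequence against a span (inclusive bound is an assumption)."""
    return sequence_span(events) <= maxspan


# Timestamps trimmed from the sample sequence response.
sample_events = [
    {"_source": {"@timestamp": "2099-12-07T11:07:09.000Z"}},
    {"_source": {"@timestamp": "2099-12-07T11:07:10.000Z"}},
]

print(sequence_span(sample_events))
print(within_maxspan(sample_events, timedelta(hours=1)))
```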

Use the by keyword to match events that share the same field values:

GET /my-data-stream/_eql/search
{
  "query": """
    sequence with maxspan=1h
      [ process where process.name == "regsvr32.exe" ] by process.pid
      [ file where stringContains(file.name, "scrobj.dll") ] by process.pid
  """
}

If a field value should be shared across all events, use the sequence by keyword. The following query is equivalent to the previous one.

GET /my-data-stream/_eql/search
{
  "query": """
    sequence by process.pid with maxspan=1h
      [ process where process.name == "regsvr32.exe" ]
      [ file where stringContains(file.name, "scrobj.dll") ]
  """
}

The hits.sequences.join_keys property contains the shared field values.

{
  ...
  "hits": {
    "total": ...,
    "sequences": [
      {
        "join_keys": [
          2012
        ],
        "events": ...
      }
    ]
  }
}
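
One way to use join_keys on the client side is to group the returned sequences by their shared values. The Python sketch below does this for a response of the shape shown above. The helper name group_by_join_keys is hypothetical, and the sample response is abbreviated to event IDs only.

```python
from collections import defaultdict


def group_by_join_keys(response):
    """Map each tuple of join key values to the sequences that share it."""
    groups = defaultdict(list)
    for sequence in response.get("hits", {}).get("sequences", []):
        key = tuple(sequence.get("join_keys", []))
        groups[key].append(sequence.get("events", []))
    return dict(groups)


# Abbreviated sample: one sequence joined on process.pid == 2012.
sample_response = {
    "hits": {
        "sequences": [
            {
                "join_keys": [2012],
                "events": [
                    {"_id": "OQmfCaduce8zoHT93o4H"},
                    {"_id": "yDwnGIJouOYGBzP0ZE9n"},
                ],
            }
        ]
    }
}

for key, sequences in group_by_join_keys(sample_response).items():
    print(key, len(sequences))
```

The key is a tuple because a query can join on more than one field, in which case join_keys holds one value per field.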

Use the until keyword to specify an expiration event for sequences. Matching sequences must end before this event.

GET /my-data-stream/_eql/search
{
  "query": """
    sequence by process.pid with maxspan=1h
      [ process where process.name == "regsvr32.exe" ]
      [ file where stringContains(file.name, "scrobj.dll") ]
    until [ process where event.type == "termination" ]
  """
}

Retrieve selected fields

By default, each hit in the search response includes the document _source, which is the entire JSON object that was provided when indexing the document.

You can use the filter_path query parameter to filter the API response. For example, the following search returns only the timestamp and PID from the _source of each matching event.

GET /my-data-stream/_eql/search?filter_path=hits.events._source.@timestamp,hits.events._source.process.pid
{
  "query": """
    process where process.name == "regsvr32.exe"
  """
}

The API returns the following response.

{
  "hits": {
    "events": [
      {
        "_source": {
          "@timestamp": "2099-12-07T11:07:09.000Z",
          "process": {
            "pid": 2012
          }
        }
      },
      {
        "_source": {
          "@timestamp": "2099-12-07T11:07:10.000Z",
          "process": {
            "pid": 2012
          }
        }
      }
    ]
  }
}
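
When building such a request path in code, the filter_path values are joined with commas into a single query parameter. The following Python sketch shows one way to do that with the standard library. The helper name eql_search_path is hypothetical, and keeping @ and , unescaped is a readability choice of this sketch.

```python
from urllib.parse import urlencode


def eql_search_path(target, filter_paths):
    """Build an EQL search path with a comma-separated filter_path parameter."""
    query_string = urlencode(
        {"filter_path": ",".join(filter_paths)},
        safe="@,",  # leave '@' and ',' readable instead of percent-encoding them
    )
    return f"/{target}/_eql/search?{query_string}"


path = eql_search_path(
    "my-data-stream",
    ["hits.events._source.@timestamp", "hits.events._source.process.pid"],
)
print(path)
```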

You can also use the fields parameter to retrieve and format specific fields in the response. This parameter is identical to the search API’s fields parameter.

Because it consults the index mappings, the fields parameter provides several advantages over referencing the _source directly. Specifically, the fields parameter:

  • Returns each value in a standardized way that matches its mapping type
  • Accepts multi-fields and field aliases
  • Formats dates and spatial data types
  • Retrieves runtime field values
  • Returns fields calculated by a script at index time

The following search request uses the fields parameter to retrieve values for the event.type field, all fields starting with process., and the @timestamp field. The request also uses the filter_path query parameter to exclude the _source of each hit.

GET /my-data-stream/_eql/search?filter_path=-hits.events._source
{
  "query": """
    process where process.name == "regsvr32.exe"
  """,
  "fields": [
    "event.type",
    "process.*",
    {
      "field": "@timestamp",
      "format": "epoch_millis"
    }
  ]
}

In the fields array, both full field names and wildcard patterns, such as process.*, are accepted.

Use the format parameter to apply a custom format to a field’s values, as the request does with epoch_millis for @timestamp.

The response includes values as a flat list in the fields section for each hit.

{
  ...
  "hits": {
    "total": ...,
    "events": [
      {
        "_index": ".ds-my-data-stream-2099.12.07-000001",
        "_id": "OQmfCaduce8zoHT93o4H",
        "fields": {
          "process.name": [
            "regsvr32.exe"
          ],
          "process.name.keyword": [
            "regsvr32.exe"
          ],
          "@timestamp": [
            "4100324829000"
          ],
          "process.command_line": [
            "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll"
          ],
          "process.command_line.keyword": [
            "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll"
          ],
          "process.executable.keyword": [
            "C:\\Windows\\System32\\regsvr32.exe"
          ],
          "process.pid": [
            2012
          ],
          "process.executable": [
            "C:\\Windows\\System32\\regsvr32.exe"
          ]
        }
      },
      ....
    ]
  }
}
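
Because every entry in the fields section is a list, client code often unwraps the values before use. The Python sketch below takes the first value of each field and converts the epoch_millis timestamp back to a UTC datetime. The helper names are hypothetical, the sample is trimmed from the response above, and keeping only the first value assumes each field is single-valued.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def first_values(fields):
    """Unwrap the flat value lists, keeping the first value of each field."""
    return {name: values[0] for name, values in fields.items() if values}


def epoch_millis_to_datetime(value):
    """Convert an epoch_millis value (string or int) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=int(value))


# Trimmed copy of the fields section from the sample response.
sample_fields = {
    "process.name": ["regsvr32.exe"],
    "@timestamp": ["4100324829000"],
    "process.pid": [2012],
}

flat = first_values(sample_fields)
print(flat["process.pid"], flat["process.name"])
print(epoch_millis_to_datetime(flat["@timestamp"]).isoformat())
# → 2099-12-07T11:07:09+00:00
```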

Use runtime fields

Use the runtime_mappings parameter to extract and create runtime fields during a search. Use the fields parameter to include runtime fields in the response.

The following search creates a day_of_week runtime field from the @timestamp and returns it in the response.

GET /my-data-stream/_eql/search?filter_path=-hits.events._source
{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": "emit(doc['@timestamp'].value.dayOfWeekEnum.toString())"
    }
  },
  "query": """
    process where process.name == "regsvr32.exe"
  """,
  "fields": [
    "@timestamp",
    "day_of_week"
  ]
}

The API returns:

{
  ...
  "hits": {
    "total": ...,
    "events": [
      {
        "_index": ".ds-my-data-stream-2099.12.07-000001",
        "_id": "OQmfCaduce8zoHT93o4H",
        "fields": {
          "@timestamp": [
            "2099-12-07T11:07:09.000Z"
          ],
          "day_of_week": [
            "MONDAY"
          ]
        }
      },
      ....
    ]
  }
}
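
To sanity-check a runtime field like this on the client side, the same value can be recomputed locally. The Python sketch below mirrors what the script above emits for the sample timestamp. It is an independent illustration, not the Painless implementation, and the helper name day_of_week is only borrowed from the runtime field for readability.

```python
from datetime import datetime

DAY_NAMES = [
    "MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
    "FRIDAY", "SATURDAY", "SUNDAY",
]


def day_of_week(timestamp):
    """Return the upper-case day name for an ISO 8601 UTC timestamp."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return DAY_NAMES[parsed.weekday()]  # weekday(): Monday is 0


print(day_of_week("2099-12-07T11:07:09.000Z"))
# → MONDAY
```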

Specify a timestamp or event category field

The EQL search API uses the @timestamp and event.category fields from the ECS by default. To specify different fields, use the timestamp_field and event_category_field parameters:

GET /my-data-stream/_eql/search
{
  "timestamp_field": "file.accessed",
  "event_category_field": "file.type",
  "query": """
    file where (file.size > 1 and file.type == "file")
  """
}

The event category field must be mapped as a keyword family field type. The timestamp field should be mapped as a date field type. date_nanos timestamp fields are not supported. You cannot use a nested field or the sub-fields of a nested field as the timestamp or event category field.

Specify a sort tiebreaker

By default, the EQL search API returns matching hits by timestamp. If two or more events share the same timestamp, Elasticsearch uses a tiebreaker field value to sort the events in ascending order. Elasticsearch orders events with no tiebreaker value after events with a value.

If you don’t specify a tiebreaker field or the events also share the same tiebreaker value, Elasticsearch considers the events concurrent and may not return them in a consistent sort order.

To specify a tiebreaker field, use the tiebreaker_field parameter. If you use the ECS, we recommend using event.sequence as the tiebreaker field.

GET /my-data-stream/_eql/search
{
  "tiebreaker_field": "event.sequence",
  "query": """
    process where process.name == "cmd.exe" and stringContains(process.executable, "System32")
  """
}
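
The ordering rules above can be illustrated with a small local sort. The Python sketch below orders events by timestamp, then by tiebreaker value in ascending order, and places events without a tiebreaker value last. It is an illustration of the described ordering, not Elasticsearch's implementation, and the flat timestamp and sequence keys are hypothetical stand-ins for @timestamp and event.sequence.

```python
def sort_events(events, tiebreaker_field="sequence"):
    """Sort by timestamp, then tiebreaker; missing tiebreakers sort last."""
    def sort_key(event):
        tiebreaker = event.get(tiebreaker_field)
        return (
            event["timestamp"],
            tiebreaker is None,  # False sorts before True, so missing goes last
            tiebreaker if tiebreaker is not None else 0,
        )
    return sorted(events, key=sort_key)


# Hypothetical events: three share a timestamp, one has no tiebreaker value.
events = [
    {"id": "a", "timestamp": 1000},
    {"id": "b", "timestamp": 1000, "sequence": 6},
    {"id": "c", "timestamp": 1000, "sequence": 4},
    {"id": "d", "timestamp": 999, "sequence": 9},
]

print([event["id"] for event in sort_events(events)])
# → ['d', 'c', 'b', 'a']
```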

Filter using Query DSL

The filter parameter uses Query DSL to limit the documents on which an EQL query runs.

GET /my-data-stream/_eql/search
{
  "filter": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  },
  "query": """
    file where (file.type == "file" and file.name == "cmd.exe")
  """
}

Run an async EQL search

By default, EQL search requests are synchronous and wait for complete results before returning a response. However, complete results can take longer for searches across large data sets or frozen data.

To avoid long waits, run an async EQL search. Set wait_for_completion_timeout to a duration you’d like to wait for synchronous results.

GET /my-data-stream/_eql/search
{
  "wait_for_completion_timeout": "2s",
  "query": """
    process where process.name == "cmd.exe"
  """
}

If the request doesn’t finish within the timeout period, the search becomes async and returns a response that includes:

  • A search ID
  • An is_partial value of true, indicating the search results are incomplete
  • An is_running value of true, indicating the search is ongoing

The async search continues to run in the background without blocking other requests.

{
  "id": "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
  "is_partial": true,
  "is_running": true,
  "took": 2000,
  "timed_out": false,
  "hits": ...
}

To check the progress of an async search, use the get async EQL search API with the search ID. Specify how long you’d like to wait for complete results in the wait_for_completion_timeout parameter.

GET /_eql/search/FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=?wait_for_completion_timeout=2s

If the response’s is_running value is false, the async search has finished. If the is_partial value is false, the returned search results are complete.

{
  "id": "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
  "is_partial": false,
  "is_running": false,
  "took": 2000,
  "timed_out": false,
  "hits": ...
}

A more lightweight way to check the progress of an async search is to use the get async EQL status API with the search ID.

GET /_eql/search/status/FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=

The API returns:

{
  "id": "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
  "is_running": false,
  "is_partial": false,
  "expiration_time_in_millis": 1611690295000,
  "completion_status": 200
}
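
A client typically polls the status endpoint until is_running turns false. The Python sketch below shows one possible polling loop. The fetch_status callable is a hypothetical stand-in for whatever code calls the get async EQL status API, and it is replaced here by a fake that returns canned statuses so the example runs without a cluster.

```python
import time


def wait_for_eql_search(fetch_status, search_id, interval_seconds=1.0, max_attempts=30):
    """Poll until the search stops running, then return the final status."""
    for attempt in range(max_attempts):
        status = fetch_status(search_id)
        if not status["is_running"]:
            return status
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError(f"search {search_id} still running after {max_attempts} polls")


# Fake status source: running twice, then finished (canned data, not live output).
_canned_statuses = iter([
    {"is_running": True, "is_partial": True},
    {"is_running": True, "is_partial": True},
    {"is_running": False, "is_partial": False, "completion_status": 200},
])


def fake_fetch_status(search_id):
    return next(_canned_statuses)


final_status = wait_for_eql_search(fake_fetch_status, "example-id", interval_seconds=0)
print(final_status)
```

Injecting the fetch function keeps the loop independent of any particular HTTP client. The interval and attempt limit are arbitrary defaults chosen for the sketch.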

Change the search retention period

By default, the EQL search API stores async searches for five days. After this period, any searches and their results are deleted. Use the keep_alive parameter to change this retention period:

GET /my-data-stream/_eql/search
{
  "keep_alive": "2d",
  "wait_for_completion_timeout": "2s",
  "query": """
    process where process.name == "cmd.exe"
  """
}

You can use the get async EQL search API's keep_alive parameter to later change the retention period. The new retention period starts after the get request runs.

GET /_eql/search/FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=?keep_alive=5d

Use the delete async EQL search API to manually delete an async EQL search before the keep_alive period ends. If the search is still ongoing, Elasticsearch cancels the search request.

DELETE /_eql/search/FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=
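
As a rough illustration of how a retention period translates into an expiration time, the Python sketch below parses a keep_alive value such as 2d or 5d and adds it to the time of the request. The helper names are hypothetical, only the d, h, m, s, and ms units are handled, and computing the expiry as request time plus keep_alive is this sketch's reading of the text above rather than a description of the server's bookkeeping.

```python
import re
from datetime import datetime, timedelta, timezone

# Map time-unit suffixes to timedelta keyword arguments.
_UNITS = {
    "d": "days",
    "h": "hours",
    "m": "minutes",
    "s": "seconds",
    "ms": "milliseconds",
}


def parse_keep_alive(value):
    """Parse a duration such as '2d' or '500ms' into a timedelta."""
    match = re.fullmatch(r"(\d+)(ms|d|h|m|s)", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def expiration_time(request_time, keep_alive):
    """Estimate when a stored search expires (assumed: request time + keep_alive)."""
    return request_time + parse_keep_alive(keep_alive)


request_time = datetime(2099, 12, 7, 11, 7, 9, tzinfo=timezone.utc)
print(expiration_time(request_time, "5d").isoformat())
```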

Store synchronous EQL searches

By default, the EQL search API only stores async searches. To save a synchronous search, set keep_on_completion to true:

GET /my-data-stream/_eql/search
{
  "keep_on_completion": true,
  "wait_for_completion_timeout": "2s",
  "query": """
    process where process.name == "cmd.exe"
  """
}

The response includes a search ID. is_partial and is_running are false, indicating the EQL search was synchronous and returned complete results.

{
  "id": "FjlmbndxNmJjU0RPdExBTGg0elNOOEEaQk9xSjJBQzBRMldZa1VVQ2pPa01YUToxMDY=",
  "is_partial": false,
  "is_running": false,
  "took": 52,
  "timed_out": false,
  "hits": ...
}

Use the get async EQL search API to get the same results later:

GET /_eql/search/FjlmbndxNmJjU0RPdExBTGg0elNOOEEaQk9xSjJBQzBRMldZa1VVQ2pPa01YUToxMDY=

Saved synchronous searches are still subject to the keep_alive parameter’s retention period. When this period ends, the search and its results are deleted.

You can also check only the status of a saved synchronous search, without its results, by using the get async EQL status API.

You can also manually delete saved synchronous searches using the delete async EQL search API.

Run an EQL search across clusters

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

The EQL search API supports cross-cluster search. However, the local and remote clusters must use the same Elasticsearch version.

The following cluster update settings request adds two remote clusters: cluster_one and cluster_two.

PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": {
          "seeds": [
            "127.0.0.1:9300"
          ]
        },
        "cluster_two": {
          "seeds": [
            "127.0.0.1:9301"
          ]
        }
      }
    }
  }
}

To target a data stream or index on a remote cluster, use the <cluster>:<target> syntax.

GET /cluster_one:my-data-stream,cluster_two:my-data-stream/_eql/search
{
  "query": """
    process where process.name == "regsvr32.exe"
  """
}
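
When the list of remote targets is assembled in code, the <cluster>:<target> pairs are joined with commas. The Python sketch below builds the request path used above. The helper name remote_search_path is hypothetical.

```python
def remote_search_path(targets_by_cluster):
    """Build an EQL search path from (cluster, target) pairs."""
    parts = [f"{cluster}:{target}" for cluster, target in targets_by_cluster]
    return "/" + ",".join(parts) + "/_eql/search"


path = remote_search_path([
    ("cluster_one", "my-data-stream"),
    ("cluster_two", "my-data-stream"),
])
print(path)
# → /cluster_one:my-data-stream,cluster_two:my-data-stream/_eql/search
```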

EQL circuit breaker settings

When a sequence query runs, the node handling the query must keep in memory the data structures used by the sequence-matching algorithm. When the query processes large amounts of data, or the user requests a large number of matched sequences by setting the size parameter, these structures can exceed the memory available to the JVM. The resulting OutOfMemory exception would bring down the node.

To prevent this, a dedicated circuit breaker limits memory allocation during the execution of a sequence query. When the breaker is triggered, Elasticsearch throws an org.elasticsearch.common.breaker.CircuitBreakingException and returns a descriptive error message to the user.

This circuit breaker can be configured using the following settings:

breaker.eql_sequence.limit
    (Dynamic) The limit for the circuit breaker that restricts memory usage during the execution of an EQL sequence query. This value is defined as a percentage of the JVM heap. Defaults to 50%. If the parent circuit breaker is set to a value less than 50%, this setting uses that value as its default instead.
breaker.eql_sequence.overhead
    (Dynamic) A constant that sequence query memory estimates are multiplied by to determine a final estimate. Defaults to 1.
breaker.eql_sequence.type
    (Static) Circuit breaker type. Valid values are:
  • memory (Default)
    The breaker limits memory usage for EQL sequence queries.
  • noop
    Disables the breaker.
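
The arithmetic behind the limit and overhead settings can be sketched as follows. This Python snippet is an illustration under stated assumptions, not Elasticsearch's breaker code: the helper names are hypothetical, the heap size is an example value, and treating the breaker as tripping when the adjusted estimate strictly exceeds the limit is an assumption.

```python
GIB = 1024 ** 3


def breaker_limit_bytes(heap_bytes, limit_percent=50.0):
    """Translate a percentage of the JVM heap into a byte limit."""
    return int(heap_bytes * limit_percent / 100)


def would_trip(estimated_bytes, heap_bytes, limit_percent=50.0, overhead=1.0):
    """Return True if the overhead-adjusted estimate exceeds the limit."""
    adjusted_estimate = estimated_bytes * overhead  # estimate multiplied by overhead
    return adjusted_estimate > breaker_limit_bytes(heap_bytes, limit_percent)


heap = 4 * GIB  # example heap size, not a recommendation
print(breaker_limit_bytes(heap))             # default 50% of a 4 GiB heap
print(would_trip(1 * GIB, heap))             # 1 GiB estimate, overhead 1
print(would_trip(1 * GIB, heap, overhead=2.5))
```

With the default overhead of 1, a 1 GiB estimate stays under the 2 GiB limit. Raising the overhead to 2.5 pushes the adjusted estimate to 2.5 GiB, which exceeds it.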