WARNING: Version 5.4 of the Elastic Stack has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Sum Functions
editSum Functions
editThe sum functions detect anomalies when the sum of a field in a bucket is anomalous.
If you want to monitor unusually high totals, use high-sided functions.
If want to look at drops in totals, use low-sided functions.
If your data is sparse, use non_null_sum
functions. Buckets without values are
ignored; buckets with a zero value are analyzed.
The X-Pack machine learning features include the following sum functions:
Sum, High_sum, Low_sum
editThe sum
function detects anomalies where the sum of a field in a bucket is
anomalous.
If you want to monitor unusually high sum values, use the high_sum
function.
If you want to monitor unusually low sum values, use the low_sum
function.
These functions support the following properties:
-
field_name
(required) -
by_field_name
(optional) -
over_field_name
(optional) -
partition_field_name
(optional)
For more information about those properties, see Detector Configuration Objects.
Example 1: Analyzing total expenses with the sum function.
{ "function" : "sum", "field_name" : "expenses", "by_field_name" : "costcenter", "over_field_name" : "employee" }
If you use this sum
function in a detector in your job, it
models total expenses per employees for each cost center. For each time bucket,
it detects when an employee’s expenses are unusual for a cost center compared
to other employees.
Example 2: Analyzing total bytes with the high_sum function.
{ "function" : "high_sum", "field_name" : "cs_bytes", "over_field_name" : "cs_host" }
If you use this high_sum
function in a detector in your job, it
models total cs_bytes
. It detects cs_hosts
that transfer unusually high
volumes compared to other cs_hosts
. This example looks for volumes of data
transferred from a client to a server on the internet that are unusual compared
to other clients. This scenario could be useful to detect data exfiltration or
to find users that are abusing internet privileges.
Non_null_sum, High_non_null_sum, Low_non_null_sum
editThe non_null_sum
function is useful if your data is sparse. Buckets without
values are ignored and buckets with a zero value are analyzed.
If you want to monitor unusually high totals, use the high_non_null_sum
function.
If you want to look at drops in totals, use the low_non_null_sum
function.
These functions support the following properties:
-
field_name
(required) -
by_field_name
(optional) -
partition_field_name
(optional)
For more information about those properties, see Detector Configuration Objects.
Population analysis (that is to say, use of the over_field_name
property)
is not applicable for this function.
Example 3: Analyzing employee approvals with the high_non_null_sum function.
{ "function" : "high_non_null_sum", "fieldName" : "amount_approved", "byFieldName" : "employee" }
If you use this high_non_null_sum
function in a detector in your job, it
models the total amount_approved
for each employee. It ignores any buckets
where the amount is null. It detects employees who approve unusually high
amounts compared to their past behavior.