Watching Time Series Data

If you are indexing time-series data such as logs, RSS feeds, or network traffic, you can use Watcher to send notifications when certain events occur.

For example, you could index an RSS feed of posts on Stack Overflow that are tagged with Elasticsearch, Logstash, or Kibana, set up a watch to check daily for new posts about a problem or failure, and send an email if any are found.

The simplest way to index an RSS feed is to use Logstash.

To install Logstash and set up the RSS input plugin:

  1. Download Logstash 1.5.0 RC4+ and unpack the archive file.
  2. Go to the logstash-<logstash_version> directory and install the RSS input plugin:

    cd logstash-<logstash_version>
    bin/plugin install logstash-input-rss

  3. Create a Logstash configuration file that uses the RSS input plugin to get data from an RSS/Atom feed and outputs the data to Elasticsearch. For example, the following rss.conf file gets posts tagged with elasticsearch, logstash, or kibana from the Stack Overflow feed:

    input {
      rss {
        url => "http://stackoverflow.com/feeds/tag/elasticsearch+or+logstash+or+kibana"
        interval => 3600 
      }
    }
    
    output {
      elasticsearch { }
      stdout { }
    }

    The interval setting is in seconds, so a value of 3600 checks the feed every hour.

    For more information, see Elasticsearch output in the Logstash Reference.

  4. Run Logstash with the rss.conf config file to start indexing the feed:

    bin/logstash -f rss.conf
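The feed URL in rss.conf joins the tag names with `+or+`. If you want to watch a different set of tags, the URL can be generated rather than typed by hand. The following is a minimal Python sketch; `stackoverflow_feed_url` is a hypothetical helper, not part of Logstash, and it assumes the URL pattern shown in the configuration above.

```python
def stackoverflow_feed_url(tags):
    """Build a Stack Overflow tag-feed URL that matches any of the given tags.

    Hypothetical helper for illustration; the URL pattern is copied from
    the rss.conf example.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    return "http://stackoverflow.com/feeds/tag/" + "+or+".join(tags)


url = stackoverflow_feed_url(["elasticsearch", "logstash", "kibana"])
print(url)
```

The printed value can be pasted into the `url` setting of the rss input.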

Once you have Logstash set up to input data from the RSS feed into Elasticsearch, you can set up a watch that runs at noon each day to check for new posts that contain the words "error" or "problem".

To set up the watch, define a trigger, an input, a condition, and an action:

  1. Define the watch trigger, a daily schedule that runs at 12:00 UTC:

    "trigger" : {
      "schedule" : {
        "daily" : { "at" : "12:00" }
      }
    }

    Watcher schedule times are specified in UTC. Convert from your local time so the schedule triggers when you intend.

  2. Define the watch input, a search that uses a range filter to constrain the results to the past day:

    "input" : {
      "search" : {
        "request" : {
          "indices" : [ "logstash*" ],
          "body" : {
            "query" : {
              "filtered" : {
                "query" : {"match" : {"message": "error problem"}},
                "filter" : {
                  "range" : {"@timestamp" : {"gte" : "now-1d"}}
                }
              }
            }
          }
        }
      }
    }

  3. Define a watch condition that checks the payload to see whether the input search returned any hits. If it did, the condition resolves to true and the watch actions are performed.

    You define the condition with the following script:

    return ctx.payload.hits.total > threshold

    If you store the script in a file at $ES_HOME/config/scripts/threshold_hits.groovy, you can reference it by name in the watch condition. File-based Groovy scripts let you avoid enabling dynamic scripting. For more information, see Running Groovy Scripts without Dynamic Scripting.

    "condition" : {
      "script" : {
        "file" : "threshold_hits",
        "params" : {
          "threshold" : 0
        }
      }
    }

    The threshold parameter is the value passed to the script. With a threshold of 0, the condition is met when the search returns at least one hit.

    We recommend using file scripts when possible. To use inline or indexed scripts, you must enable dynamic scripting in Elasticsearch.

  4. Define a watch action to send an email that contains the relevant messages from the past day as an attachment.

    "actions" : {
      "send_email" : {
        "email" : {
          "to" : "<username>@<domainname>",
          "subject" : "Somebody needs help with ELK",
          "body" : "The attached Stack Overflow posts were tagged with Elasticsearch, Logstash, or Kibana and mentioned an error or problem.",
          "attach_data" : true
        }
      }
    }

    To use the email action, you must configure at least one email account in elasticsearch.yml. If you configure multiple email accounts, you need to specify which one you want to use in the email action. For more information, see Working with Various Email Services.
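Because the daily schedule in step 1 is expressed in UTC, it can help to compute the `at` value from the local time you have in mind. The sketch below is one way to do that in Python. The function name `utc_schedule_time` and the fixed UTC-4 offset are assumptions chosen for illustration, and a fixed offset does not follow daylight saving changes.

```python
from datetime import date, datetime, time, timedelta, timezone


def utc_schedule_time(local_hhmm, utc_offset):
    """Convert a local HH:MM time at a fixed UTC offset to HH:MM in UTC."""
    hour, minute = (int(part) for part in local_hhmm.split(":"))
    # The date is arbitrary here because a fixed offset never changes.
    local_dt = datetime.combine(
        date.today(), time(hour, minute), tzinfo=timezone(utc_offset)
    )
    return local_dt.astimezone(timezone.utc).strftime("%H:%M")


# Assumed example: 08:00 local time at UTC-4 is 12:00 UTC.
at = utc_schedule_time("08:00", timedelta(hours=-4))
trigger = {"schedule": {"daily": {"at": at}}}
print(trigger)
```

If your time zone observes daylight saving time, the UTC value shifts by an hour twice a year, so the schedule needs to be updated if the local trigger time matters.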

The complete watch looks like this:

PUT _watcher/watch/rss_watch
{
  "trigger" : {
    "schedule" : {
      "daily" : { "at" : "12:00" }
    }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "logstash*" ],
        "body" : {
          "query" : {
            "filtered" : {
              "query" : {"match" : {"message": "error problem"}},
              "filter" : {"range" : {"@timestamp" : {"gte" : "now-1d"}}}
            }
          }
        }
      }
    }
  },
  "condition" : {
    "script" : {
      "file" : "threshold_hits",
      "params" : {
        "threshold" : 0
      }
    }
  },
  "actions" : {
    "send_email" : {
      "email" : {
        "to" : "<username>@<domainname>",  
        "subject" : "Somebody needs help with ELK",
        "body" : "The attached Stack Overflow posts were tagged with Elasticsearch, Logstash, or Kibana and mentioned an error or problem.",
        "attach_data" : true
      }
    }
  }
}

Replace <username>@<domainname> with your email address to receive notifications.
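To see when this watch fires, it can help to emulate the condition outside Elasticsearch. The following Python sketch mirrors the Groovy expression `ctx.payload.hits.total > threshold`. The two payloads are made-up examples shaped like the hits section of a search response, not real results.

```python
def threshold_hits(payload, threshold):
    """Python equivalent of the Groovy condition script in the watch."""
    return payload["hits"]["total"] > threshold


# Hypothetical payloads for illustration only.
no_posts = {"hits": {"total": 0, "hits": []}}
two_posts = {"hits": {"total": 2, "hits": [{"_id": "a"}, {"_id": "b"}]}}

print(threshold_hits(no_posts, 0))   # False: no email is sent
print(threshold_hits(two_posts, 0))  # True: the email action runs
```

With the threshold set to 0, a single matching post in the past day is enough to trigger the email.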

To execute a watch immediately (without waiting for the schedule to trigger), use the _execute API:

POST _watcher/watch/rss_watch/_execute
{
  "ignore_condition" : true,
  "action_modes" : {
    "_all" : "force_execute"
  },
  "record_execution" : true
}
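The same request can be issued from a script. The sketch below builds it with Python's standard library. It assumes Elasticsearch is reachable at http://localhost:9200 without authentication; `build_execute_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no authentication


def build_execute_request(watch_id, base_url=ES_URL):
    """Build (but do not send) the _execute request for the given watch."""
    body = {
        "ignore_condition": True,
        "action_modes": {"_all": "force_execute"},
        "record_execution": True,
    }
    return urllib.request.Request(
        url=f"{base_url}/_watcher/watch/{watch_id}/_execute",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_execute_request("rss_watch")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` against a running cluster.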