Spike in AWS Error Messages

A machine learning job detected a significant spike in the rate of a particular error in the CloudTrail messages. Spikes in error messages may accompany attempts at privilege escalation, lateral movement, or discovery.

Rule type: machine_learning

Machine learning job: high_distinct_count_error_message

Machine learning anomaly threshold: 50

Severity: low

Risk score: 21

Runs every: 15 minutes

Searches indices from: now-60m (Date Math format, see also Additional look-back time)

Maximum alerts per execution: 100

References:

Tags:

  • Elastic
  • Cloud
  • AWS
  • ML

Version: 2 (version history)

Added (Elastic Stack release): 7.9.0

Last modified (Elastic Stack release): 7.10.0

Rule authors: Elastic

Rule license: Elastic License

Potential false positives

Spikes in error message activity can also be due to bugs in cloud automation scripts or workflows, changes to cloud automation scripts or workflows, adoption of new services, changes in the way services are used, or changes to IAM privileges.

Investigation guide

Alerts from this rule indicate a large spike in the number of CloudTrail log messages that contain a particular error message. The error message in question is associated with the response to an AWS API command or method call. Here are some possible avenues of investigation:

  • Examine the history of the error. Has it manifested before? If the error, which is visible in the aws.cloudtrail.error_message field, manifested only very recently, it might be related to recent changes in an automation module or script.
  • Examine the request parameters. These may provide indications as to the nature of the task being performed when the error occurred. Is the error related to unsuccessful attempts to enumerate or access objects, data or secrets? If so, this can sometimes be a byproduct of discovery, privilege escalation or lateral movement attempts.
  • Consider the user as identified by the user.name field. Is this activity part of an expected workflow for the user context? Examine the user identity in the aws.cloudtrail.user_identity.arn field and the access key ID in the aws.cloudtrail.user_identity.access_key_id field, which can help identify the precise user context. The user agent details in the user_agent.original field may also indicate what type of client made the request.
  • Consider the source IP address and geolocation for the calling user who issued the command. Do they look normal for the calling user? If the source is an EC2 IP address, is it associated with an EC2 instance in one of your accounts, or could it originate from an EC2 instance not under your control? If it is an authorized EC2 instance, is the activity associated with normal behavior for the instance role or roles? Are there any other alerts or signs of suspicious activity involving this instance?
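
The checks above can be sketched in a few lines of Python. This is a minimal illustration only, not part of the rule: it assumes the relevant CloudTrail events have already been exported as a list of dictionaries, that @timestamp values are ISO 8601 strings, and that the source address is stored in the ECS source.ip field (the guide itself does not name that field). The helper names get_field and summarize_error are hypothetical.

```python
from collections import Counter
from datetime import datetime


def get_field(event, dotted_name):
    """Read a field from an event stored flat ({"a.b": v}) or nested ({"a": {"b": v}})."""
    if dotted_name in event:
        return event[dotted_name]
    value = event
    for part in dotted_name.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def summarize_error(events, error_message):
    """Summarize when a given error occurred and which user contexts produced it."""
    matching = [
        e for e in events
        if get_field(e, "aws.cloudtrail.error_message") == error_message
    ]

    # Parse timestamps; events without a timestamp are skipped for the history.
    times = sorted(
        datetime.fromisoformat(ts.replace("Z", "+00:00"))
        for ts in (get_field(e, "@timestamp") for e in matching)
        if ts
    )
    per_day = Counter(t.date().isoformat() for t in times)

    def top(field, n=5):
        # Most common values of a field among the matching events.
        return Counter(get_field(e, field) for e in matching).most_common(n)

    return {
        "count": len(matching),
        # A very recent first_seen may point to a recent automation change.
        "first_seen": times[0].isoformat() if times else None,
        "last_seen": times[-1].isoformat() if times else None,
        "per_day": dict(sorted(per_day.items())),
        "users": top("user.name"),
        "identity_arns": top("aws.cloudtrail.user_identity.arn"),
        "access_key_ids": top("aws.cloudtrail.user_identity.access_key_id"),
        "source_ips": top("source.ip"),  # assumed ECS field name
        "user_agents": top("user_agent.original"),
    }
```

Calling summarize_error(events, "<error message from the alert>") returns the first and last occurrence, a per-day count, and the most frequent users, identities, access keys, source addresses, and user agents for that error. Request parameters and geolocation are not covered by this sketch and still need to be reviewed in the events themselves.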

Rule version history

Version 2 (7.10.0 release)
  • Formatting only