AWS Bedrock

Collect AWS Bedrock model invocation logs and runtime metrics with Elastic Agent.

Version: 0.6.0
Compatible Kibana version(s): 8.13.0 or higher
Supported Serverless project types: Security, Observability
Subscription level: Basic
Level of support: Elastic

Overview

AWS Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API. You can choose from a wide range of foundation models to find the model that is best suited for your use case. Amazon Bedrock also offers a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top foundation models for your use cases, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources.

The AWS Bedrock integration allows you to connect your Bedrock model invocation logging and runtime metrics to Elastic, enabling seamless collection of invocation logs and runtime metrics to monitor usage.

Elastic Security can leverage this data for security analytics including correlation, visualization and incident response. With invocation logging, you can collect the full request and response data, and any metadata associated with use of your account.

IMPORTANT: Extra AWS charges on AWS API requests will be generated by this integration. Please refer to the AWS integration for more details.

Compatibility

This integration is compatible with the AWS Bedrock ModelInvocationLog schema, version 1.0.

Data streams

The AWS Bedrock integration collects model invocation logs and runtime metrics.

Data streams:

  • invocation: Collects invocation logs, model input data, and model output data for all Amazon Bedrock model invocations in your AWS account.
  • runtime: Collects AWS Bedrock runtime metrics such as model invocation count, invocation latency, input token count, output token count, and more.

Requirements

You need Elasticsearch for storing and searching your data and Kibana for visualizing and managing it. You can use our hosted Elasticsearch Service on Elastic Cloud, which is recommended, or self-manage the Elastic Stack on your own hardware.

Before using any AWS Bedrock integration you will need:

  • AWS Credentials to connect with your AWS account.
  • AWS Permissions to make sure the user you're using to connect has permission to share the relevant data.

For more details about these requirements, please take a look at the AWS integration documentation.

  • Elastic Agent must be installed.
  • You can install only one Elastic Agent per host.
  • Elastic Agent is required to stream data from the S3 bucket and ship the data to Elastic, where the events will then be processed via the integration's ingest pipelines.

Installing and managing an Elastic Agent:

You have a few options for installing and managing an Elastic Agent:

Install a Fleet-managed Elastic Agent (recommended):

With this approach, you install Elastic Agent and use Fleet in Kibana to define, configure, and manage your agents in a central location. We recommend using Fleet management because it makes the management and upgrade of your agents considerably easier.

Install Elastic Agent in standalone mode (advanced users):

With this approach, you install Elastic Agent and manually configure the agent locally on the system where it is installed. You are responsible for managing and upgrading the agents. This approach is reserved for advanced users only.

Install Elastic Agent in a containerized environment:

You can run Elastic Agent inside a container, either with Fleet Server or standalone. Docker images for all versions of Elastic Agent are available from the Elastic Docker registry, and we provide deployment manifests for running on Kubernetes.

There are some minimum requirements for running Elastic Agent; for more information, refer to the Elastic Agent minimum requirements documentation.

The minimum kibana.version required is 8.13.0.

Setup

In order to use the AWS Bedrock model invocation logs, model invocation logging must be enabled and sent to a log store destination, either S3 or CloudWatch. The full details are available in the AWS Bedrock User Guide, but are outlined here.

  1. Set up an Amazon S3 or CloudWatch Logs destination.
  2. Enable logging. This can be done either through the AWS Bedrock console or the AWS Bedrock API, as shown in the sketch below.
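
For reference, enabling invocation logging through the API can look roughly like the following boto3 sketch. The bucket name and key prefix are placeholders, and the caller needs permission to manage the Bedrock logging configuration.

import boto3

# Sketch: enable Bedrock model invocation logging to an S3 destination.
# The bucket name and key prefix below are placeholders; adjust them for your account.
bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "s3Config": {
            "bucketName": "my-bedrock-invocation-logs",  # placeholder bucket
            "keyPrefix": "bedrock/",                      # placeholder prefix
        },
        "textDataDeliveryEnabled": True,
        "imageDataDeliveryEnabled": True,
        "embeddingDataDeliveryEnabled": True,
    }
)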

Collecting Bedrock model invocation logs from S3 bucket

When collecting logs from an S3 bucket is enabled, users can retrieve logs either from S3 objects pointed to by S3 notification events read from an SQS queue, or by directly polling the list of S3 objects in an S3 bucket.

The use of SQS notifications is preferred: polling the list of S3 objects is expensive in terms of performance and cost, and should be used only when no SQS notification can be attached to the S3 buckets. This input integration also supports S3 notifications delivered from SNS to SQS.

The SQS notification method is enabled by setting the queue_url configuration value. The S3 bucket list polling method is enabled by setting the bucket_arn configuration value and the number_of_workers value. queue_url and bucket_arn cannot both be set at the same time, and at least one of the two values must be set.
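
To illustrate attaching an SQS notification to the logging bucket, a minimal boto3 sketch is shown below; the bucket name, queue ARN, and key prefix are placeholders and must match the destination configured for your invocation logs.

import boto3

# Sketch: configure the S3 logging bucket to send ObjectCreated notifications
# to an SQS queue, which the integration can then poll via queue_url.
s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-bedrock-invocation-logs",  # placeholder bucket
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws:sqs:us-east-1:123456789012:bedrock-logs-queue",  # placeholder queue
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": "bedrock/"}  # placeholder prefix
                        ]
                    }
                },
            }
        ]
    },
)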

Collecting Bedrock model invocation logs from CloudWatch

When collecting logs from CloudWatch is enabled, users can retrieve logs from all log streams in a specific log group. The filterLogEvents AWS API is used to list log events from the specified log group.
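
For reference, a minimal boto3 sketch of an equivalent filterLogEvents call is shown below; the log group name is a placeholder and should be the group configured as your Bedrock logging destination.

import boto3

# Sketch: list recent Bedrock invocation log events from all streams in a log group,
# mirroring the filterLogEvents API the integration uses.
logs = boto3.client("logs", region_name="us-east-1")

paginator = logs.get_paginator("filter_log_events")
for page in paginator.paginate(logGroupName="/aws/bedrock/modelinvocations"):  # placeholder log group
    for event in page["events"]:
        print(event["logStreamName"], event["message"][:120])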

Exported fields

Field | Description | Type
@timestamp
Date/time when the event originated. This is the date/time extracted from the event, typically representing when the event was generated by the source. If the event source has no original timestamp, this value is typically populated by the first time the event was received by the pipeline. Required field for all events.
date
aws.cloudwatch.message
CloudWatch log message.
text
aws.s3.bucket.arn
ARN of the S3 bucket that this log was retrieved from.
keyword
aws.s3.bucket.name
Name of the S3 bucket that this log was retrieved from.
keyword
aws.s3.object.key
Name of the S3 object that this log was retrieved from.
keyword
aws_bedrock.invocation.artifacts
flattened
aws_bedrock.invocation.error
keyword
aws_bedrock.invocation.error_code
keyword
aws_bedrock.invocation.image_generation_config.cfg_scale
double
aws_bedrock.invocation.image_generation_config.height
long
aws_bedrock.invocation.image_generation_config.number_of_images
long
aws_bedrock.invocation.image_generation_config.quality
keyword
aws_bedrock.invocation.image_generation_config.seed
long
aws_bedrock.invocation.image_generation_config.width
long
aws_bedrock.invocation.image_variation_params.images
keyword
aws_bedrock.invocation.image_variation_params.text
keyword
aws_bedrock.invocation.images
keyword
aws_bedrock.invocation.input.input_body_json
flattened
aws_bedrock.invocation.input.input_body_json_massive_hash
keyword
aws_bedrock.invocation.input.input_body_json_massive_length
long
aws_bedrock.invocation.input.input_body_s3_path
keyword
aws_bedrock.invocation.input.input_content_type
keyword
aws_bedrock.invocation.input.input_token_count
long
aws_bedrock.invocation.model_id
keyword
aws_bedrock.invocation.output.completion_text
The formatted LLM text model responses. Only a limited number of LLM text models are supported.
text
aws_bedrock.invocation.output.output_body_json
flattened
aws_bedrock.invocation.output.output_body_s3_path
keyword
aws_bedrock.invocation.output.output_content_type
keyword
aws_bedrock.invocation.output.output_token_count
long
aws_bedrock.invocation.request_id
keyword
aws_bedrock.invocation.result
keyword
aws_bedrock.invocation.schema_type
keyword
aws_bedrock.invocation.schema_version
keyword
aws_bedrock.invocation.task_type
keyword
cloud.image.id
Image ID for the cloud instance.
keyword
data_stream.dataset
The field can contain anything that makes sense to signify the source of the data. Examples include nginx.access, prometheus, endpoint etc. For data streams that otherwise fit, but that do not have dataset set we use the value "generic" for the dataset value. event.dataset should have the same value as data_stream.dataset. Beyond the Elasticsearch data stream naming criteria noted above, the dataset value has additional restrictions: * Must not contain - * No longer than 100 characters
constant_keyword
data_stream.namespace
A user defined namespace. Namespaces are useful to allow grouping of data. Many users already organize their indices this way, and the data stream naming scheme now provides this best practice as a default. Many users will populate this field with default. If no value is used, it falls back to default. Beyond the Elasticsearch index naming criteria noted above, namespace value has the additional restrictions: * Must not contain - * No longer than 100 characters
constant_keyword
data_stream.type
An overarching type for the data stream. Currently allowed values are "logs" and "metrics". We expect to also add "traces" and "synthetics" in the near future.
constant_keyword
event.dataset
Event dataset
constant_keyword
event.module
Name of the module this data is coming from. If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), event.module should contain the name of this module.
constant_keyword
gen_ai.analysis.action_recommended
Recommended actions based on the analysis.
keyword
gen_ai.analysis.findings
Detailed findings from security tools.
nested
gen_ai.analysis.function
Name of the security or analysis function used.
keyword
gen_ai.analysis.tool_names
Name of the security or analysis tools used.
keyword
gen_ai.completion
The full text of the LLM's response.
text
gen_ai.compliance.request_triggered
Lists compliance-related filters that were triggered during the processing of the request, such as data privacy filters or regulatory compliance checks.
keyword
gen_ai.compliance.response_triggered
Lists compliance-related filters that were triggered during the processing of the response, such as data privacy filters or regulatory compliance checks.
keyword
gen_ai.compliance.violation_code
Code identifying the specific compliance rule that was violated.
keyword
gen_ai.compliance.violation_detected
Indicates if any compliance violation was detected during the interaction.
boolean
gen_ai.owasp.description
Description of the OWASP risk triggered.
text
gen_ai.owasp.id
Identifier for the OWASP risk addressed.
keyword
gen_ai.performance.request_size
Size of the request payload in bytes.
long
gen_ai.performance.response_size
Size of the response payload in bytes.
long
gen_ai.performance.response_time
Time taken by the LLM to generate a response in milliseconds.
long
gen_ai.performance.start_response_time
Time taken by the LLM to send first response byte in milliseconds.
long
gen_ai.policy.action
Action taken due to a policy violation, such as blocking, alerting, or modifying the content.
keyword
gen_ai.policy.confidence
Confidence level in the policy match that triggered the action, quantifying how closely the identified content matched the policy criteria.
keyword
gen_ai.policy.match_detail.*
object
gen_ai.policy.name
Name of the specific policy that was triggered.
keyword
gen_ai.policy.violation
Specifies if a security policy was violated.
boolean
gen_ai.prompt
The full text of the user's request to the gen_ai.
text
gen_ai.request.id
Unique identifier for the LLM request.
keyword
gen_ai.request.max_tokens
Maximum number of tokens the LLM generates for a request.
integer
gen_ai.request.model.description
Description of the LLM model.
keyword
gen_ai.request.model.id
Unique identifier for the LLM model.
keyword
gen_ai.request.model.instructions
Custom instructions for the LLM model.
text
gen_ai.request.model.role
Role of the LLM model in the interaction.
keyword
gen_ai.request.model.type
Type of LLM model.
keyword
gen_ai.request.model.version
Version of the LLM model used to generate the response.
keyword
gen_ai.request.temperature
Temperature setting for the LLM request.
float
gen_ai.request.timestamp
Timestamp when the request was made.
date
gen_ai.request.top_k
The top_k sampling setting for the LLM request.
float
gen_ai.request.top_p
The top_p sampling setting for the LLM request.
float
gen_ai.response.error_code
Error code returned in the LLM response.
keyword
gen_ai.response.finish_reasons
Reason the LLM response stopped.
keyword
gen_ai.response.id
Unique identifier for the LLM response.
keyword
gen_ai.response.model
Name of the LLM a response was generated from.
keyword
gen_ai.response.timestamp
Timestamp when the response was received.
date
gen_ai.security.hallucination_consistency
Consistency check between multiple responses.
float
gen_ai.security.jailbreak_score
Measures similarity to known jailbreak attempts.
float
gen_ai.security.prompt_injection_score
Measures similarity to known prompt injection attacks.
float
gen_ai.security.refusal_score
Measures similarity to known LLM refusal responses.
float
gen_ai.security.regex_pattern_count
Counts occurrences of strings matching user-defined regex patterns.
integer
gen_ai.sentiment.content_categories
Categories of content identified as sensitive or requiring moderation.
keyword
gen_ai.sentiment.content_inappropriate
Whether the content was flagged as inappropriate or sensitive.
boolean
gen_ai.sentiment.score
Sentiment analysis score.
float
gen_ai.sentiment.toxicity_score
Toxicity analysis score.
float
gen_ai.system
Name of the LLM foundation model vendor.
keyword
gen_ai.text.complexity_score
Evaluates the complexity of the text.
float
gen_ai.text.readability_score
Measures the readability level of the text.
float
gen_ai.text.similarity_score
Measures the similarity between the prompt and response.
float
gen_ai.threat.action
Recommended action to mitigate the detected security threat.
keyword
gen_ai.threat.category
Category of the detected security threat.
keyword
gen_ai.threat.description
Description of the detected security threat.
text
gen_ai.threat.detected
Whether a security threat was detected.
boolean
gen_ai.threat.risk_score
Numerical score indicating the potential risk associated with the response.
float
gen_ai.threat.signature
Signature of the detected security threat.
keyword
gen_ai.threat.source
Source of the detected security threat.
keyword
gen_ai.threat.type
Type of threat detected in the LLM interaction.
keyword
gen_ai.threat.yara_matches
Stores results from YARA scans including rule matches and categories.
nested
gen_ai.usage.completion_tokens
Number of tokens in the LLM's response.
integer
gen_ai.usage.prompt_tokens
Number of tokens in the user's request.
integer
gen_ai.user.id
Unique identifier for the user.
keyword
gen_ai.user.rn
Unique resource name for the user.
keyword
host.containerized
If the host is a container.
boolean
host.os.build
OS build information.
keyword
host.os.codename
OS codename, if any.
keyword
input.type
Type of Filebeat input.
keyword
log.offset
Log offset
long

Metrics

Runtime Metrics

AWS Bedrock runtime metrics include Invocations, InvocationLatency, InvocationClientErrors, InvocationServerErrors, OutputTokenCount, OutputImageCount, and InvocationThrottles. These metrics can be used for a variety of use cases, including:

  • Comparing model latency.
  • Measuring input and output token counts.
  • Detecting the number of invocations that the system throttled.
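
For reference, a runtime metric such as Invocations can be queried directly from CloudWatch with a boto3 sketch like the following; the region, model ID, and period are example values.

import boto3
from datetime import datetime, timedelta, timezone

# Sketch: query the AWS/Bedrock Invocations metric for the last hour,
# in 5-minute buckets matching the integration's default collection period.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

now = datetime.now(timezone.utc)
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Bedrock",
    MetricName="Invocations",
    Dimensions=[{"Name": "ModelId", "Value": "anthropic.claude-3-sonnet-20240229-v1:0"}],  # example model
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Sum"],
)
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])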

An example event for runtime looks like the following:

{
    "@timestamp": "2024-07-15T07:35:00.000Z",
    "agent": {
        "ephemeral_id": "63673811-d18c-4209-8818-df8b346bcb28",
        "id": "47a2173f-3f59-4a7c-a022-dee86802c2c1",
        "name": "service-integration-dev-idc-1",
        "type": "metricbeat",
        "version": "8.13.4"
    },
    "aws": {
        "cloudwatch": {
            "namespace": "AWS/Bedrock"
        }
    },
    "aws_bedrock": {
        "runtime": {
            "input_token_count": 848,
            "invocation_latency": 2757,
            "invocations": 5,
            "output_token_count": 1775
        }
    },
    "cloud": {
        "account": {
            "id": "00000000000000",
            "name": "MonitoringAccount"
        },
        "provider": "aws",
        "region": "ap-south-1"
    },
    "data_stream": {
        "dataset": "aws_bedrock.runtime",
        "namespace": "ep",
        "type": "metrics"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "elastic_agent": {
        "id": "47a2173f-3f59-4a7c-a022-dee86802c2c1",
        "snapshot": false,
        "version": "8.13.4"
    },
    "event": {
        "agent_id_status": "verified",
        "dataset": "aws_bedrock.runtime",
        "duration": 174434808,
        "ingested": "2024-07-15T07:44:02Z",
        "module": "aws"
    },
    "host": {
        "architecture": "x86_64",
        "containerized": false,
        "hostname": "service-integration-dev-idc-1",
        "id": "1bfc9b2d8959f75a520a3cb94cf035c8",
        "ip": [
            "10.160.0.4",
            "172.1.0.1",
            "172.17.0.1",
            "172.19.0.1",
            "172.20.0.1",
            "172.22.0.1",
            "172.23.0.1",
            "172.26.0.1",
            "172.27.0.1",
            "172.28.0.1",
            "172.29.0.1",
            "172.30.0.1",
            "172.31.0.1",
            "192.168.0.1",
            "192.168.32.1",
            "192.168.49.1",
            "192.168.80.1",
            "192.168.224.1",
            "fe80::42:9cff:fe5b:79b4",
            "fe80::42:a5ff:fe15:d63c",
            "fe80::42:beff:fe39:f457",
            "fe80::42a:f7ff:fe6c:421d",
            "fe80::1818:53ff:fea8:3f38",
            "fe80::4001:aff:fea0:4",
            "fe80::8cfa:3aff:fedb:656a",
            "fe80::c890:29ff:fe99:ac1b",
            "fe80::fcfc:c2ff:feca:1e28"
        ],
        "mac": [
            "02-42-0D-A6-43-C0",
            "02-42-23-32-CF-25",
            "02-42-27-90-E6-54",
            "02-42-34-10-CA-62",
            "02-42-4F-1D-94-1B",
            "02-42-50-2E-CB-58",
            "02-42-5D-42-F3-1D",
            "02-42-66-9B-25-B2",
            "02-42-99-B7-1B-26",
            "02-42-9C-5B-79-B4",
            "02-42-A5-15-D6-3C",
            "02-42-A6-68-F8-E9",
            "02-42-BE-39-F4-57",
            "02-42-CE-31-B7-A3",
            "02-42-E8-F3-CF-7A",
            "02-42-F1-35-B0-41",
            "02-42-F4-2F-0F-22",
            "06-2A-F7-6C-42-1D",
            "1A-18-53-A8-3F-38",
            "42-01-0A-A0-00-04",
            "8E-FA-3A-DB-65-6A",
            "CA-90-29-99-AC-1B",
            "FE-FC-C2-CA-1E-28"
        ],
        "name": "service-integration-dev-idc-1",
        "os": {
            "codename": "bionic",
            "family": "debian",
            "kernel": "5.4.0-1106-gcp",
            "name": "Ubuntu",
            "platform": "ubuntu",
            "type": "linux",
            "version": "18.04.6 LTS (Bionic Beaver)"
        }
    },
    "metricset": {
        "name": "cloudwatch",
        "period": 300000
    },
    "service": {
        "type": "aws"
    }
}

Exported fields

Field | Description | Type | Unit | Metric Type
@timestamp
Event timestamp.
date
agent.id
Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id.
keyword
aws.cloudwatch.namespace
The namespace specified when querying the CloudWatch API.
keyword
aws_bedrock.runtime.bucketed_step_size
keyword
aws_bedrock.runtime.image_size
keyword
aws_bedrock.runtime.input_token_count
The number of text input tokens.
long
gauge
aws_bedrock.runtime.invocation_client_errors
The number of invocations that result in client-side errors.
long
gauge
aws_bedrock.runtime.invocation_latency
The average latency of the invocations.
long
ms
gauge
aws_bedrock.runtime.invocation_server_errors
The number of invocations that result in AWS server-side errors.
long
gauge
aws_bedrock.runtime.invocation_throttles
The number of invocations that the system throttled.
long
gauge
aws_bedrock.runtime.invocations
The number of requests to the Converse, ConverseStream, InvokeModel, and InvokeModelWithResponseStream API operations.
long
gauge
aws_bedrock.runtime.legacymodel_invocations
The number of requests to the legacy models.
long
gauge
aws_bedrock.runtime.model_id
keyword
aws_bedrock.runtime.output_image_count
The number of output images.
long
gauge
aws_bedrock.runtime.output_token_count
The number of text output tokens.
long
gauge
aws_bedrock.runtime.quality
keyword
cloud.account.id
The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.
keyword
cloud.region
Region in which this host, resource, or service is located.
keyword
data_stream.dataset
Data stream dataset.
constant_keyword
data_stream.namespace
Data stream namespace.
constant_keyword
data_stream.type
Data stream type.
constant_keyword
event.module
Name of the module this data is coming from. If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), event.module should contain the name of this module.
constant_keyword

Changelog

Version | Details | Kibana version(s)

0.6.0

Enhancement View pull request
Add new field aws_bedrock.invocation.output.completion_text containing the LLM text model response. Add visualization for LLM prompt and response.

0.5.0

Enhancement View pull request
Add processor to set cloud.account.name field for aws_bedrock runtime data stream.

0.4.0

Enhancement View pull request
Add dot_expander processor into metrics ingest pipeline.

0.3.0

Enhancement View pull request
Add runtime dataset for collecting runtime metrics.

0.2.0

Enhancement View pull request
Update the kibana constraint to ^8.13.0. Modified the field definitions to remove ECS fields made redundant by the ecs@mappings component template.

0.1.3

Bug fix View pull request
Fix name canonicalization routines.

0.1.2

Bug fix View pull request
Add documentation image.

0.1.1

Bug fix View pull request
Fix documentation markdown.

0.1.0

Enhancement View pull request
Initial build.
