Custom Azure Logs


Version

0.1.0 [beta] This functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.

Compatible Kibana version(s)

8.13.0 or higher

Supported Serverless project types

Security
Observability

Subscription level

Basic

Level of support

Elastic

The Custom Azure Logs integration collects logs from Azure Event Hubs.

Use the integration to collect logs from:

  • Azure services that support exporting logs to Event Hubs
  • Any other source that can send logs to an event hub

Data streams


The Custom Azure Logs integration only supports logs data streams.

This custom integration does not use a predefined Elastic data stream like standard integrations do (for example, logs-azure.activitylogs-default for Activity logs). You can take control and build your own data stream by selecting your dataset and namespace of choice when configuring the integration.

For example, if you select mydataset as your dataset, and default as your namespace, the integration will send the data to the logs-mydataset-default data stream.

The integration sets up a dedicated index template named logs-mydataset with the logs-mydataset-* index pattern. You can then customize it using a custom pipeline and custom mappings.
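
As a rough illustration of the naming scheme described above, the following sketch derives the data stream name and the index pattern from a dataset and a namespace. The function names are hypothetical and not part of any Elastic API; only the logs-<dataset>-<namespace> layout comes from this page.

```python
def data_stream_name(dataset: str, namespace: str = "default") -> str:
    """Return the logs data stream name for a dataset and namespace."""
    return f"logs-{dataset}-{namespace}"


def index_template_pattern(dataset: str) -> str:
    """Return the index pattern matched by the dataset's index template."""
    return f"logs-{dataset}-*"


# Example using the dataset and namespace from the text above.
print(data_stream_name("mydataset", "default"))
print(index_template_pattern("mydataset"))
```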

Custom Logs integrations give you the flexibility to tailor data collection to your needs.

Requirements


You need Elasticsearch to store and search for your data and Kibana to visualize and manage it. You can use our recommended hosted Elasticsearch Service on Elastic Cloud or self-manage the Elastic Stack on your own hardware.

Before using the Custom Azure Logs integration, you will need:

  • One event hub to store in-flight logs exported by Azure services (or other sources) and make them available to Elastic Agent.
  • A Storage Account to store checkpoint information about logs the Elastic Agent consumes.
Event hub

Azure Event Hubs is a data streaming platform and event ingestion service that can receive and temporarily store millions of events.

Elastic Agent with the Custom Azure Logs integration will consume logs from the Event Hubs service.

  ┌────────────────┐      ┌───────────┐
  │   myeventhub   │      │  Elastic  │
  │ <<Event Hub>>  │─────▶│   Agent   │
  └────────────────┘      └───────────┘

To learn more about Event Hubs, refer to Features and terminology in Azure Event Hubs.

Storage Account Container

The Storage Account is a versatile Azure service that allows you to store data in various storage types, including blobs, file shares, queues, tables, and disks.

The Custom Azure Logs integration requires a Storage Account container to work.

The integration uses the Storage Account container for checkpointing. It stores data about the Consumer Group (state, position, or offset) and shares it among the Elastic Agents. Sharing this information allows multiple Elastic Agents assigned to the same agent policy to work together, enabling horizontal scaling of log processing when required.

  ┌────────────────┐                     ┌───────────┐
  │   myeventhub   │        logs         │  Elastic  │
  │ <<event hub>>  │────────────────────▶│   Agent   │
  └────────────────┘                     └───────────┘
                                                │
                       consumer group info      │
  ┌────────────────┐   (state, position, or     │
  │ log-myeventhub │         offset)            │
  │ <<container>>  │◀───────────────────────────┘
  └────────────────┘

The Elastic Agent automatically creates one container for the Custom Azure Logs integration and one blob for each partition on the event hub.

For example, if the integration is configured to fetch data from an event hub with four partitions, the Agent will create the following:

  • One Storage Account container.
  • Four blobs in that container.

The information stored in the blobs is small (usually < 500 bytes per blob) and accessed frequently. Elastic recommends using the Hot storage tier.

Keep the Storage Account container for as long as you run the integration with the Elastic Agent. If you delete the container, the Elastic Agent stops working and creates a new one the next time it starts.

After the container is deleted, the Elastic Agent loses track of the last message processed and starts processing messages from the beginning of the event hub retention period.

Setup


Before adding the integration, complete the following tasks.

Create an event hub

The event hub receives the logs exported from the Azure service and makes them available for the Elastic Agent to read.

Here’s a high-level overview of the required steps:

  • Create a resource group, or select an existing one.
  • Create an Event Hubs namespace.
  • Create an event hub.

For a step-by-step guide, check the quickstart Create an event hub using Azure portal.

Take note of the event hub Name, which you will use later when specifying an eventhub in the integration settings.

Event Hubs namespace vs event hub

In the integration settings, you should use the event hub name (not the Event Hubs namespace name) as the value for the event hub option.

If you are new to Event Hubs, think of the Event Hubs namespace as the cluster and the event hub as the topic. You will typically have one cluster and multiple topics.

If you are familiar with Kafka, here’s a conceptual mapping between the two:

  Kafka Concept     Event Hubs Concept
  Cluster           Namespace
  Topic             Event hub
  Partition         Partition
  Consumer Group    Consumer group
  Offset            Offset

How many partitions?

Choosing the right number of partitions is essential to balance event hub cost and performance.

Here are a few examples with one or multiple agents, with recommendations on picking the correct number of partitions for your use case.

Single Agent

With a single Agent deployment, increasing the number of partitions on the event hub is the primary way to scale up performance. The Agent creates one worker for each partition.

┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐

│                         │    │                         │

│   ┌─────────────────┐   │    │   ┌─────────────────┐   │
    │   partition 0   │◀───────────│     worker      │
│   └─────────────────┘   │    │   └─────────────────┘   │
    ┌─────────────────┐            ┌─────────────────┐
│   │   partition 1   │◀──┼────┼───│     worker      │   │
    └─────────────────┘            └─────────────────┘
│   ┌─────────────────┐   │    │   ┌─────────────────┐   │
    │   partition 2   │◀────────── │     worker      │
│   └─────────────────┘   │    │   └─────────────────┘   │
    ┌─────────────────┐            ┌─────────────────┐
│   │   partition 3   │◀──┼────┼───│     worker      │   │
    └─────────────────┘            └─────────────────┘
│                         │    │                         │

│                         │    │                         │

└ Event hub ─ ─ ─ ─ ─ ─ ─ ┘    └ Elastic Agent ─ ─ ─ ─ ─ ┘
Two or more Elastic Agents

With more than one Elastic Agent, setting the number of partitions is crucial. The agents share the existing partitions to scale out performance and improve availability.

The number of partitions must be at least the number of agents.

┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐

│                         │    │   ┌─────────────────┐   │
                            ┌──────│     worker      │
│   ┌─────────────────┐   │ │  │   └─────────────────┘   │
    │   partition 0   │◀────┘      ┌─────────────────┐
│   └─────────────────┘   │ ┌──┼───│     worker      │   │
    ┌─────────────────┐     │      └─────────────────┘
│   │   partition 1   │◀──┼─┘  │                         │
    └─────────────────┘         ─Agent─ ─ ─ ─ ─ ─ ─ ─ ─ ─
│   ┌─────────────────┐   │    ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
    │   partition 2   │◀────┐
│   └─────────────────┘   │ │  │  ┌─────────────────┐    │
    ┌─────────────────┐     └─────│     worker      │
│   │   partition 3   │◀──┼─┐  │  └─────────────────┘    │
    └─────────────────┘     │     ┌─────────────────┐
│                         │ └──┼──│     worker      │    │
                                  └─────────────────┘
│                         │    │                         │

└ Event hub ─ ─ ─ ─ ─ ─ ─ ┘    └ Elastic Agent ─ ─ ─ ─ ─ ┘
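
To make the relationship between partitions and agents concrete, here is a minimal sketch that spreads partitions across agents round-robin. It is an illustration only: the function name is hypothetical, and the real assignment is coordinated by the agents at run time through the shared checkpoint store, so the actual layout can differ from the one computed here.

```python
def distribute_partitions(partition_count: int, agent_count: int) -> dict[int, list[int]]:
    """Assign partition IDs to agents round-robin (illustrative only)."""
    if partition_count < 1 or agent_count < 1:
        raise ValueError("partition_count and agent_count must be positive")
    assignment: dict[int, list[int]] = {agent: [] for agent in range(agent_count)}
    for partition in range(partition_count):
        assignment[partition % agent_count].append(partition)
    return assignment


# Four partitions shared by two agents: each agent runs two workers.
print(distribute_partitions(4, 2))

# Two partitions and three agents: one agent has nothing to read,
# which is why the partition count should be at least the agent count.
print(distribute_partitions(2, 3))
```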
Recommendations

Create an event hub with at least two partitions. Two partitions allow a low-volume deployment to support high availability with two agents. Consider creating four or more partitions to handle medium-volume deployments with high availability.

To learn more about event hub partitions, check this guide from Microsoft at https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-create.

To learn more about event hub partitions from a performance perspective, check the scalability-focused document at https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-scalability#partitions.

Consumer group

Like all other event hub clients, Elastic Agent needs a consumer group name to access the event hub.

A consumer group is a view (state, position, or offset) of an entire event hub. Consumer groups enable multiple agents to each have a separate view of the event stream and to read the logs independently, at their own pace and with their own offsets.

Consumer groups allow multiple Elastic Agents assigned to the same agent policy to work together, enabling horizontal scaling of log processing when required.

In most cases, you can use the default consumer group named $Default. If $Default is already used by other applications, you can create a consumer group dedicated to the Custom Azure Logs integration.

Connection string

The Elastic Agent requires a connection string to access the event hub and fetch the exported logs. The connection string contains details about the event hub used and the credentials required to access it.

To get the connection string for your Event Hubs namespace:

  1. Visit the Event Hubs namespace you created in a previous step.
  2. Select Settings > Shared access policies.

Create a new Shared Access Policy (SAS):

  1. Select Add to open the creation panel.
  2. Add a Policy name (for example, "ElasticAgent").
  3. Select the Listen claim.
  4. Click Create.

When the SAS Policy is ready, select it to display the information panel.

Take note of the Connection string–primary key, which you will use later when specifying a connection_string in the integration settings.
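
Before pasting the connection string into the integration settings, it can help to check that it has the expected parts. The sketch below splits a connection string into its key/value segments. The function name and all values are placeholders, not real credentials. A namespace-level policy typically yields Endpoint, SharedAccessKeyName, and SharedAccessKey; a policy created on a single event hub also includes EntityPath.

```python
def parse_connection_string(connection_string: str) -> dict[str, str]:
    """Split an Event Hubs connection string into its key/value segments."""
    parts: dict[str, str] = {}
    for segment in connection_string.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: base64 keys may end with "=" padding.
        key, separator, value = segment.partition("=")
        if not separator or not key:
            raise ValueError(f"malformed segment: {segment!r}")
        parts[key] = value
    return parts


# Placeholder values for illustration only.
example = (
    "Endpoint=sb://mynamespace.servicebus.windows.net/;"
    "SharedAccessKeyName=ElasticAgent;"
    "SharedAccessKey=PLACEHOLDERKEY="
)
parsed = parse_connection_string(example)
print(parsed["Endpoint"])
print(parsed["SharedAccessKeyName"])
```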

Create a diagnostic setting

The diagnostic settings export the logs from Azure services to a destination. To use the Custom Azure Logs integration, the destination must be an event hub.

To create a diagnostic setting to export logs:

  1. Locate the diagnostic settings for the service (for example, Microsoft Entra ID).
  2. Select diagnostic settings in the Monitoring section of the service. Note that different services might place the diagnostic settings in various positions.
  3. Select Add diagnostic settings.

In the diagnostic settings page, you must select the source log categories you want to export and then select their destination.

Select log categories

Each Azure service exports a well-defined list of log categories. Refer to the individual integration documentation for the supported log categories.

Select the destination

Select the subscription and the Event Hubs namespace you previously created. Select the event hub dedicated to this integration.

  ┌───────────────┐   ┌──────────────┐   ┌───────────────┐      ┌───────────┐
  │  MS Entra ID  │   │  Diagnostic  │   │     adlogs    │      │  Elastic  │
  │  <<service>>  ├──▶│   Settings   │──▶│ <<event hub>> │─────▶│   Agent   │
  └───────────────┘   └──────────────┘   └───────────────┘      └───────────┘
Create a Storage Account Container

The Elastic Agent stores the consumer group information (state, position, or offset) in a Storage Account container. Making this information available to all agents allows them to share log processing and resume from the last processed log after a restart.

Use the Storage Account as a checkpoint store only.

To create the Storage Account:

  1. Sign in to the Azure Portal and create your Storage Account.
  2. While configuring your project details, make sure you select the following recommended default settings:

    • Hierarchical namespace: disabled
    • Minimum TLS version: Version 1.2
    • Access tier: Hot
    • Enable soft delete for blobs: disabled
    • Enable soft delete for containers: disabled
  3. When the new Storage Account is ready, take note of the Storage Account name and access keys, as you will use them later to authenticate your Elastic application’s requests to this Storage Account.

This is the final diagram of the setup for collecting logs from an Azure service (Microsoft Entra ID in this example).

 ┌───────────────┐   ┌──────────────┐   ┌────────────────┐         ┌───────────┐
 │  MS Entra ID  │   │  Diagnostic  │   │     adlogs     │  logs   │  Elastic  │
 │  <<service>>  ├──▶│   Settings   │──▶│ <<event hub>>  │────────▶│   Agent   │
 └───────────────┘   └──────────────┘   └────────────────┘         └───────────┘
                                                                          │
                     ┌──────────────┐          consumer group info        │
                     │  azurelogs   │          (state, position, or       │
                     │<<container>> │◀───────────────offset)──────────────┘
                     └──────────────┘
How many Storage Account containers?

The Elastic Agent can use one Storage Account (SA) for multiple integrations.

The Agent creates one SA container for the integration. The SA container name combines the event hub name and a prefix (azure-eventhub-input-[eventhub]).
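
The naming rule above can be sketched as follows. This is an approximation based only on what this page states (the azure-eventhub-input- prefix, and the replacement of underscores with hyphens described in the Settings section); the function name is hypothetical and the integration may apply further normalization.

```python
CONTAINER_PREFIX = "azure-eventhub-input-"


def default_container_name(eventhub: str) -> str:
    """Approximate the default container name for an event hub.

    Underscores are replaced with hyphens, as described for the
    eventhub setting; any other normalization is not modeled here.
    """
    return CONTAINER_PREFIX + eventhub.replace("_", "-")


for name in ("myeventhub", "my_event_hub"):
    print(name, "->", default_container_name(name))
```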

Running the integration behind a firewall

When you run the Elastic Agent behind a firewall, you must allow traffic on ports 5671 and 5672 for the event hub and port 443 for the Storage Account container to ensure proper communication with the necessary components.

┌────────────────────────────────┐  ┌───────────────────┐  ┌───────────────────┐
│                                │  │                   │  │                   │
│ ┌────────────┐   ┌───────────┐ │  │  ┌──────────────┐ │  │ ┌───────────────┐ │
│ │ diagnostic │   │ event hub │ │  │  │azure-eventhub│ │  │ │ activity logs │ │
│ │  setting   │──▶│           │◀┼AMQP─│  <<input>>   │─┼──┼▶│<<data stream>>│ │
│ └────────────┘   └───────────┘ │  │  └──────────────┘ │  │ └───────────────┘ │
│                                │  │          │        │  │                   │
│                                │  │          │        │  │                   │
│                                │  │          │        │  │                   │
│         ┌─────────────┬─────HTTPS─┼──────────┘        │  │                   │
│ ┌───────┼─────────────┼──────┐ │  │                   │  │                   │
│ │       │             │      │ │  │                   │  │                   │
│ │       ▼             ▼      │ │  └─Agent─────────────┘  └─Elastic Cloud─────┘
│ │ ┌──────────┐  ┌──────────┐ │ │
│ │ │    0     │  │    1     │ │ │
│ │ │ <<blob>> │  │ <<blob>> │ │ │
│ │ └──────────┘  └──────────┘ │ │
│ │                            │ │
│ │                            │ │
│ └─Storage Account Container──┘ │
│                                │
│                                │
└─Azure──────────────────────────┘
Event hub

Ports 5671 and 5672 are used for AMQP communication with the event hub, which is how the Elastic Agent receives events. Allowing traffic on these ports lets the Elastic Agent establish a secure connection with the event hub.

For more information, check the following documents:

Storage Account container

Port 443 is used for secure HTTPS communication with the Storage Account container. Allowing traffic on port 443 lets the Elastic Agent store and retrieve the checkpoint data for each event hub partition.

DNS

Optionally, you can restrict the traffic to the following domain names:

*.servicebus.windows.net
*.blob.core.windows.net
*.cloudapp.net
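
When preparing firewall rules, it may be useful to list the concrete host and port pairs for your own resources. The sketch below builds that list from an Event Hubs namespace name and a Storage Account name, and includes a small TCP reachability helper. The function names and resource names are hypothetical, and the host suffixes assume the Azure public cloud; other Azure environments use different suffixes, and the *.cloudapp.net entry is not covered.

```python
import socket


def required_endpoints(namespace: str, storage_account: str) -> list[tuple[str, int]]:
    """Return the (host, port) pairs the Elastic Agent needs to reach."""
    eventhub_host = f"{namespace}.servicebus.windows.net"
    storage_host = f"{storage_account}.blob.core.windows.net"
    return [
        (eventhub_host, 5671),  # event hub
        (eventhub_host, 5672),  # event hub
        (storage_host, 443),    # Storage Account container (HTTPS)
    ]


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for host, port in required_endpoints("mynamespace", "mystorageaccount"):
    print(f"{host}:{port}")
```

Calling is_reachable(host, port) for each pair from the machine that runs the Elastic Agent gives a quick first check of the firewall rules; it only verifies that a TCP connection opens, not that authentication succeeds.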

Settings


Use the following settings to configure the Custom Azure Logs integration when you add it to Fleet.

eventhub : string

The name of the event hub to read logs from. Azure Event Hubs is a fully managed, real-time data ingestion service. Elastic recommends using only letters, numbers, and the hyphen (-) character in event hub names to maximize compatibility. You can use existing event hubs that have underscores (_) in their names; in this case, the integration replaces underscores with hyphens (-) when it uses the event hub name to create dependent Azure resources behind the scenes (for example, the Storage Account container that stores event hub consumer offsets). Elastic also recommends using a separate event hub for each log type, as the field mappings of each log type differ.

Default value: insights-operational-logs

consumer_group : string

Enables the publish/subscribe mechanism of Event Hubs with consumer groups. A consumer group is a view (state, position, or offset) of an entire event hub. Consumer groups enable multiple consuming applications to each have a separate view of the event stream, and to read the stream independently at their own pace and with their own offsets.

Default value: $Default

connection_string : string

The connection string is required to communicate with Event Hubs. Check Get an Event Hubs connection string for more information.

A Blob Storage Account is required to store, retrieve, and update the checkpoint information of the event hub messages. This allows the integration to resume processing from where it left off after it is stopped.

storage_account : string

The name of the Storage Account that stores the checkpoint information.

storage_account_key : string

The Storage Account key, used to authorize access to data in your Storage Account.

storage_account_container : string

The Storage Account container is where the integration stores the checkpoint data for the consumer group. It is an advanced option to use with extreme care. You MUST use a dedicated Storage Account container for each Azure log type (activity, sign-in, audit logs, and others). DO NOT REUSE the same container name for more than one Azure log type. Check Container Names for details on naming rules from Microsoft. The integration generates a default container name if not specified.

pipeline : string

Optional. Overrides the default ingest pipeline for this integration.

resource_manager_endpoint : string

Optional. By default, the integration uses the Azure public environment. To override this and use a different Azure environment, provide a specific resource manager endpoint.

Examples:

  • Azure ChinaCloud: https://management.chinacloudapi.cn/
  • Azure GermanCloud: https://management.microsoftazure.de/
  • Azure PublicCloud: https://management.azure.com/
  • Azure USGovernmentCloud: https://management.usgovcloudapi.net/

You can also use this setting to define your own endpoints, for example for hybrid cloud models.

Handling Malformed JSON in Azure Logs


Azure services have been observed occasionally sending malformed JSON documents. These logs can disrupt the expected JSON formatting and lead to parsing issues during processing.

To address this issue, the advanced settings section of each data stream offers two sanitization options:

  • Sanitizes New Lines: removes new lines in logs.
  • Sanitizes Single Quotes: replaces single quotes with double quotes in logs, excluding single quotes that occur within double quotes.
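
To show what these two options do to a log message, here is a minimal sketch of equivalent transformations. It is not the integration's implementation: the function names are hypothetical, and edge cases may be handled differently by the real sanitizers.

```python
import json


def sanitize_new_lines(raw: str) -> str:
    """Remove carriage returns and line feeds from a raw log message."""
    return raw.replace("\r", "").replace("\n", "")


def sanitize_single_quotes(raw: str) -> str:
    """Replace single quotes with double quotes, except inside double-quoted text."""
    out = []
    in_double = False  # True while scanning inside a double-quoted string
    escaped = False    # True right after a backslash
    for ch in raw:
        if escaped:
            out.append(ch)
            escaped = False
        elif ch == "\\":
            out.append(ch)
            escaped = True
        elif ch == '"':
            in_double = not in_double
            out.append(ch)
        elif ch == "'" and not in_double:
            out.append('"')
        else:
            out.append(ch)
    return "".join(out)


# A made-up malformed message: single-quoted keys and an embedded new line.
raw = "{'level': 'info',\n 'msg': \"it's fine\"}"
cleaned = sanitize_single_quotes(sanitize_new_lines(raw))
print(json.loads(cleaned))
```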

Malformed logs can be identified by:

  • A records array in the message field, which indicates a failure to unmarshal the byte slice.
  • An error.message field containing the text "Received invalid JSON from the Azure Cloud platform. Unable to parse the source log message."
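
The second check can be sketched as a small filter over already-retrieved events. The event shape used here (a nested dictionary with an error.message value that is either a string or a list of strings) is an assumption for illustration, and the function name is hypothetical.

```python
INVALID_JSON_MARKER = "Received invalid JSON from the Azure Cloud platform"


def is_malformed(event: dict) -> bool:
    """Return True if the event carries the invalid-JSON error message."""
    error = event.get("error")
    message = error.get("message", "") if isinstance(error, dict) else ""
    if isinstance(message, list):
        # Join multiple error messages into one searchable string.
        message = " ".join(str(item) for item in message)
    return INVALID_JSON_MARKER in str(message)


# Made-up events for illustration.
events = [
    {"message": "ok"},
    {"error": {"message": INVALID_JSON_MARKER + ". Unable to parse the source log message."}},
]
print([is_malformed(event) for event in events])
```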

Known data streams that might produce malformed logs:

  • Platform Logs
  • Spring Apps Logs
  • PostgreSQL Flexible Servers Logs

Changelog

  Version   Details                                                                          Kibana version(s)
  0.1.0     Enhancement: Add Custom Azure Logs to collect log events from Azure Event Hubs