Airflow Integration
Version | 0.10.0 [beta] This functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features. |
Compatible Kibana version(s) | 8.13.0 or higher |
Supported Serverless project types | Security |
Subscription level | Basic |
Level of support | Elastic |
Overview
Airflow is an open-source platform for programmatically authoring, scheduling, and monitoring workflows. It allows users to define workflows as Directed Acyclic Graphs (DAGs) of tasks, which are then executed by the Airflow scheduler on an array of workers while following the specified dependencies.
Use the Airflow integration to:
- Collect detailed metrics from Airflow using StatsD to gain insights into system performance.
- Create informative visualizations to track usage trends, measure key metrics, and derive actionable business insights.
- Monitor your workflows' performance and status in real time.
Data streams
The Airflow integration gathers metric data.
Metrics provide insight into the statistics of Airflow. The metric data stream collected by the Airflow integration is `statsd`, enabling users to monitor and troubleshoot the performance of the Airflow instance.
Data stream:
- `statsd`: Collects metrics related to scheduler activities, pool usage, task execution details, executor performance, and worker states in Airflow.
Note:
- Users can monitor and view metrics within the ingested documents for Airflow in the `metrics-*` index pattern from Discover.
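The same documents can also be narrowed down with a search request instead of browsing Discover. The following is a minimal sketch of a query body, assuming the dataset name `airflow.statsd` (taken from the example event on this page); the function name, the 15-minute lookback, and the result size are illustrative choices, not part of the integration.

```python
import json


def build_airflow_metrics_query(lookback: str = "now-15m", size: int = 10) -> dict:
    """Build a search body for recent documents from the airflow.statsd dataset.

    The lookback window and size defaults are assumptions for illustration.
    """
    return {
        "size": size,
        # Newest documents first.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"data_stream.dataset": "airflow.statsd"}},
                    {"range": {"@timestamp": {"gte": lookback}}},
                ]
            }
        },
    }


print(json.dumps(build_airflow_metrics_query(), indent=2))
```

The printed body could be sent to a search against the `metrics-*` pattern with whichever client you already use; an empty result usually means no Airflow metrics have been ingested in the chosen window.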
Compatibility
The Airflow module is tested with Airflow 2.4.0. It should work with versions 2.0.0 and later.
Prerequisites
Users require Elasticsearch to store and search user data, and Kibana to visualize and manage it. They can use the hosted Elasticsearch Service on Elastic Cloud, which is recommended, or self-manage the Elastic Stack on their own hardware.
To ingest data from Airflow, users need a StatsD server to receive the metrics that Airflow emits.
Setup
For step-by-step instructions on how to set up an integration, see the Getting started guide.
Steps to set up Airflow
Be sure to follow the official Airflow Installation Guide for the correct installation of Airflow.
Add the following lines to your Airflow configuration file (for example, `airflow.cfg`). Leave `statsd_prefix` empty and replace `%HOST%` with the address where the Agent is running:

    [metrics]
    statsd_on = True
    statsd_host = %HOST%
    statsd_port = 8125
    statsd_prefix =
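If the configuration is generated by a script rather than edited by hand, the `[metrics]` section can be rendered with Python's standard `configparser` module. This is a minimal sketch, not an official tool: the function name `render_metrics_section` and the address `10.0.0.5` are placeholders chosen for illustration.

```python
import configparser
import io


def render_metrics_section(agent_host: str, port: int = 8125) -> str:
    """Return an airflow.cfg [metrics] section that points StatsD at agent_host."""
    config = configparser.ConfigParser()
    config["metrics"] = {
        "statsd_on": "True",
        "statsd_host": agent_host,
        "statsd_port": str(port),
        "statsd_prefix": "",  # left empty, as the setup steps require
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# 10.0.0.5 is a placeholder for the address where the Agent is running.
print(render_metrics_section("10.0.0.5"))
```

The output is meant to be merged into an existing `airflow.cfg`; the sketch does not read or modify any file itself.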
Validation
Once the integration is set up, you can click on the Assets tab in the Airflow integration to see a list of available dashboards. Choose the dashboard that corresponds to your configured data stream. The dashboard should be populated with the required data.
Troubleshooting
- Check if the StatsD server is receiving data from Airflow by examining the logs for potential errors.
- Make sure the `%HOST%` placeholder in the Airflow configuration file is replaced with the correct address of the machine where the StatsD server is running.
- If Airflow metrics are not being emitted, confirm that the `[metrics]` section in the `airflow.cfg` file is properly configured as per the instructions above.
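To separate network problems from Airflow configuration problems, it can help to send one hand-built metric to the StatsD address from the Airflow host. The sketch below uses the plain StatsD line format `<name>:<value>|<type>` over UDP; the metric name `connectivity_check` and both function names are hypothetical and only serve this check.

```python
import socket


def build_statsd_datagram(name: str, value: int, metric_type: str = "c") -> bytes:
    """Encode one metric in the StatsD line format: <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_test_metric(host: str, port: int = 8125) -> int:
    """Send a single counter over UDP and return the number of bytes written."""
    payload = build_statsd_datagram("connectivity_check", 1)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


# Example call; replace the address with the value configured as statsd_host:
# send_test_metric("10.0.0.5")
```

UDP is fire-and-forget, so a successful send only shows that the packet left the host. Whether it arrived still has to be confirmed in the StatsD server logs or in the ingested documents.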
Metrics reference
Statsd
This is the `statsd` data stream, which collects metrics related to scheduler activities, pool usage, task execution details, executor performance, and worker states in Airflow.
Example
An example event for `statsd` looks as follows:

    {
        "@timestamp": "2024-06-18T07:24:40.220Z",
        "agent": {
            "ephemeral_id": "82e52250-5f2d-4fad-9f19-b88a209229db",
            "id": "97400795-188c-4140-a1ee-0002078c785d",
            "name": "docker-fleet-agent",
            "type": "metricbeat",
            "version": "8.13.0"
        },
        "airflow": {
            "scheduler_critical_section_duration": {
                "count": 1,
                "max": 7,
                "mean": 7,
                "mean_rate": 0.25568506211340597,
                "median": 7,
                "min": 7,
                "stddev": 0
            }
        },
        "data_stream": {
            "dataset": "airflow.statsd",
            "namespace": "ep",
            "type": "metrics"
        },
        "ecs": {
            "version": "8.11.0"
        },
        "elastic_agent": {
            "id": "97400795-188c-4140-a1ee-0002078c785d",
            "snapshot": false,
            "version": "8.13.0"
        },
        "event": {
            "agent_id_status": "verified",
            "dataset": "airflow.statsd",
            "ingested": "2024-06-18T07:24:50Z",
            "module": "statsd"
        },
        "host": {
            "architecture": "x86_64",
            "containerized": true,
            "hostname": "docker-fleet-agent",
            "id": "8259e024976a406e8a54cdbffeb84fec",
            "ip": [
                "192.168.245.7"
            ],
            "mac": [
                "02-42-C0-A8-F5-07"
            ],
            "name": "docker-fleet-agent",
            "os": {
                "codename": "focal",
                "family": "debian",
                "kernel": "3.10.0-1160.102.1.el7.x86_64",
                "name": "Ubuntu",
                "platform": "ubuntu",
                "type": "linux",
                "version": "20.04.6 LTS (Focal Fossa)"
            }
        },
        "metricset": {
            "name": "server"
        },
        "service": {
            "type": "statsd"
        }
    }
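When post-processing such events outside Kibana, it is often convenient to turn the nested `airflow` object into the dotted field names used in the exported fields table. The following is a small sketch under that assumption; `EVENT_JSON` is a trimmed copy of the example event, and `flatten_airflow_metrics` is a hypothetical helper, not part of the integration.

```python
import json

# Trimmed copy of the example event; only the fields used here are kept.
EVENT_JSON = """
{
  "@timestamp": "2024-06-18T07:24:40.220Z",
  "airflow": {
    "scheduler_critical_section_duration": {
      "count": 1, "max": 7, "mean": 7, "median": 7, "min": 7, "stddev": 0
    }
  },
  "data_stream": {"dataset": "airflow.statsd", "namespace": "ep", "type": "metrics"}
}
"""


def flatten_airflow_metrics(event: dict) -> dict:
    """Map each nested airflow metric to its dotted field name."""
    flat = {}
    for metric_name, stats in event.get("airflow", {}).items():
        if isinstance(stats, dict):
            for stat_name, value in stats.items():
                flat[f"airflow.{metric_name}.{stat_name}"] = value
        else:
            # Metadata such as airflow.dag_id is a plain value, not an object.
            flat[f"airflow.{metric_name}"] = stats
    return flat


event = json.loads(EVENT_JSON)
for field, value in sorted(flatten_airflow_metrics(event).items()):
    print(field, value)
```

Each printed name, such as `airflow.scheduler_critical_section_duration.max`, corresponds to one of the `airflow.*` patterns in the exported fields table.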
ECS Field Reference
Please refer to the following document for detailed information on ECS fields.
Exported fields
Field | Description | Type | Metric Type |
---|---|---|---|
@timestamp | Event timestamp. | date | |
agent.id | | keyword | |
airflow.*.count | Airflow counters | object | counter |
airflow.*.max | Airflow max timers metric | object | |
airflow.*.mean | Airflow mean timers metric | object | |
airflow.*.mean_rate | Airflow mean rate timers metric | object | |
airflow.*.median | Airflow median timers metric | object | |
airflow.*.min | Airflow min timers metric | object | |
airflow.*.stddev | Airflow standard deviation timers metric | object | |
airflow.*.value | Airflow gauges | object | gauge |
airflow.dag_file | Airflow dag file metadata | keyword | |
airflow.dag_id | Airflow dag id metadata | keyword | |
airflow.job_name | Airflow job name metadata | keyword | |
airflow.operator_name | Airflow operator name metadata | keyword | |
airflow.pool_name | Airflow pool name metadata | keyword | |
airflow.scheduler_heartbeat.count | Airflow scheduler heartbeat | double | |
airflow.status | Airflow status metadata | keyword | |
airflow.task_id | Airflow task id metadata | keyword | |
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword | |
cloud.availability_zone | Availability zone in which this host is running. | keyword | |
cloud.image.id | Image ID for the cloud instance. | keyword | |
cloud.instance.id | Instance ID of the host machine. | keyword | |
cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword | |
cloud.region | Region in which this host is running. | keyword | |
container.id | Unique container id. | keyword | |
data_stream.dataset | Data stream dataset. | constant_keyword | |
data_stream.namespace | Data stream namespace. | constant_keyword | |
data_stream.type | Data stream type. | constant_keyword | |
event.dataset | Event dataset | constant_keyword | |
event.module | Event module | constant_keyword | |
host.containerized | If the host is a container. | boolean | |
host.name | Name of the host. It can contain what | keyword | |
host.os.build | OS build information. | keyword | |
host.os.codename | OS codename, if any. | keyword | |
service.address | Service address | keyword | |
Changelog
Version | Details | Kibana version(s) |
---|---|---|
0.10.0 | Enhancement (View pull request) | — |
0.9.1 | Bug fix (View pull request) | — |
0.9.0 | Enhancement (View pull request) | — |
0.8.0 | Enhancement (View pull request) | — |
0.7.0 | Enhancement (View pull request) | — |
0.6.0 | Enhancement (View pull request) | — |
0.5.1 | Bug fix (View pull request) | — |
0.5.0 | Enhancement (View pull request) | — |
0.4.0 | Enhancement (View pull request) | — |
0.3.1 | Bug fix (View pull request) | — |
0.3.0 | Enhancement (View pull request) | — |
0.2.0 | Enhancement (View pull request) | — |
0.1.0 | Enhancement (View pull request) | — |
0.0.5 | Bug fix (View pull request) | — |
0.0.4 | Enhancement (View pull request) | — |
0.0.3 | Enhancement (View pull request) | — |
0.0.2 | Enhancement (View pull request) | — |
0.0.1 | Enhancement (View pull request) | — |