Kafka Output

The Kafka output sends the events to Apache Kafka.
Example configuration:
output.kafka:
  # initial brokers for reading cluster metadata
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]

  # message topic selection + partitioning
  topic: '%{[type]}'
  partition.round_robin:
    reachable_only: false

  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
Events bigger than max_message_bytes will be dropped. To avoid this problem, make sure Packetbeat does not generate events bigger than max_message_bytes.
Compatibility
This output works with Kafka 0.8, 0.9, and 0.10.
Kafka Output Options
You can specify the following options in the kafka section of the packetbeat.yml config file:
enabled
The enabled config is a boolean setting to enable or disable the output. If set to false, the output is disabled.
The default value is true.
hosts
The list of Kafka broker addresses from which to fetch the cluster metadata. The cluster metadata contains the actual Kafka brokers events are published to.
version
The Kafka version that Packetbeat is assumed to run against. Defaults to the oldest supported stable version (currently version 0.8.2.0).
Event timestamps will be added if version 0.10.0.0 or newer is enabled.
Valid values are 0.8.2.0, 0.8.2.1, 0.8.2.2, 0.8.2, 0.8, 0.9.0.0, 0.9.0.1, 0.9.0, 0.9, 0.10.0.0, 0.10.0, and 0.10.
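For example, to enable event timestamps you might pin the version explicitly (a minimal sketch; the broker address is a placeholder):

output.kafka:
  hosts: ["kafka1:9092"]
  version: '0.10.0.0'  # 0.10.0.0+ adds event timestamps to produced messages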
username
The username for connecting to Kafka. If username is configured, the password must be configured as well. Only SASL/PLAIN is supported.
password
The password for connecting to Kafka.
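A minimal SASL/PLAIN sketch; the credentials are placeholders, and ${KAFKA_PASSWORD} assumes Beats' environment-variable expansion in the config file:

output.kafka:
  hosts: ["kafka1:9092"]
  username: 'beats'              # placeholder user
  password: '${KAFKA_PASSWORD}'  # resolved from the environment at startup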
topic
The Kafka topic used for produced events. The setting can be a format string using any event field. To set the topic from the document type, use %{[type]}.
topics
Array of topic selector rules supporting conditionals, format-string-based field access, and name mappings. The first matching rule is used to set the topic for the event to be published. If topics is missing or no rule matches, the topic field is used. See the example after the rule settings below.
Rule settings:
- topic: The topic format string to use. If the fields used are missing, the rule fails.
- mapping: Dictionary mapping index names to new names.
- default: Default string value if mapping does not find a match.
- when: Condition which must succeed in order to execute the current rule.
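A minimal sketch of rule-based topic selection, assuming events carry a type field with values such as dns and http (the topic names here are placeholders):

output.kafka:
  hosts: ["kafka1:9092"]
  topic: 'packetbeat'        # fallback when no rule below matches
  topics:
    - topic: 'dns-events'    # route DNS transactions to their own topic
      when.equals:
        type: dns
    - topic: 'http-events'   # route HTTP transactions likewise
      when.equals:
        type: http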
key
Optional Kafka event key. If configured, the event key must be unique and can be extracted from the event using a format string.
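For example, to key messages by the host that produced them (a sketch; beat.hostname is a standard Beats event field, but any event field works):

output.kafka:
  key: '%{[beat.hostname]}'  # events from the same host share a key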
partition
Kafka output broker event partitioning strategy. Must be one of random, round_robin, or hash. By default the hash partitioner is used.

- random.group_events: Sets the number of events to be published to the same partition before the partitioner selects a new partition at random. The default value is 1, meaning a new partition is picked randomly after each event.
- round_robin.group_events: Sets the number of events to be published to the same partition before the partitioner selects the next partition. The default value is 1, meaning the next partition is selected after each event.
- hash.hash: List of fields used to compute the partitioning hash value from. If no field is configured, the event's key value is used.
- hash.random: Randomly distribute events if no hash or key value can be computed.

All partitioners will try to publish events to all partitions by default. If a partition's leader becomes unreachable for the beat, the output might block. All partitioners support setting reachable_only to override this behavior. If reachable_only is set to true, events will be published to available partitions only.

Publishing to a subset of available partitions potentially increases resource usage because events may become unevenly distributed.
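A sketch of a hash-based partitioning setup, hashing on the event type field (the field choice is purely illustrative):

output.kafka:
  partition.hash:
    hash: ['type']         # fields to hash; falls back to the event key if empty
    random: true           # distribute randomly when no hash can be computed
    reachable_only: false  # keep blocking rather than skip unreachable partitions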
client_id
The configurable ClientID used for logging, debugging, and auditing purposes. The default is "beats".
worker
The number of concurrent load-balanced Kafka output workers.
codec
Output codec configuration. If the codec section is missing, events will be JSON-encoded. See Output Codec for more information.
metadata
Kafka metadata update settings. The metadata contains information about brokers, topics, partitions, and active leaders to use for publishing.
- refresh_frequency: Metadata refresh interval. Defaults to 10 minutes.
- retry.max: Total number of metadata update retries when the cluster is in the middle of a leader election. The default is 3.
- retry.backoff: Waiting time between retries during leader elections. The default is 250ms.
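For example, to refresh metadata more often and retry longer during leader elections (values are illustrative, not recommendations):

output.kafka:
  metadata:
    refresh_frequency: 5m
    retry.max: 5
    retry.backoff: 500ms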
max_retries
The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped. Some Beats, such as Filebeat, ignore the max_retries setting and retry until all events are published.

Set max_retries to a value less than 0 to retry until all events are published. The default is 3.
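For example, to keep retrying rather than drop events (a sketch):

output.kafka:
  max_retries: -1  # any value below 0 retries until all events are published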
bulk_max_size
The maximum number of events to bulk in a single Kafka request. The default is 2048.
timeout
The number of seconds to wait for responses from the Kafka brokers before timing out. The default is 30 (seconds).
broker_timeout
The maximum duration a broker will wait for the number of required ACKs. The default is 10s.
channel_buffer_size
The number of messages buffered in the output pipeline per Kafka broker. The default is 256.
keep_alive
The keep-alive period for an active network connection. If 0s, keep-alives are disabled. The default is 0 seconds.
compression
Sets the output compression codec. Must be one of none, snappy, or gzip. The default is gzip.
max_message_bytes
The maximum permitted size of JSON-encoded messages. Bigger messages will be dropped. The default value is 1000000 (bytes). This value should be equal to or less than the broker's message.max.bytes.
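A sketch aligning the client-side limit with the broker; the broker-side counterpart is Kafka's message.max.bytes setting in server.properties:

output.kafka:
  max_message_bytes: 1000000  # keep this <= the broker's message.max.bytes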
required_acks
The ACK reliability level required from the broker. 0=no response, 1=wait for local commit, -1=wait for all replicas to commit. The default is 1.
Note: If set to 0, no ACKs are returned by Kafka. Messages might be lost silently on error.
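For stronger durability at the cost of latency, you might wait for all replicas (a sketch):

output.kafka:
  required_acks: -1  # wait for all in-sync replicas to commit each message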
flush_interval
The number of seconds to wait for new events between two producer API calls.