Google BigQuery output plugin
- Plugin version: v4.1.1
- Released on: 2018-10-25
- Changelog
Installation
For plugins not bundled by default, it is easy to install by running bin/logstash-plugin install logstash-output-google_bigquery. See Working with plugins for more details.
Getting Help
For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.
Description
Summary
This Logstash plugin uploads events to Google BigQuery using the streaming API, so data can become available to query almost immediately.
You can configure it to flush periodically, after N events, or after a certain amount of data is ingested.
Environment Configuration
You must enable BigQuery on your Google Cloud account and create a dataset to hold the tables this plugin generates.
You must also grant the service account this plugin uses access to the dataset.
You can use Logstash conditionals and multiple configuration blocks to upload events with different structures.
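For example, here is a hedged sketch of routing two event types to different datasets and schemas with a conditional; the project ID, dataset names, field names, and paths are placeholders, not values from this documentation:

output {
  if [type] == "apache_access" {
    google_bigquery {
      project_id      => "my-project-id"                             # placeholder project
      dataset         => "access_logs"                               # placeholder dataset
      csv_schema      => "path:STRING,status:INTEGER,bytes:INTEGER"
      json_key_file   => "/path/to/key.json"
      error_directory => "/tmp/bigquery-errors-access"
    }
  } else {
    google_bigquery {
      project_id      => "my-project-id"
      dataset         => "app_logs"
      csv_schema      => "severity:STRING,message:STRING"
      json_key_file   => "/path/to/key.json"
      error_directory => "/tmp/bigquery-errors-app"
    }
  }
}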
Usage
This is an example of Logstash config:

output {
  google_bigquery {
    project_id => "folkloric-guru-278"                       (required)
    dataset => "logs"                                        (required)
    csv_schema => "path:STRING,status:INTEGER,score:FLOAT"   (required)
    json_key_file => "/path/to/key.json"                     (optional)
    error_directory => "/tmp/bigquery-errors"                (required)
    date_pattern => "%Y-%m-%dT%H:00"                         (optional)
    flush_interval_secs => 30                                (optional)
  }
}

Specify either a csv_schema or a json_schema.

If the key is not used, then the plugin tries to find Application Default Credentials.
Considerations
- There is a small fee to insert data into BigQuery using the streaming API.
- This plugin buffers events in-memory, so make sure the flush configurations are appropriate for your use-case and consider using Logstash Persistent Queues.
- Events will be flushed when batch_size, batch_size_bytes, or flush_interval_secs is met, whichever comes first. If you notice a delay in your processing or low throughput, try adjusting those settings; a tuning sketch follows this list.
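As a hedged tuning sketch, the following raises the batch limits and lowers the flush interval; the numbers are illustrative, not recommendations, and the other settings are placeholders:

output {
  google_bigquery {
    project_id          => "my-project-id"                            # placeholder
    dataset             => "logs"
    csv_schema          => "path:STRING,status:INTEGER,score:FLOAT"
    json_key_file       => "/path/to/key.json"
    error_directory     => "/tmp/bigquery-errors"
    batch_size          => 512       # flush after 512 buffered events ...
    batch_size_bytes    => 5000000   # ... or roughly 5 MB of buffered data ...
    flush_interval_secs => 10        # ... or every 10 seconds, whichever comes first
  }
}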
Google BigQuery Output Configuration Options
This plugin supports the following configuration options plus the Common Options described later.

Setting | Input type | Required
---|---|---
batch_size | number | No
batch_size_bytes | number | No
csv_schema | string | No
dataset | string | Yes
date_pattern | string | No
deleter_interval_secs | number | Deprecated
error_directory | string | Yes
flush_interval_secs | number | No
ignore_unknown_values | boolean | No
json_key_file | string | No
json_schema | hash | No
key_password | string | Deprecated
key_path | string | Obsolete
project_id | string | Yes
service_account | string | Deprecated
skip_invalid_rows | boolean | No
table_prefix | string | No
table_separator | string | No
temp_directory | string | Deprecated
temp_file_prefix | string | Deprecated
uploader_interval_secs | number | Deprecated
Also see Common Options for a list of options supported by all output plugins.
batch_size
Added in 4.0.0.

- Value type is number
- Default value is 128
The maximum number of messages to upload at a single time. This number must be < 10,000. Batching can increase performance and throughput to a point, but at the cost of per-request latency. Too few rows per request and the overhead of each request can make ingestion inefficient. Too many rows per request and the throughput may drop. BigQuery recommends using about 500 rows per request, but experimentation with representative data (schema and data sizes) will help you determine the ideal batch size.
batch_size_bytes
Added in 4.0.0.

- Value type is number
- Default value is 1_000_000
An approximate number of bytes to upload as part of a batch. This number should be < 10MB or inserts may fail.
csv_schema
- Value type is string
- Default value is nil

Schema for log data. It must follow the format name1:type1(,name2:type2)*.
For example, path:STRING,status:INTEGER,score:FLOAT.
dataset
- This is a required setting.
- Value type is string
- There is no default value for this setting.
The BigQuery dataset the tables for the events will be added to.
date_pattern
- Value type is string
- Default value is "%Y-%m-%dT%H:00"

Time pattern for the BigQuery table, defaults to hourly tables. Must be a valid Time.strftime pattern: www.ruby-doc.org/core-2.0/Time.html#method-i-strftime
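For instance, a hedged sketch of switching to daily tables; by the <table_prefix><table_separator><date> rule described under table_prefix, the default prefix and separator would compose names like logstash_2018_10_25 for this pattern:

google_bigquery {
  # ... other required settings as in the Usage example ...
  date_pattern => "%Y_%m_%d"   # one table per day instead of per hour
}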
deleter_interval_secs
Deprecated in 4.0.0.
Events are uploaded in real-time without being stored to disk.
- Value type is number
error_directory
Added in 4.0.0.

- This is a required setting.
- Value type is string
- Default value is "/tmp/bigquery".

The location to store events that could not be uploaded due to errors.
By default, if any message in an insert is invalid, all will fail.
You can use skip_invalid_rows to allow partial inserts.
Consider using an additional Logstash input to pipe the contents of these files to an alert platform so you can manually fix the events.
Or use GCS FUSE to transparently upload to a GCS bucket.
File names follow the pattern [table name]-[UNIX timestamp].log
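To follow up on the suggestion above, here is a hedged sketch of a separate pipeline that watches the error directory; the file glob and the stdout destination are assumptions for illustration only:

input {
  file {
    # Failed-event files named [table name]-[UNIX timestamp].log, per the pattern above
    path => "/tmp/bigquery-errors/*.log"
  }
}
output {
  # Replace with whatever alerting or review destination you actually use
  stdout { codec => rubydebug }
}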
flush_interval_secs
- Value type is number
- Default value is 5
Uploads all data this often even if other upload criteria aren’t met.
ignore_unknown_values
- Value type is boolean
- Default value is false
Indicates if BigQuery should ignore values that are not represented in the table schema. If true, the extra values are discarded. If false, BigQuery will reject the records with extra fields and the job will fail. The default value is false.
You may want to add a Logstash filter like the following to remove common fields it adds:
mutate {
  remove_field => ["@version", "@timestamp", "path", "host", "type", "message"]
}
json_key_file
Added in 4.0.0.

Replaces key_password.

- Value type is string
- Default value is nil
If Logstash is running within Google Compute Engine, the plugin can use GCE’s Application Default Credentials. Outside of GCE, you will need to specify a Service Account JSON key file.
json_schema
- Value type is hash
- Default value is nil
Schema for log data as a hash. These can include nested records, descriptions, and modes.
Example:
json_schema => {
  fields => [
    { name => "endpoint" type => "STRING" description => "Request route" },
    { name => "status" type => "INTEGER" mode => "NULLABLE" },
    { name => "params" type => "RECORD" mode => "REPEATED"
      fields => [
        { name => "key" type => "STRING" },
        { name => "value" type => "STRING" }
      ] }
  ]
}
key_password
Deprecated in 4.0.0.

Replaced by json_key_file or by using ADC. See json_key_file.
- Value type is string
key_path
- Value type is string
Obsolete: The PKCS12 key file format is no longer supported.
Please use one of the following mechanisms:
- Application Default Credentials (ADC), configured via environment variables on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions.
- A JSON authentication key file. You can generate one in the console for the service account, as you did with the .P12 file, or with the following command:

  gcloud iam service-accounts keys create key.json --iam-account my-sa-123@my-project-123.iam.gserviceaccount.com
project_id
- This is a required setting.
- Value type is string
- There is no default value for this setting.
Google Cloud Project ID (number, not Project Name!).
service_account
Deprecated in 4.0.0.

Replaced by json_key_file or by using ADC. See json_key_file.
- Value type is string
skip_invalid_rows
Added in 4.1.0.

- Value type is boolean
- Default value is false
Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist.
table_prefix
- Value type is string
- Default value is "logstash"
BigQuery table ID prefix to be used when creating new tables for log data.
Table name will be <table_prefix><table_separator><date>
table_separator
- Value type is string
- Default value is "_"
BigQuery table separator to be added between the table_prefix and the date suffix.
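As another hedged illustration of the <table_prefix><table_separator><date> composition, with made-up values:

google_bigquery {
  # ... other required settings ...
  table_prefix    => "weblogs"
  table_separator => ""
  date_pattern    => "%Y%m%d"   # composes table names like weblogs20181025
}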
temp_directory
Deprecated in 4.0.0.
Events are uploaded in real-time without being stored to disk.
- Value type is string
Common Options
The following configuration options are supported by all output plugins:
codec
- Value type is codec
- Default value is "plain"
The codec used for output data. Output codecs are a convenient method for encoding your data before it leaves the output, without needing a separate filter in your Logstash pipeline.
enable_metric
- Value type is boolean
- Default value is true
Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.
id
- Value type is string
- There is no default value for this setting.
Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, if you have 2 google_bigquery outputs. Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.
output {
  google_bigquery {
    id => "my_plugin_id"
  }
}