Google Cloud Storage output plugin v4.4.0

  • Plugin version: v4.4.0
  • Released on: 2023-08-22
  • Changelog

For other versions, see the overview list.

To learn more about Logstash, see the Logstash Reference.

Getting help

For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.

Description

A plugin to upload log events to Google Cloud Storage (GCS), rolling files based on the date pattern provided as a configuration setting. Events are written to files locally and, once a file is closed, this plugin uploads it to the configured bucket.

For more info on Google Cloud Storage, please go to: https://cloud.google.com/products/cloud-storage

To use this plugin, you need a Google service account. For more information, please refer to: https://developers.google.com/storage/docs/authentication#service_accounts

Recommendation: experiment with the settings depending on how much log data you generate, so the uploader can keep up with the generated logs. Using gzip output can reduce both the network traffic when uploading the log files and the storage costs.

Usage

Example Logstash configuration:

output {
   google_cloud_storage {
     bucket => "my_bucket"                                     # required
     json_key_file => "/path/to/privatekey.json"               # optional
     temp_directory => "/tmp/logstash-gcs"                     # optional
     log_file_prefix => "logstash_gcs"                         # optional
     max_file_size_kbytes => 1024                              # optional
     output_format => "plain"                                  # optional, deprecated
     date_pattern => "%Y-%m-%dT%H:00"                          # optional
     flush_interval_secs => 2                                  # optional
     gzip => false                                             # optional
     gzip_content_encoding => false                            # optional
     uploader_interval_secs => 60                              # optional
     include_uuid => true                                      # optional
     include_hostname => true                                  # optional
   }
}

Improvements TODO List

  • Support Logstash event variables to determine filename.
  • Turn Google API code into a Plugin Mixin (like AwsConfig).
  • There’s no recovery method, so if Logstash or the plugin crashes, files may not be uploaded to GCS.
  • Allow user to configure file name.
  • Allow parallel uploads for heavier loads (+ connection configuration if exposed by Ruby API client)

Google_cloud_storage Output Configuration Options

This plugin supports the following configuration options plus the Common options described later.

Also see Common options for a list of options supported by all output plugins.

bucket

  • This is a required setting.
  • Value type is string
  • There is no default value for this setting.

GCS bucket name, without "gs://" or any other prefix.

date_pattern

  • Value type is string
  • Default value is "%Y-%m-%dT%H:00"

Time pattern for log files; defaults to hourly files. Must use Time.strftime patterns: www.ruby-doc.org/core-2.0/Time.html#method-i-strftime
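
As a sketch, assuming you want one file per day rather than one per hour, the pattern could drop the hour component. The bucket name is a placeholder.

```logstash
output {
  google_cloud_storage {
    bucket => "my_bucket"          # placeholder bucket name
    date_pattern => "%Y-%m-%d"     # strftime pattern: roll files daily instead of hourly
  }
}
```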

flush_interval_secs

  • Value type is number
  • Default value is 2

Interval in seconds between flushes of buffered writes to the log files. A value of 0 flushes on every message.

gzip

  • Value type is boolean
  • Default value is false

When enabled, gzips the output stream when writing events to log files, sets Content-Type to application/gzip instead of text/plain, and uses the file suffix .log.gz instead of .log.

gzip_content_encoding

Added in 3.3.0.

  • Value type is boolean
  • Default value is false

When enabled, gzips the output stream when writing events to log files and sets Content-Encoding to gzip. Your files are uploaded gzip-compressed, which saves network and storage costs, but they are transparently decompressed when you read them from the storage bucket.

See the Cloud Storage documentation on metadata and transcoding for more information.

Note: Using both gzip_content_encoding and gzip is not recommended. Doing so compresses each file twice, which increases the work your machine does and makes the files larger than compressing once.
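
A minimal sketch of a single-compression setup, assuming you want files to be decompressed transparently on read. The bucket name is a placeholder.

```logstash
output {
  google_cloud_storage {
    bucket => "my_bucket"            # placeholder bucket name
    gzip_content_encoding => true    # compress once; uploaded with Content-Encoding: gzip
    gzip => false                    # the default; left off to avoid compressing twice
  }
}
```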

include_hostname

Added in 3.1.0.

  • Value type is boolean
  • Default value is true

Whether to include the hostname in the file name. You may want to turn this off for privacy reasons, or if you run multiple instances of Logstash and need to match the files they create with a simple glob, for example to import them into BigQuery.

include_uuid

Added in 3.1.0.

  • Value type is boolean
  • Default value is false

Adds a UUID to the end of a file name. You may want to enable this feature so files don’t clobber one another if you’re running multiple instances of Logstash or if you expect frequent node restarts.
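
For example, when several Logstash instances write to the same bucket, a configuration along these lines keeps hostnames out of the file names while still avoiding collisions. This is a sketch only; the bucket name is a placeholder.

```logstash
output {
  google_cloud_storage {
    bucket => "my_bucket"          # placeholder bucket name
    include_hostname => false      # omit the hostname so a simple glob matches every instance's files
    include_uuid => true           # append a UUID so instances do not clobber each other's files
  }
}
```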

json_key_file

  • Value type is string
  • Default value is nil

The plugin can use Application Default Credentials (ADC), if it’s running on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions.

Outside of Google Cloud, you will need to create a Service Account JSON key file through the web interface or with the following command:

    gcloud iam service-accounts keys create key.json --iam-account my-sa-123@my-project-123.iam.gserviceaccount.com

key_password

  • Value type is string
  • Default value is "notasecret"

Deprecated: this setting is no longer used; it is now part of json_key_file.

key_path

Obsolete: The PKCS12 key file format is no longer supported. Please use one of the following mechanisms:

  • Application Default Credentials (ADC), configured via environment variables on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions.
  • A JSON authentication key file. You can generate one in the console for the service account, as you did with the .P12 file, or with the following command: gcloud iam service-accounts keys create key.json --iam-account my-sa-123@my-project-123.iam.gserviceaccount.com

log_file_prefix

  • Value type is string
  • Default value is "logstash_gcs"

Log file prefix. Log file names follow the format: <prefix>_hostname_date<.part?>.log

max_concurrent_uploads

  • Value type is number
  • Default value is 5

Sets the maximum number of concurrent uploads to Cloud Storage. Uploads are I/O bound, so it makes sense to tune this parameter according to the available network bandwidth and the latency between your server and Cloud Storage.
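
As an illustration only, a host with plenty of bandwidth but high latency to Cloud Storage might raise the limit. The value 10 is an arbitrary assumption, not a recommendation; measure upload throughput before settling on a number.

```logstash
output {
  google_cloud_storage {
    bucket => "my_bucket"            # placeholder bucket name
    max_concurrent_uploads => 10     # illustrative value; the default is 5
  }
}
```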

max_file_size_kbytes

  • Value type is number
  • Default value is 10000

Sets the maximum file size in kilobytes. A value of 0 disables the maximum file size check.

output_format

  • Value can be any of: json, plain, or no value
  • Default value is no value

Deprecated: this feature will be removed in the next major release. Use codecs instead.

  • If you are using the json value today, switch to the json_lines codec.
  • If you are using the plain value today, switch to the line codec.

The event format you want to store in files. Defaults to plain text.

Note: if you want to use a codec, you MUST NOT set this value.
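
A sketch of the migration for a pipeline that previously set output_format => "json": leave output_format unset and set the codec instead. The bucket name is a placeholder.

```logstash
output {
  google_cloud_storage {
    bucket => "my_bucket"       # placeholder bucket name
    codec => "json_lines"       # replaces the deprecated output_format => "json"
  }
}
```

A pipeline that used output_format => "plain" would set codec => "line" in the same way.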

service_account

  • This is a required setting.
  • Value type is string
  • There is no default value for this setting.

Deprecated: this setting is no longer used; it is now part of json_key_file.

temp_directory

  • Value type is string
  • Default value is ""

Directory where temporary files are stored. When left empty (the default), the plugin uses /tmp/logstash-gcs-<random-suffix>

uploader_interval_secs

  • Value type is number
  • Default value is 60

Interval, in seconds, at which the uploader uploads new files to GCS. Adjust it to match your date_pattern (for example, for hourly files, this interval can be around one hour).
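
For instance, with the default hourly date_pattern, a configuration along these lines runs the uploader about once an hour. The value 3600 is an assumption derived from the "around one hour" guidance and should be tuned for your log volume.

```logstash
output {
  google_cloud_storage {
    bucket => "my_bucket"                  # placeholder bucket name
    date_pattern => "%Y-%m-%dT%H:00"       # the default: hourly files
    uploader_interval_secs => 3600         # one hour, expressed in seconds
  }
}
```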

Common options

These configuration options are supported by all output plugins:

Setting          Input type   Required
codec            codec        No
enable_metric    boolean      No
id               string       No

codec

  • Value type is codec
  • Default value is "line"

The codec used for output data. Output codecs are a convenient method for encoding your data before it leaves the output without needing a separate filter in your Logstash pipeline.

enable_metric

  • Value type is boolean
  • Default value is true

Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.

id

  • Value type is string
  • There is no default value for this setting.

Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, two google_cloud_storage outputs. Adding a named ID in this case helps when monitoring Logstash with the monitoring APIs.

output {
  google_cloud_storage {
    id => "my_plugin_id"
  }
}