Google Cloud Storage output plugin
- Plugin version: v4.1.0
- Released on: 2019-08-08
- Changelog
For other versions, see the Versioned plugin docs.
Installation
For plugins not bundled by default, install this plugin by running bin/logstash-plugin install logstash-output-google_cloud_storage. See Working with plugins for more details.
Getting Help
For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.
Description
A plugin to upload log events to Google Cloud Storage (GCS), rolling files based on the date pattern provided as a configuration setting. Events are written to files locally and, once a file is closed, the plugin uploads it to the configured bucket.
For more info on Google Cloud Storage, please go to: https://cloud.google.com/products/cloud-storage
To use this plugin, you need a Google service account. For more information, please refer to: https://developers.google.com/storage/docs/authentication#service_accounts
Recommendation: experiment with the settings depending on how much log data you generate, so the uploader can keep up with the generated logs. Using gzip output can be a good option to reduce network traffic when uploading the log files and in terms of storage costs as well.
Usage
This is an example of a Logstash config:
output {
  google_cloud_storage {
    bucket => "my_bucket"                            # (required)
    json_key_file => "/path/to/privatekey.json"      # (optional)
    temp_directory => "/tmp/logstash-gcs"            # (optional)
    log_file_prefix => "logstash_gcs"                # (optional)
    max_file_size_kbytes => 1024                     # (optional)
    output_format => "plain"                         # (optional)
    date_pattern => "%Y-%m-%dT%H:00"                 # (optional)
    flush_interval_secs => 2                         # (optional)
    gzip => false                                    # (optional)
    gzip_content_encoding => false                   # (optional)
    uploader_interval_secs => 60                     # (optional)
    include_uuid => true                             # (optional)
    include_hostname => true                         # (optional)
  }
}
Improvements TODO List
- Support Logstash event variables to determine filename.
- Turn Google API code into a Plugin Mixin (like AwsConfig).
- There’s no recovery method, so if Logstash or the plugin crashes, files may not be uploaded to GCS.
- Allow user to configure file name.
- Allow parallel uploads for heavier loads (+ connection configuration if exposed by Ruby API client)
Google_cloud_storage Output Configuration Options
This plugin supports the following configuration options plus the Common Options described later.
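Setting | Input type | Required |
---|---|---|
bucket | string | Yes |
date_pattern | string | No |
flush_interval_secs | number | No |
gzip | boolean | No |
gzip_content_encoding | boolean | No |
include_hostname | boolean | No |
include_uuid | boolean | No |
json_key_file | string | No |
key_password | string | Deprecated |
key_path | string | Obsolete |
log_file_prefix | string | No |
max_concurrent_uploads | number | No |
max_file_size_kbytes | number | No |
output_format | string, one of "json", "plain", or no value | Deprecated |
service_account | string | Deprecated |
temp_directory | string | No |
uploader_interval_secs | number | No |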
Also see Common Options for a list of options supported by all output plugins.
bucket
- This is a required setting.
- Value type is string
- There is no default value for this setting.
GCS bucket name, without "gs://" or any other prefix.
date_pattern
- Value type is string
- Default value is "%Y-%m-%dT%H:00"
Time pattern for the log file name; defaults to hourly files. Must follow Time.strftime patterns: www.ruby-doc.org/core-2.0/Time.html#method-i-strftime
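As a minimal sketch of a non-default pattern: assuming the value is handed to Time.strftime as described above, dropping the hour component should roll one file per day instead of one per hour. The bucket name is the placeholder from the usage example.

```
output {
  google_cloud_storage {
    bucket => "my_bucket"
    # Assumption: no hour component in the pattern, so files roll daily.
    date_pattern => "%Y-%m-%d"
  }
}
```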
flush_interval_secs
- Value type is number
- Default value is 2
Flush interval in seconds for flushing writes to log files. 0 will flush on every message.
gzip
- Value type is boolean
- Default value is false
Gzip output stream when writing events to log files, set Content-Type to application/gzip instead of text/plain, and use file suffix .log.gz instead of .log.
gzip_content_encoding
Added in 3.3.0.
- Value type is boolean
- Default value is false
Gzip output stream when writing events to log files and set Content-Encoding to gzip.
This will upload your files as gzip, saving network and storage costs, but they will be transparently decompressed when you read them from the storage bucket.
See the Cloud Storage documentation on metadata and transcoding for more information.
Note: It is not recommended to use both gzip_content_encoding and gzip. Doing so compresses your files twice, which increases the work your machine does and makes the files larger than compressing once.
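A minimal sketch of the recommended combination, using only the settings described above: enable gzip_content_encoding and leave gzip at its default so files are compressed once. The bucket name is a placeholder.

```
output {
  google_cloud_storage {
    bucket => "my_bucket"
    # Compress uploads and set Content-Encoding to gzip.
    gzip_content_encoding => true
    # Default value, shown explicitly: avoids compressing twice.
    gzip => false
  }
}
```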
include_hostname
Added in 3.1.0.
- Value type is boolean
- Default value is true
Whether to include the hostname in the file name. You may want to turn this off for privacy reasons, or if you run multiple instances of Logstash and need to match the files you create with a simple glob, for example to import them into BigQuery.
include_uuid
Added in 3.1.0.
- Value type is boolean
- Default value is false
Adds a UUID to the end of a file name. You may want to enable this feature so files don’t clobber one another if you’re running multiple instances of Logstash or if you expect frequent node restarts.
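One way to combine include_hostname and include_uuid when several Logstash instances write to the same bucket, sketched under the assumption that you want glob-friendly names without hostnames but still need files that do not overwrite each other:

```
output {
  google_cloud_storage {
    bucket => "my_bucket"
    # Keep hostnames out of file names (privacy, simpler globs).
    include_hostname => false
    # A UUID suffix keeps files from different instances distinct.
    include_uuid => true
  }
}
```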
json_key_file
- Value type is string
- Default value is nil
The plugin can use Application Default Credentials (ADC) if it is running on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions.
Outside of Google Cloud, you will need to create a Service Account JSON key file through the web interface or with the following command:
gcloud iam service-accounts keys create key.json --iam-account my-sa-123@my-project-123.iam.gserviceaccount.com
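Once the key file exists, the plugin can be pointed at it. This is a sketch only: the path below is a placeholder for wherever you store the key.json created above, and the bucket name is likewise illustrative.

```
output {
  google_cloud_storage {
    bucket => "my_bucket"
    # Placeholder path: replace with the location of your key file.
    json_key_file => "/path/to/key.json"
  }
}
```

When running on Google Cloud with ADC available, the setting can be left at its default of nil.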
key_password
- Value type is string
- Default value is "notasecret"
Deprecated: this feature is no longer used; the setting is now a part of json_key_file.
key_path
- Value type is string
Obsolete: The PKCS12 key file format is no longer supported. Please use one of the following mechanisms:
- Application Default Credentials (ADC), configured via environment variables on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions.
- A JSON authentication key file. You can generate one in the console for the service account, as you did with the .P12 file, or with the following command:
gcloud iam service-accounts keys create key.json --iam-account my-sa-123@my-project-123.iam.gserviceaccount.com
log_file_prefix
- Value type is string
- Default value is "logstash_gcs"
Log file prefix. Log file will follow the format: <prefix>_hostname_date<.part?>.log
max_concurrent_uploads
- Value type is number
- Default value is 5
Sets the maximum number of concurrent uploads to Cloud Storage. Uploads are I/O bound, so it makes sense to tune this parameter according to the available network bandwidth and the latency between your server and Cloud Storage.
max_file_size_kbytes
- Value type is number
- Default value is 10000
Sets the maximum file size in kilobytes. 0 disables the maximum file size check.
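For heavier loads, max_file_size_kbytes and max_concurrent_uploads can be tuned together. The values below are illustrative assumptions, not recommendations; suitable numbers depend on your log volume, bandwidth, and latency to Cloud Storage.

```
output {
  google_cloud_storage {
    bucket => "my_bucket"
    # Illustrative: larger files mean fewer, bigger uploads.
    max_file_size_kbytes => 50000
    # Illustrative: allow more uploads in flight than the default of 5.
    max_concurrent_uploads => 10
  }
}
```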
output_format
- Value can be any of: json, plain, or no value
- Default value is no value
Deprecated: this feature will be removed in the next major release. Use codecs instead.
- If you are using the json value today, switch to the json_lines codec.
- If you are using the plain value today, switch to the line codec.
The event format you want to store in files. Defaults to plain text.
Note: if you want to use a codec, you MUST NOT set this value.
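The migration described above can be sketched as follows for a configuration that previously set output_format to json. The codec option is the common option documented below; the bucket name is a placeholder.

```
output {
  google_cloud_storage {
    bucket => "my_bucket"
    # Replaces the deprecated setting: output_format => "json"
    codec => "json_lines"
    # output_format is intentionally left unset, as required when using a codec.
  }
}
```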
service_account
- This is a required setting.
- Value type is string
- There is no default value for this setting.
Deprecated: this feature is no longer used; the setting is now a part of json_key_file.
Common Options
The following configuration options are supported by all output plugins:
codec
- Value type is codec
- Default value is "line"
The codec used for output data. Output codecs are a convenient method for encoding your data before it leaves the output without needing a separate filter in your Logstash pipeline.
enable_metric
- Value type is boolean
- Default value is true
Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.
id
- Value type is string
- There is no default value for this setting.
Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one.
It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, two google_cloud_storage outputs.
Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.
output {
  google_cloud_storage {
    id => "my_plugin_id"
  }
}