jdbc
This plugin ingests data from any database with a JDBC interface into
Logstash. You can schedule ingestion periodically using a cron syntax
(see the schedule setting) or run the query once to load data into Logstash.
Each row in the result set becomes a single event, and each column in the
result set is converted into a field in the event.
Drivers
This plugin does not come packaged with JDBC driver libraries. The desired
JDBC driver library must be explicitly passed in to the plugin using the
jdbc_driver_library configuration option.
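As a minimal sketch, the driver might be supplied as follows. The directory /opt/jdbc is a hypothetical location chosen for illustration; the driver jar, driver class, connection string, and user are the MySQL values from the Usage section of this page.

```conf
input {
  jdbc {
    # Hypothetical location of the driver jar; use wherever you stored it.
    jdbc_driver_library => "/opt/jdbc/mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    statement => "SELECT * from songs"
  }
}
```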
Scheduling
Input from this plugin can be scheduled to run periodically according to a specific schedule. This scheduling syntax is powered by rufus-scheduler. The syntax is cron-like with some extensions specific to Rufus (e.g. timezone support).
Examples:
* 5 * 1-3 * | will execute every minute of 5am every day of January through March. |
0 * * * * | will execute on the 0th minute of every hour every day. |
0 6 * * * America/Chicago | will execute at 6:00am (UTC/GMT -5) every day. |
Further documentation describing this syntax can be found in the rufus-scheduler documentation.
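For illustration, the "0th minute of every hour" case described above might be configured like this. The expression assumes standard five-field cron ordering (minute, hour, day of month, month, day of week); the connection values are borrowed from the Usage section of this page.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # minute=0, every hour, every day of month, every month, every day of week
    schedule => "0 * * * *"
    statement => "SELECT * from songs"
  }
}
```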
State
The plugin persists the sql_last_start parameter in a metadata file stored at
the configured last_run_metadata_path. On shutdown, this file is updated with
the current value of sql_last_start. The next time the pipeline starts, the
value is restored by reading it from the file. If clean_run is set to true,
the stored value is ignored and sql_last_start is set to Jan 1, 1970, as if
no query had ever been executed.
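A sketch of how these state settings might be combined is shown here. The metadata path /var/lib/logstash/songs_last_run is a hypothetical location, not a plugin default; the other values come from the Usage section of this page.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    statement => "SELECT * from songs"
    # Hypothetical path for the persisted sql_last_start value.
    last_run_metadata_path => "/var/lib/logstash/songs_last_run"
    # Keep the stored state between restarts (this is the default).
    clean_run => false
  }
}
```

Setting clean_run to true for one run would ignore the stored value and start again from Jan 1, 1970.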
Dealing With Large Result-sets
Many JDBC drivers use the fetch_size parameter to limit how many results are
pre-fetched at a time from the cursor into the client’s cache before
retrieving more results from the result set. This is configured in this
plugin using the jdbc_fetch_size configuration option. No fetch size is set
by default in this plugin, so the specific driver’s default size will be used.
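As an illustration, a fetch size could be set as follows. The value 1000 is an arbitrary number picked for the sketch, not a recommendation; how it is honored depends on the driver.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # Arbitrary illustrative value; omit to use the driver's default.
    jdbc_fetch_size => 1000
    statement => "SELECT * from songs"
  }
}
```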
Usage:
Here is an example of setting up the plugin to fetch data from a MySQL database.
First, we place the appropriate JDBC driver library in our current
path (it can be placed anywhere on your filesystem). In this example, we connect to
the mydb database as the user mysql and wish to input all rows in the songs
table that match a specific artist. The following example demonstrates a possible
Logstash configuration for this. The schedule option in this example
instructs the plugin to execute the input statement every minute, on the minute.
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    schedule => "* * * * *"
    statement => "SELECT * from songs where artist = :favorite_artist"
  }
}
Configuring SQL statement
A SQL statement is required for this input. It can be passed in via the
statement option as a string, or read from a file (statement_filepath). The
file option is typically used when the SQL statement is large or cumbersome to supply in the config.
The file option only supports one SQL statement. The plugin accepts only one of the two options:
it cannot read a statement from a file and from the statement configuration
parameter at the same time.
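A minimal sketch of the file-based variant follows. The path /etc/logstash/sql/songs.sql is hypothetical; the file is assumed to contain a single SQL statement such as the one shown in the comment.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    # Hypothetical file holding exactly one statement, for example:
    #   SELECT * from songs where artist = :favorite_artist
    statement_filepath => "/etc/logstash/sql/songs.sql"
    # Do not also set statement; only one of the two may be used.
  }
}
```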
Predefined Parameters
Some parameters are built in and can be used from within your queries. Here is the list:
sql_last_start |
The last time a statement was executed. This is set to Thursday, 1 January 1970 before any query is run, and is updated accordingly after the first query is run. |
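As a sketch, sql_last_start might be used to fetch only rows changed since the previous run. The updated_at column is a hypothetical timestamp column assumed to exist on the songs table; the other values come from the Usage section of this page.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    schedule => "* * * * *"
    # updated_at is a hypothetical column; :sql_last_start is the built-in parameter.
    statement => "SELECT * from songs where updated_at > :sql_last_start"
  }
}
```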
Synopsis
This plugin supports the following configuration options:
Required configuration options:
jdbc {
  jdbc_connection_string => ...
  jdbc_driver_class => ...
  jdbc_user => ...
}
Available configuration options:
Setting | Input type | Required | Default value |
---|---|---|---|
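add_field | hash | No | {} |
clean_run | boolean | No | false |
codec | codec | No | "plain" |
jdbc_connection_string | string | Yes | |
jdbc_driver_class | string | Yes | |
jdbc_driver_library | a valid filesystem path | No | |
jdbc_fetch_size | number | No | |
jdbc_page_size | number | No | 100000 |
jdbc_paging_enabled | boolean | No | false |
jdbc_password | password | No | |
jdbc_pool_timeout | number | No | 5 |
jdbc_user | string | Yes | |
jdbc_validate_connection | boolean | No | false |
jdbc_validation_timeout | number | No | 3600 |
last_run_metadata_path | string | No | "$USER_HOME/.logstash_jdbc_last_run" |
parameters | hash | No | {} |
record_last_run | boolean | No | true |
schedule | string | No | |
sequel_opts | hash | No | {} |
statement | string | No | |
statement_filepath | a valid filesystem path | No | |
tags | array | No | |
type | string | No | |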
Details
clean_run
- Value type is boolean
- Default value is false
Whether to discard the previous run state. When set to true, the persisted sql_last_start value is ignored.
codec
- Value type is codec
- Default value is "plain"
The codec used for input data. Input codecs are a convenient method for decoding your data before it enters the input, without needing a separate filter in your Logstash pipeline.
jdbc_connection_string
- This is a required setting.
- Value type is string
- There is no default value for this setting.
JDBC connection string.
jdbc_driver_class
- This is a required setting.
- Value type is string
- There is no default value for this setting.
JDBC driver class to load, for example, "org.apache.derby.jdbc.ClientDriver".
If you are using the Oracle JDBC driver (ojdbc6.jar), the correct
jdbc_driver_class is "Java::oracle.jdbc.driver.OracleDriver".
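A sketch of an Oracle configuration using that driver class is shown here. The jar location, connection string, and user name are all hypothetical placeholders for illustration; only the driver class and jar file name come from the text above.

```conf
input {
  jdbc {
    # Hypothetical location of the Oracle driver jar.
    jdbc_driver_library => "/opt/jdbc/ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # Hypothetical host, port, and service name.
    jdbc_connection_string => "jdbc:oracle:thin:@//localhost:1521/mydb"
    # Hypothetical user.
    jdbc_user => "logstash"
    statement => "SELECT * from songs"
  }
}
```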
jdbc_driver_library
- Value type is path
- There is no default value for this setting.
Path to the third-party JDBC driver library. If not provided, the plugin will look for the driver class in the Logstash Java classpath.
jdbc_fetch_size
- Value type is number
- There is no default value for this setting.
JDBC fetch size. If not provided, the respective driver’s default will be used.
jdbc_paging_enabled
- Value type is boolean
- Default value is false
Enables JDBC paging. This causes a SQL statement to be broken up into multiple queries. Each query uses limits and offsets to collectively retrieve the full result set. The limit size is set with jdbc_page_size.
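As a sketch, paging might be enabled as follows. The page size of 50000 is an arbitrary value chosen for illustration; the remaining values come from the Usage section of this page.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # Split the statement into several limit/offset queries.
    jdbc_paging_enabled => true
    # Arbitrary illustrative limit for each of those queries.
    jdbc_page_size => 50000
    statement => "SELECT * from songs"
  }
}
```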
jdbc_pool_timeout
- Value type is number
- Default value is 5
The number of seconds to wait to acquire a connection before raising a PoolTimeoutError.
jdbc_user
- This is a required setting.
- Value type is string
- There is no default value for this setting.
JDBC user.
jdbc_validate_connection
- Value type is boolean
- Default value is false
Connection pool configuration. Validate connection before use.
jdbc_validation_timeout
- Value type is number
- Default value is 3600
Connection pool configuration. How often to validate a connection (in seconds).
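The two validation settings might be combined as in this sketch. The 600-second interval is an arbitrary illustrative value; the connection values come from the Usage section of this page.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # Validate pooled connections before use.
    jdbc_validate_connection => true
    # Arbitrary illustrative interval, in seconds (default is 3600).
    jdbc_validation_timeout => 600
    schedule => "* * * * *"
    statement => "SELECT * from songs"
  }
}
```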
last_run_metadata_path
- Value type is string
- Default value is "$USER_HOME/.logstash_jdbc_last_run"
Path to the file that stores the last run time.
parameters
- Value type is hash
- Default value is {}
Hash of query parameters, for example { "target_id" => "321" }
record_last_run
- Value type is boolean
- Default value is true
Whether to save state in last_run_metadata_path.
schedule
- Value type is string
- There is no default value for this setting.
Schedule of when to periodically run the statement, in cron format, for example: "* * * * *" (execute query every minute, on the minute).
There is no schedule by default. If no schedule is given, then the statement is run exactly once.
sequel_opts
- Value type is hash
- Default value is {}
General or vendor-specific Sequel configuration options. An example of an optional connection pool configuration option is
max_connections
- The maximum number of connections in the connection pool.
Examples of vendor-specific options can be found in this documentation page: https://github.com/jeremyevans/sequel/blob/master/doc/opening_databases.rdoc
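A sketch of passing max_connections through sequel_opts follows. The value 4 is arbitrary, and writing the option name as a string key is an assumption about how the hash is handed to Sequel; check the plugin version you run.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # Assumed key form; 4 is an arbitrary illustrative pool size.
    sequel_opts => { "max_connections" => 4 }
    statement => "SELECT * from songs"
  }
}
```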
statement
- Value type is string
- There is no default value for this setting.
Statement to execute. If no statement is defined here or through statement_filepath, Logstash will complain.
To use parameters, use named parameter syntax. For example:
"SELECT * FROM MYTABLE WHERE id = :target_id"
Here, ":target_id" is a named parameter. You can configure named parameters
with the parameters setting.
statement_filepath
- Value type is path
- There is no default value for this setting.
Path of the file containing the statement to execute.
tags
- Value type is array
- There is no default value for this setting.
Add any number of arbitrary tags to your event.
This can help with processing later.
type
- Value type is string
- There is no default value for this setting.
Add a type field to all events handled by this input.
Types are used mainly for filter activation.
The type is stored as part of the event itself, so you can also use the type to search for it in Kibana.
If you try to set a type on an event that already has one (for example, when you send an event from a shipper to an indexer), a new input will not override the existing type. A type set at the shipper stays with that event for its life, even when it is sent to another Logstash server.
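To illustrate filter activation, the sketch below sets a type and tags on the input and then applies a filter only to events of that type. The type name, tag values, and the source_table field are hypothetical names chosen for this example.

```conf
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    statement => "SELECT * from songs"
    # Hypothetical type and tags for illustration.
    type => "songs"
    tags => ["jdbc", "mysql"]
  }
}

filter {
  # Runs only for events whose type was set by the input above.
  if [type] == "songs" {
    mutate {
      # Hypothetical field recording where the row came from.
      add_field => { "source_table" => "songs" }
    }
  }
}
```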