file

edit

Stream events from files, normally by tailing them in a manner similar to tail -0F but optionally reading them from the beginning.

By default, each event is assumed to be one line. If you would like to join multiple log lines into one event, you’ll want to use the multiline codec or filter.

The plugin aims to track changing files and emit new content as it’s appended to each file. It’s not well-suited for reading a file from beginning to end and storing all of it in a single event (not even with the multiline codec or filter).

Tracking of current position in watched files

edit

The plugin keeps track of the current position in each file by recording it in a separate file named sincedb. This makes it possible to stop and restart Logstash and have it pick up where it left off without missing the lines that were added to the file while Logstash was stopped.

By default, the sincedb file is placed in the home directory of the user running Logstash with a filename based on the filename patterns being watched (i.e. the path option). Thus, changing the filename patterns will result in a new sincedb file being used and any existing current position state will be lost. If you change your patterns with any frequency it might make sense to explicitly choose a sincedb path with the sincedb_path option.

Sincedb files are text files with four columns:

  1. The inode number (or equivalent).
  2. The major device number of the file system (or equivalent).
  3. The minor device number of the file system (or equivalent).
  4. The current byte offset within the file.

On non-Windows systems you can obtain the inode number of a file with e.g. ls -li.

File rotation

edit

File rotation is detected and handled by this input, regardless of whether the file is rotated via a rename or a copy operation. To support programs that write to the rotated file for some time after the rotation has taken place, include both the original filename and the rotated filename (e.g. /var/log/syslog and /var/log/syslog.1) in the filename patterns to watch (the path option). Note that the rotated filename will be treated as a new file so if start_position is set to beginning the rotated file will be reprocessed.

With the default value of start_position (end) any messages written to the end of the file between the last read operation prior to the rotation and its reopening under the new name (an interval determined by the stat_interval and discover_interval options) will not get picked up.

 

Synopsis

edit

This plugin supports the following configuration options:

Required configuration options:

file {
    path => ...
}

Available configuration options:

Setting Input type Required Default value

add_field

hash

No

{}

codec

codec

No

"plain"

delimiter

string

No

"\n"

discover_interval

number

No

15

exclude

array

No

path

array

Yes

sincedb_path

string

No

sincedb_write_interval

number

No

15

start_position

string, one of ["beginning", "end"]

No

"end"

stat_interval

number

No

1

tags

array

No

type

string

No

Details

edit

 

add_field

edit
  • Value type is hash
  • Default value is {}

Add a field to an event

charset (DEPRECATED)

edit
  • DEPRECATED WARNING: This configuration item is deprecated and may not be available in future versions. <li> Value type is string
  • There is no default value for this setting.

The character encoding used in this input. Examples include UTF-8 and cp1252

This setting is useful if your log files are in Latin-1 (aka cp1252) or in another character set other than UTF-8.

This only affects plain format logs since json is UTF-8 already.

codec

edit
  • Value type is codec
  • Default value is "plain"

The codec used for input data. Input codecs are a convenient method for decoding your data before it enters the input, without needing a separate filter in your Logstash pipeline.

debug (DEPRECATED)

edit
  • DEPRECATED WARNING: This configuration item is deprecated and may not be available in future versions.
  • Value type is boolean
  • Default value is false

delimiter

edit
  • Value type is string
  • Default value is "\n"

set the new line delimiter, defaults to "\n"

discover_interval

edit
  • Value type is number
  • Default value is 15

How often (in seconds) we expand globs to discover new files to watch.

exclude

edit
  • Value type is array
  • There is no default value for this setting.

Exclusions (matched against the filename, not full path). Globs are valid here, too. For example, if you have

    path => "/var/log/*"

You might want to exclude gzipped files:

    exclude => "*.gz"

format (DEPRECATED)

edit
  • DEPRECATED WARNING: This configuration item is deprecated and may not be available in future versions.
  • Value can be any of: plain, json, json_event, msgpack_event
  • There is no default value for this setting.

The format of input data (plain, json, json_event)

message_format (DEPRECATED)

edit
  • DEPRECATED WARNING: This configuration item is deprecated and may not be available in future versions.
  • Value type is string
  • There is no default value for this setting.

If format is json, an event sprintf string to build what the display @message should be given (defaults to the raw JSON). sprintf format strings look like %{fieldname}

If format is json_event, ALL fields except for @type are expected to be present. Not receiving all fields will cause unexpected results.

path

edit
  • This is a required setting.
  • Value type is array
  • There is no default value for this setting.

TODO(sissel): This should switch to use the line codec by default once file following The path(s) to the file(s) to use as an input. You can use globs here, such as /var/log/*.log Paths must be absolute and cannot be relative.

You may also configure multiple paths. See an example on the Logstash configuration page.

sincedb_path

edit
  • Value type is string
  • There is no default value for this setting.

Path of the sincedb database file (keeps track of the current position of monitored log files) that will be written to disk. The default will write sincedb files to some path matching $HOME/.sincedb* NOTE: it must be a file path and not a directory path

sincedb_write_interval

edit
  • Value type is number
  • Default value is 15

How often (in seconds) to write a since database with the current position of monitored log files.

start_position

edit
  • Value can be any of: beginning, end
  • Default value is "end"

Choose where Logstash starts initially reading files: at the beginning or at the end. The default behavior treats files like live streams and thus starts at the end. If you have old data you want to import, set this to beginning

This option only modifies "first contact" situations where a file is new and not seen before. If a file has already been seen before, this option has no effect.

stat_interval

edit
  • Value type is number
  • Default value is 1

How often (in seconds) we stat files to see if they have been modified. Increasing this interval will decrease the number of system calls we make, but increase the time to detect new log lines.

tags

edit
  • Value type is array
  • There is no default value for this setting.

Add any number of arbitrary tags to your event.

This can help with processing later.

type

edit
  • Value type is string
  • There is no default value for this setting.

Add a type field to all events handled by this input.

Types are used mainly for filter activation.

The type is stored as part of the event itself, so you can also use the type to search for it in Kibana.

If you try to set a type on an event that already has one (for example when you send an event from a shipper to an indexer) then a new input will not override the existing type. A type set at the shipper stays with that event for its life even when sent to another Logstash server.