HTTP JSON input
This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
Use the httpjson input to read messages from an HTTP API with JSON payloads.
For example, this input is used to retrieve MISP threat indicators in the Filebeat MISP module.
This input supports retrieval at a configurable interval and pagination.
Example configurations:
filebeat.inputs:
# Fetch your public IP every minute.
- type: httpjson
  url: https://api.ipify.org/?format=json
  interval: 1m
  processors:
    - decode_json_fields:
        fields: [message]
        target: json

filebeat.inputs:
- type: httpjson
  url: http://localhost:9200/_search?scroll=5m
  http_method: POST
  json_objects_array: hits.hits
  pagination:
    extra_body_content:
      scroll: 5m
    id_field: _scroll_id
    req_field: scroll_id
    url: http://localhost:9200/_search/scroll
The input also supports authentication via HTTP headers, an API key, or OAuth2.
Example configurations with authentication:
filebeat.inputs:
- type: httpjson
  http_headers:
    Authorization: 'Basic aGVsbG86d29ybGQ='
  url: http://localhost

filebeat.inputs:
- type: httpjson
  oauth2:
    client.id: 12345678901234567890abcdef
    client.secret: abcdef12345678901234567890
    token_url: http://localhost/oauth2/token
  url: http://localhost
Configuration options
The httpjson input supports the following configuration options plus the Common options described later.
api_key
API key to access the HTTP API. When set, this adds an Authorization header to the HTTP request with this as the value.
http_client_timeout
Duration before declaring that the HTTP client connection has timed out. Defaults to 60s. Valid time units are ns, us, ms, s (default), m, and h.
http_headers
Additional HTTP headers to set in the requests. The default value is null (no additional headers).
- type: httpjson
  http_headers:
    Authorization: 'Basic aGVsbG86d29ybGQ='
http_method
HTTP method to use when making requests. GET or POST are the options. Defaults to GET.
http_request_body
An optional HTTP POST body. The configuration value must be an object, and it will be encoded to JSON. This is only valid when http_method is POST. Defaults to null (no HTTP body).
- type: httpjson
  http_method: POST
  http_request_body:
    query:
      bool:
        filter:
          term:
            type: authentication
interval
Duration between repeated requests. By default, the interval is 0, which means the input performs a single request and then stops. If pagination is enabled, it may still make additional pagination requests in response to that initial request.
json_objects_array
If the response body contains a JSON object containing an array, then this option specifies the key containing that array. Each object in that array will generate an event. This example response contains an array called events that we want to index.
{
  "time": "2020-06-02 23:22:32 UTC",
  "events": [
    {
      "timestamp": "2020-05-02 11:10:03 UTC",
      "event": {
        "category": "authorization"
      },
      "user": {
        "name": "fflintstone"
      }
    },
    {
      "timestamp": "2020-05-05 13:03:11 UTC",
      "event": {
        "category": "authorization"
      },
      "user": {
        "name": "brubble"
      }
    }
  ]
}
The config needs to specify events as the json_objects_array value.
- type: httpjson
  json_objects_array: events
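The selection that json_objects_array describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Filebeat's implementation: the helper name iter_objects is hypothetical, and the dotted-key handling is an assumption based on the hits.hits value used in the example configuration near the top of this page.

```python
from typing import Any, Dict, Iterator


def iter_objects(body: Dict[str, Any], key: str) -> Iterator[Dict[str, Any]]:
    """Yield each object of the array found at `key` in the response body.

    Assumption: a dotted key such as "hits.hits" walks into nested objects.
    """
    node: Any = body
    for part in key.split("."):
        node = node[part]
    for obj in node:
        yield obj


response = {
    "time": "2020-06-02 23:22:32 UTC",
    "events": [
        {"timestamp": "2020-05-02 11:10:03 UTC", "user": {"name": "fflintstone"}},
        {"timestamp": "2020-05-05 13:03:11 UTC", "user": {"name": "brubble"}},
    ],
}

# One event per element of the "events" array.
events = list(iter_objects(response, "events"))
nested = list(iter_objects({"hits": {"hits": [{"a": 1}]}}, "hits.hits"))
```

Note that the top-level time field is not carried into the generated events in this sketch, which is the difference from split_events_by.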
split_events_by
If the response body contains a JSON object containing an array, then this option specifies the key containing that array. Each object in that array will generate an event, and each event also keeps the common fields of the document. For example, given this response:
{
  "time": "2020-06-02 23:22:32 UTC",
  "user": "Bob",
  "events": [
    {
      "timestamp": "2020-05-02 11:10:03 UTC",
      "event": {
        "category": "authorization"
      }
    },
    {
      "timestamp": "2020-05-05 13:03:11 UTC",
      "event": {
        "category": "authorization"
      }
    }
  ]
}
The config needs to specify events as the split_events_by value.
- type: httpjson
  split_events_by: events
The input will then output the following events:
[
  {
    "time": "2020-06-02 23:22:32 UTC",
    "user": "Bob",
    "events": {
      "timestamp": "2020-05-02 11:10:03 UTC",
      "event": {
        "category": "authorization"
      }
    }
  },
  {
    "time": "2020-06-02 23:22:32 UTC",
    "user": "Bob",
    "events": {
      "timestamp": "2020-05-05 13:03:11 UTC",
      "event": {
        "category": "authorization"
      }
    }
  }
]
It can be used in combination with json_objects_array, in which case the split_events_by field is looked up inside each element of that array.
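The split described for split_events_by can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than Filebeat's code: the function name split_events is hypothetical, and the sketch only covers the case where the key sits at the top level of the document.

```python
import copy
from typing import Any, Dict, List


def split_events(body: Dict[str, Any], key: str) -> List[Dict[str, Any]]:
    """Return one event per element of body[key], keeping the other fields."""
    # Fields shared by every generated event (everything except the array).
    common = {k: v for k, v in body.items() if k != key}
    events = []
    for element in body[key]:
        event = copy.deepcopy(common)
        # The array is replaced by the single element for this event.
        event[key] = copy.deepcopy(element)
        events.append(event)
    return events


response = {
    "time": "2020-06-02 23:22:32 UTC",
    "user": "Bob",
    "events": [
        {"timestamp": "2020-05-02 11:10:03 UTC", "event": {"category": "authorization"}},
        {"timestamp": "2020-05-05 13:03:11 UTC", "event": {"category": "authorization"}},
    ],
}

split = split_events(response, "events")
```

Applied to the sample response, split holds two events that both carry time and user, matching the output listed above.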
no_http_body
Force HTTP requests to be sent with an empty HTTP body. Defaults to false. This option cannot be used with http_request_body, pagination.extra_body_content, or pagination.req_field.
pagination.enabled
The enabled setting can be used to disable the pagination configuration by setting it to false. The default value is true.
Pagination settings are disabled if either enabled is set to false or the pagination section is missing.
pagination.extra_body_content
An object containing additional fields that should be included in the pagination request body. Defaults to null.
- type: httpjson
  pagination.extra_body_content:
    max_items: 500
pagination.header.field_name
The name of the HTTP header in the response that is used for pagination control. The header value will be extracted from the response and used to make the next pagination request. pagination.header.regex_pattern can be used to select a subset of the value.
pagination.header.regex_pattern
The regular expression pattern to use for retrieving the pagination information from the HTTP header field specified above. The first match is used as the value.
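As a rough illustration of header-based pagination, the Python sketch below pulls a value out of a response header with a regular expression. Several things here are assumptions and not taken from this page: the Link-style header value, the example.com URL, the helper name next_page_value, and the choice to prefer the first capture group when the pattern defines one. Python's re syntax may also differ from the regular expression syntax Filebeat accepts.

```python
import re
from typing import Optional


def next_page_value(header_value: str, pattern: str) -> Optional[str]:
    """Return the part of the header selected by the pattern, or None."""
    match = re.search(pattern, header_value)
    if match is None:
        return None
    # Assumption: use the first capture group if the pattern has one,
    # otherwise fall back to the whole match.
    return match.group(1) if match.groups() else match.group(0)


# Hypothetical header value in the style of an HTTP Link header.
link = '<https://api.example.com/items?page=2>; rel="next"'
next_url = next_page_value(link, r'<([^>]+)>;\s*rel="next"')
no_match = next_page_value("", r'<([^>]+)>;\s*rel="next"')
```

When the pattern does not match, the sketch returns None, which a caller could treat as the end of pagination.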
pagination.id_field
The name of a field in the JSON response body to use as the pagination ID. The value will be included in the next pagination request under the key specified by the pagination.req_field value.
pagination.req_field
The name of the field to include in the pagination JSON request body containing the pagination ID defined by the pagination.id_field field.
pagination.url
This specifies the URL for sending pagination requests. Defaults to the url value. This is only needed when the pagination requests need to be routed to a different URL.
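To show how pagination.id_field, pagination.req_field, pagination.extra_body_content, and pagination.url relate to each other, here is a minimal Python sketch of a body-based pagination loop. It is not Filebeat's implementation. The paginate and fake_fetch names are hypothetical, the transport is an in-memory stand-in so the example runs without a server, and stopping when the ID field is absent is an assumption made for this sketch.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Fetch = Callable[[str, Optional[Dict[str, Any]]], Dict[str, Any]]


def paginate(fetch: Fetch, url: str, pagination_url: str,
             id_field: str, req_field: str,
             extra_body_content: Optional[Dict[str, Any]] = None
             ) -> Iterator[Dict[str, Any]]:
    """Yield each response body, following the pagination ID between pages."""
    body = fetch(url, None)  # initial request
    while True:
        yield body
        page_id = body.get(id_field)
        if page_id is None:  # assumption: no ID means there is no next page
            return
        request_body = dict(extra_body_content or {})
        request_body[req_field] = page_id  # ID goes under the req_field key
        body = fetch(pagination_url, request_body)


# In-memory stand-in for the HTTP API, keyed by the ID sent in the request.
pages = {
    None: {"_scroll_id": "a", "hits": [1, 2]},
    "a": {"_scroll_id": "b", "hits": [3]},
    "b": {"hits": []},
}
requests_seen: List[Tuple[str, Optional[Dict[str, Any]]]] = []


def fake_fetch(target: str, request_body: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    requests_seen.append((target, request_body))
    page_id = None if request_body is None else request_body["scroll_id"]
    return pages[page_id]


results = list(paginate(
    fake_fetch,
    "http://localhost:9200/_search?scroll=5m",
    "http://localhost:9200/_search/scroll",
    id_field="_scroll_id",
    req_field="scroll_id",
    extra_body_content={"scroll": "5m"},
))
```

The option values mirror the scroll example at the top of this page: the first request goes to url, and every later request goes to the pagination URL with the previous _scroll_id sent as scroll_id.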
rate_limit.limit
This specifies the field in the HTTP header of the response that specifies the total limit.
rate_limit.remaining
This specifies the field in the HTTP header of the response that specifies the remaining quota of the rate limit.
rate_limit.reset
This specifies the field in the HTTP header of the response that specifies the epoch time when the rate limit will reset.
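The rate_limit options only name response headers, so the Python sketch below shows one way a client could use the remaining-quota and reset-time headers to decide how long to pause. This is an assumption about typical usage, not a description of Filebeat's behavior. The header names X-RateLimit-Remaining and X-RateLimit-Reset and the function name seconds_until_reset are hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        remaining_field: str,
                        reset_field: str,
                        now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, in seconds."""
    now = time.time() if now is None else now
    # Assumption: a missing "remaining" header means no limit is in effect.
    remaining = int(headers.get(remaining_field, "1"))
    if remaining > 0:
        return 0.0
    # The reset header carries an epoch timestamp, as described above.
    reset_epoch = float(headers[reset_field])
    return max(0.0, reset_epoch - now)


exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1600000060"}
wait = seconds_until_reset(exhausted, "X-RateLimit-Remaining",
                           "X-RateLimit-Reset", now=1600000000.0)
```

Passing now explicitly keeps the example deterministic; in real use it would default to the current time.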
retry.max_attempts
This specifies the maximum number of retries for the retryable HTTP client. Default: 5.
retry.wait_min
This specifies the minimum time to wait before a retry is attempted. Default: 1s.
retry.wait_max
This specifies the maximum time to wait before a retry is attempted. Default: 60s.
ssl
This specifies SSL/TLS configuration. If the ssl section is missing, the host’s CAs are used for HTTPS connections. See SSL for more information.
url
The URL of the HTTP API. Required.
oauth2.enabled
The enabled setting can be used to disable the oauth2 configuration by setting it to false. The default value is true.
OAuth2 settings are disabled if either enabled is set to false or the oauth2 section is missing.
oauth2.provider
The provider setting can be used to configure supported oauth2 providers. Each supported provider will require specific settings. It is not set by default. Supported providers are: azure, google.
oauth2.client.id
The client.id setting is used as part of the authentication flow. It is always required except when using google as the provider. Required for providers: default, azure.
oauth2.client.secret
The client.secret setting is used as part of the authentication flow. It is always required except when using google as the provider. Required for providers: default, azure.
oauth2.scopes
The scopes setting defines a list of scopes that will be requested during the oauth2 flow. It is optional for all providers.
oauth2.token_url
The token_url setting specifies the endpoint that will be used to generate the tokens during the oauth2 flow. It is required if no provider is specified.
For the azure provider, either token_url or azure.tenant_id is required.
oauth2.endpoint_params
The endpoint_params setting specifies a set of values that will be sent on each request to the token_url. Each param key can have multiple values. It can be set for all providers except google.
- type: httpjson
  oauth2:
    endpoint_params:
      Param1:
        - ValueA
        - ValueB
      Param2:
        - Value
oauth2.azure.tenant_id
The azure.tenant_id setting is used for authentication when using the azure provider. Because it is used to generate the token_url, the two settings cannot be used together. It is not required.
For information about where to find it, you can refer to https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal.
oauth2.azure.resource
The azure.resource setting is used to identify the accessed WebAPI resource when using the azure provider. It is not required.
oauth2.google.credentials_file
The google.credentials_file setting specifies the credentials file for Google.
Only one of the credentials settings can be set at once. If none is provided, loading default credentials from the environment will be attempted via ADC. For more information about how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.
oauth2.google.credentials_json
The google.credentials_json setting lets you provide your credentials information as raw JSON.
Only one of the credentials settings can be set at once. If none is provided, loading default credentials from the environment will be attempted via ADC. For more information about how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.
oauth2.google.jwt_file
The google.jwt_file setting specifies the JWT Account Key file for Google.
Only one of the credentials settings can be set at once. If none is provided, loading default credentials from the environment will be attempted via ADC. For more information about how to provide Google credentials, please refer to https://cloud.google.com/docs/authentication.
Common options
The following configuration options are supported by all inputs.
enabled
Use the enabled option to enable and disable inputs. By default, enabled is set to true.
tags
A list of tags that Filebeat includes in the tags field of each published event. Tags make it easy to select specific events in Kibana or apply conditional filtering in Logstash. These tags will be appended to the list of tags specified in the general configuration.
Example:
filebeat.inputs:
- type: httpjson
  . . .
  tags: ["json"]
fields
Optional fields that you can specify to add additional information to the output. For example, you might add fields that you can use for filtering log data. Fields can be scalar values, arrays, dictionaries, or any nested combination of these. By default, the fields that you specify here will be grouped under a fields sub-dictionary in the output document. To store the custom fields as top-level fields, set the fields_under_root option to true. If a duplicate field is declared in the general configuration, then its value will be overwritten by the value declared here.
filebeat.inputs:
- type: httpjson
  . . .
  fields:
    app_id: query_engine_12
fields_under_root
If this option is set to true, the custom fields are stored as top-level fields in the output document instead of being grouped under a fields sub-dictionary. If the custom field names conflict with other field names added by Filebeat, then the custom fields overwrite the other fields.
processors
A list of processors to apply to the input data.
See Processors for information about specifying processors in your config.
pipeline
The Ingest Node pipeline ID to set for the events generated by this input.
The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.
keep_null
If this option is set to true, fields with null values will be published in the output document. By default, keep_null is set to false.
index
If present, this formatted string overrides the index for events from this input (for elasticsearch outputs), or sets the raw_index field of the event’s metadata (for other outputs). This string can only refer to the agent name and version and the event timestamp; for access to dynamic fields, use output.elasticsearch.index or a processor.
Example value: "%{[agent.name]}-myindex-%{+yyyy.MM.dd}" might expand to "filebeat-myindex-2019.11.01".
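To make the expansion in the example value concrete, the following Python sketch resolves a format string of that shape. It is a simplified illustration, not Filebeat's formatter: the expand_index name is hypothetical, fields are looked up in a flat dictionary, and only the yyyy, MM, and dd date tokens from the example are handled.

```python
import re
from datetime import datetime, timezone
from typing import Dict

# Assumption: only the date tokens used in the example are translated.
_DATE_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def expand_index(fmt: str, fields: Dict[str, str], timestamp: datetime) -> str:
    """Expand %{[field]} and %{+date-pattern} expressions in fmt."""
    def replace(match: "re.Match[str]") -> str:
        expr = match.group(1)
        if expr.startswith("+"):
            # Date expression: translate the tokens to strftime directives.
            pattern = re.sub("yyyy|MM|dd",
                             lambda m: _DATE_TOKENS[m.group(0)], expr[1:])
            return timestamp.strftime(pattern)
        # Field reference such as [agent.name].
        return fields[expr.strip("[]")]

    return re.sub(r"%\{([^}]+)\}", replace, fmt)


index_name = expand_index(
    "%{[agent.name]}-myindex-%{+yyyy.MM.dd}",
    {"agent.name": "filebeat"},
    datetime(2019, 11, 1, tzinfo=timezone.utc),
)
```

With an agent name of filebeat and an event timestamp of 1 November 2019, index_name matches the expanded value given in the example.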
publisher_pipeline.disable_host
By default, all events contain host.name. This option can be set to true to disable the addition of this field to all events. The default value is false.