Custom filters

Custom filters, including ingest pipeline filters and APM agent filters, allow you to filter or redact APM data on ingestion.
Ingest pipeline filters

Ingest pipelines specify a series of processors that transform data in a specific way. Transformation happens prior to indexing, so it adds no performance overhead to the monitored application. Pipelines are a flexible and easy way to filter or obfuscate Elastic APM data.
Features of this approach:
- Filters are applied at ingestion time.
- All Elastic APM agents and fields are supported.
- Data leaves the instrumented service.
- There are no performance overhead implications on the instrumented service.
For a step-by-step example, refer to Tutorial: Use an ingest pipeline to redact sensitive information.
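For instance, a minimal pipeline filter might simply drop a single sensitive field on ingest. The following is an illustrative sketch, not part of any default APM setup; the pipeline name and field are hypothetical:

```console
PUT _ingest/pipeline/apm_drop_sensitive_label
{
  "description": "Example only: drop a hypothetical sensitive label from APM documents",
  "processors": [
    {
      "remove": {
        "field": "labels.customer_ssn",
        "ignore_missing": true
      }
    }
  ]
}
```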
APM agent filters

Some APM agents offer a way to manipulate or drop APM events before they are sent to APM Server.
Features of this approach:
- Data is sanitized before leaving the instrumented service.
- Not supported by all Elastic APM agents.
- Potential overhead implications on the instrumented service.
Refer to the relevant agent’s documentation for more information and examples:
- .NET: Filter API.
- Node.js: `addFilter()`.
- Python: custom processors.
- Ruby: `add_filter()`.
Tutorial: Use an ingest pipeline to redact sensitive information

Say you decide to capture HTTP request bodies but quickly notice that sensitive information is being collected in the `http.request.body.original` field:

```json
{
  "email": "test@abc.com",
  "password": "hunter2"
}
```
To obfuscate the passwords stored in the request body, you can use a series of ingest processors.
Create a pipeline

This tutorial uses the Ingest APIs, but it's also possible to create a pipeline using the UI. In Kibana, go to Stack Management → Ingest Pipelines → Create pipeline → New pipeline, or use the global search field.
To start, create a pipeline with a simple description and an empty array of processors:
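```console
PUT _ingest/pipeline/apm_redacted_body_password
{
  "description": "redact http.request.body.original.password",
  "processors": []
}
```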
Add a JSON processor

Add your first processor to the processors array. Because the agent captures the request body as a string, use the JSON processor to convert the original field value into a structured JSON object. Save this JSON object in a new field:
{ "json": { "field": "http.request.body.original", "target_field": "http.request.body.original_json", "ignore_failure": true } }
Add a set processor

If `body.original_json` is not `null`, i.e., it exists, we'll redact the password with the set processor by setting the value of `body.original_json.password` to `"redacted"`:
{ "set": { "field": "http.request.body.original_json.password", "value": "redacted", "if": "ctx?.http?.request?.body?.original_json != null" } }
Add a convert processor

Use the convert processor to convert the JSON value of `body.original_json` to a string and set it as the `body.original` value:
{ "convert": { "field": "http.request.body.original_json", "target_field": "http.request.body.original", "type": "string", "if": "ctx?.http?.request?.body?.original_json != null", "ignore_failure": true } }
Add a remove processor

Finally, use the remove processor to remove the `body.original_json` field:
{ "remove": { "field": "http.request.body.original_json", "if": "ctx?.http?.request?.body?.original_json != null", "ignore_failure": true } }
Register the pipeline

Then put it all together, and use the create or update pipeline API to register the new pipeline in Elasticsearch. Name the pipeline `apm_redacted_body_password`:
```console
PUT _ingest/pipeline/apm_redacted_body_password
{
  "description": "redact http.request.body.original.password",
  "processors": [
    {
      "json": {
        "field": "http.request.body.original",
        "target_field": "http.request.body.original_json",
        "ignore_failure": true
      }
    },
    {
      "set": {
        "field": "http.request.body.original_json.password",
        "value": "redacted",
        "if": "ctx?.http?.request?.body?.original_json != null"
      }
    },
    {
      "convert": {
        "field": "http.request.body.original_json",
        "target_field": "http.request.body.original",
        "type": "string",
        "if": "ctx?.http?.request?.body?.original_json != null",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "http.request.body.original_json",
        "if": "ctx?.http?.request?.body?.original_json != null",
        "ignore_failure": true
      }
    }
  ]
}
```
Test the pipeline

Prior to enabling this new pipeline, you can test it with the simulate pipeline API. This API allows you to run multiple documents through a pipeline to ensure it is working correctly.
The request below simulates running three different documents through the pipeline:
```console
POST _ingest/pipeline/apm_redacted_body_password/_simulate
{
  "docs": [
    {
      "_source": {
        "http": {
          "request": {
            "body": {
              "original": """{"email": "test@abc.com", "password": "hunter2"}"""
            }
          }
        }
      }
    },
    {
      "_source": {
        "some-other-field": true
      }
    },
    {
      "_source": {
        "http": {
          "request": {
            "body": {
              "original": """["invalid json" """
            }
          }
        }
      }
    }
  ]
}
```

In this request:

- The first document features the same sensitive data from the original example above.
- The second document only contains an unrelated field.
- The third document contains invalid JSON.
The API response should be similar to this:
{ "docs" : [ { "doc" : { "_source" : { "http" : { "request" : { "body" : { "original" : { "password" : "redacted", "email" : "test@abc.com" } } } } } } }, { "doc" : { "_source" : { "nobody" : true } } }, { "doc" : { "_source" : { "http" : { "request" : { "body" : { "original" : """["invalid json" """ } } } } } } ] }
As expected, only the first simulated document has a redacted password field. All other documents are unaffected.
Create a @custom pipeline

The final step in this process is to call the newly created `apm_redacted_body_password` pipeline from the `@custom` pipeline of the data stream you wish to edit.
`@custom` pipelines are specific to each data stream and follow a similar naming convention: `<type>-<dataset>@custom`.
As a reminder, the default APM data streams are:
- Application traces: `traces-apm-<namespace>`
- RUM and iOS agent application traces: `traces-apm.rum-<namespace>`
- APM internal metrics: `metrics-apm.internal-<namespace>`
- APM transaction metrics: `metrics-apm.transaction.<metricset.interval>-<namespace>`
- APM service destination metrics: `metrics-apm.service_destination.<metricset.interval>-<namespace>`
- APM service transaction metrics: `metrics-apm.service_transaction.<metricset.interval>-<namespace>`
- APM service summary metrics: `metrics-apm.service_summary.<metricset.interval>-<namespace>`
- Application metrics: `metrics-apm.app.<service.name>-<namespace>`
- APM error/exception logging: `logs-apm.error-<namespace>`
- Applications UI logging: `logs-apm.app.<service.name>-<namespace>`
To match a custom ingest pipeline with a data stream, follow the `<type>-<dataset>@custom` template, or replace `-<namespace>` with `@custom` in the list above.
For example, to target application traces, you'd create a pipeline named `traces-apm@custom`.
Use the create or update pipeline API to register the new pipeline in Elasticsearch. Name the pipeline `traces-apm@custom`:
```console
PUT _ingest/pipeline/traces-apm@custom
{
  "processors": [
    {
      "pipeline": {
        "name": "apm_redacted_body_password"
      }
    }
  ]
}
```
That’s it! Passwords will now be redacted from your APM HTTP body data.
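If you want to confirm the wiring end to end, one option is to run the simulate pipeline API against the `@custom` pipeline itself; nested pipeline processors are executed during simulation. A sketch, reusing the sensitive document from earlier:

```console
POST _ingest/pipeline/traces-apm@custom/_simulate
{
  "docs": [
    {
      "_source": {
        "http": {
          "request": {
            "body": {
              "original": """{"email": "test@abc.com", "password": "hunter2"}"""
            }
          }
        }
      }
    }
  ]
}
```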
Next steps

To learn more about ingest pipelines, see View the Elasticsearch index template.