Create a transform

PUT /_transform/{transform_id}

Creates a transform.

A transform copies data from source indices, transforms it, and persists it into an entity-centric destination index. You can also think of the destination index as a two-dimensional tabular data structure (known as a data frame). The ID for each document in the data frame is generated from a hash of the entity, so there is a unique row per entity.

You must choose either the latest or pivot method for your transform; you cannot use both in a single transform. If you choose to use the pivot method for your transform, the entities are defined by the set of group_by fields in the pivot object. If you choose to use the latest method, the entities are defined by the unique_key field values in the latest object.
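The either/or rule above can be checked before a request is sent. The following is a minimal sketch, not part of the API: `entity_fields` is a hypothetical helper, and returning the `group_by` keys for a pivot transform is one reading of "the set of group_by fields".

```python
def entity_fields(config: dict) -> list:
    """Return the names that define an entity for a transform config.

    Raises ValueError unless exactly one of 'pivot' or 'latest' is present.
    """
    has_pivot = "pivot" in config
    has_latest = "latest" in config
    if has_pivot == has_latest:
        # Either both methods were given or neither was.
        raise ValueError("specify exactly one of 'pivot' or 'latest'")
    if has_pivot:
        # Assumption: the entity is identified by the group_by entry names.
        return sorted(config["pivot"]["group_by"])
    return list(config["latest"]["unique_key"])


latest_config = {"latest": {"sort": "order_date", "unique_key": ["customer_id"]}}
print(entity_fields(latest_config))  # prints ['customer_id']
```

A config that contains both objects, or neither, raises `ValueError` in this sketch; the server performs its own validation regardless.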

You must have create_index, index, and read privileges on the destination index and read and view_index_metadata privileges on the source indices. When Elasticsearch security features are enabled, the transform remembers which roles the user that created it had at the time of creation and uses those same roles. If those roles do not have the required privileges on the source and destination indices, the transform fails when it attempts unauthorized operations.
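As an illustration of the index privileges listed above, the sketch below builds a role body that grants them. It assumes the `indices` / `names` / `privileges` shape used by the Elasticsearch role APIs; `transform_role_body` is a hypothetical helper, the index names come from the request example in this document, and any cluster-level privileges are outside what this section states and are omitted.

```python
import json


def transform_role_body(source_indices, dest_index):
    """Build a role body granting the index privileges this section lists."""
    return {
        "indices": [
            # Source indices: read access plus index metadata.
            {"names": list(source_indices), "privileges": ["read", "view_index_metadata"]},
            # Destination index: the transform creates it and writes to it.
            {"names": [dest_index], "privileges": ["create_index", "index", "read"]},
        ]
    }


role = transform_role_body(
    ["kibana_sample_data_ecommerce"],
    "kibana_sample_data_ecommerce_transform1",
)
print(json.dumps(role, indent=2))
```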

NOTE: You must use Kibana or this API to create a transform. Do not add a transform directly into any .transform-internal* indices using the Elasticsearch index API. If Elasticsearch security features are enabled, do not give users any privileges on .transform-internal* indices. If you used transforms prior to 7.5, also do not give users any privileges on .data-frame-internal* indices.

Path parameters

  • transform_id string Required

    Identifier for the transform. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It has a 64 character limit and must start and end with alphanumeric characters.
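The identifier rules above translate directly into a client-side check. This is a sketch under the stated rules only; `is_valid_transform_id` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
import re

# First and last characters must be alphanumeric; hyphens and underscores
# are allowed only in between. A single alphanumeric character is valid.
_TRANSFORM_ID_RE = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_transform_id(transform_id: str) -> bool:
    """Check the character set, the start/end rule, and the 64 character limit."""
    return len(transform_id) <= 64 and _TRANSFORM_ID_RE.fullmatch(transform_id) is not None


print(is_valid_transform_id("ecommerce-transform_1"))  # prints True
print(is_valid_transform_id("-ecommerce"))             # prints False
```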

Query parameters

  • defer_validation boolean

    When the transform is created, a series of validations occur to ensure its success. For example, there is a check for the existence of the source indices and a check that the destination index is not part of the source index pattern. You can use this parameter to skip the checks, for example when the source index does not exist until after the transform is created. With the exception of privilege checks, however, the validations always run when you start the transform.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
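The query parameters are appended to the request URL. The sketch below shows one way to assemble it; `transform_url` is a hypothetical helper, the host is the placeholder from the curl example in this document, and the validation-skipping parameter is assumed to be named `defer_validation`.

```python
from urllib.parse import quote, urlencode


def transform_url(base_url, transform_id, defer_validation=None, timeout=None):
    """Build the PUT URL, adding only the query parameters that were set."""
    params = {}
    if defer_validation is not None:
        # Booleans are sent as lowercase strings in the query string.
        params["defer_validation"] = "true" if defer_validation else "false"
    if timeout is not None:
        params["timeout"] = timeout
    url = f"{base_url.rstrip('/')}/_transform/{quote(transform_id, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url


print(transform_url("http://api.example.com", "t1", defer_validation=True, timeout="30s"))
# prints http://api.example.com/_transform/t1?defer_validation=true&timeout=30s
```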

application/json

Body Required

  • dest object Required
  • description string

    Free text description of the transform.

  • frequency string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • latest object
    • sort string Required

      Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

    • unique_key array[string] Required

      Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

  • _meta object
  • pivot object
  • retention_policy object
    • time object
      • field string Required

        Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • max_age string Required

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • settings object
    • align_checkpoints boolean

      Specifies whether the transform checkpoint ranges should be optimized for performance. Such optimization can align checkpoint ranges with the date histogram interval when date histogram is specified as a group source in the transform config. As a result, fewer document updates are performed in the destination index, which improves overall performance.

    • dates_as_epoch_millis boolean

      Defines whether dates in the output are written as ISO-formatted strings or as milliseconds since the epoch. epoch_millis was the default for transforms created before version 7.11. For compatible output, set this value to true.

    • deduce_mappings boolean

      Specifies whether the transform should deduce the destination index mappings from the transform configuration.

    • docs_per_second number

      Specifies a limit on the number of input documents per second. This setting throttles the transform by adding a wait time between search requests. The default value is null, which disables throttling.

    • max_page_search_size number

      Defines the initial page size to use for the composite aggregation for each checkpoint. If circuit breaker exceptions occur, the page size is dynamically adjusted to a lower value. The minimum value is 10 and the maximum is 65,536.

    • unattended boolean

      If true, the transform runs in unattended mode. In unattended mode, the transform retries indefinitely in case of an error, which means the transform never fails. Setting the number of retries to anything other than infinite fails validation.

  • source object Required
    • index string | array[string] Required
    • query object
      • bool object
      • boosting object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • negative_boost number Required

          Floating point number between 0 and 1.0 used to decrease the relevance scores of documents matching the negative query.

        • negative object Required
        • positive object Required
      • common object Deprecated
      • combined_fields object
      • constant_score object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • filter object Required
      • dis_max object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • queries array[object] Required

          One or more query clauses. Returned documents must match one or more of these queries. If a document matches multiple queries, Elasticsearch uses the highest relevance score.

        • tie_breaker number

          Floating point number between 0 and 1.0 used to increase the relevance scores of documents matching multiple query clauses.

      • distance_feature object

      • exists object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • field string Required

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

      • function_score object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • boost_mode string

          Values are multiply, replace, sum, avg, max, or min.

        • functions array[object]

          One or more functions that compute a new score for each document returned by the query.

        • max_boost number

          Restricts the new score to not exceed the provided limit.

        • min_score number

          Excludes documents that do not meet the provided score threshold.

        • query object
        • score_mode string

          Values are multiply, sum, avg, first, max, or min.

      • fuzzy object

        Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.

      • geo_bounding_box object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • type string

          Values are memory or indexed.

        • validation_method string

          Values are coerce, ignore_malformed, or strict.

        • ignore_unmapped boolean

          Set to true to ignore an unmapped field and not match any documents for this query. Set to false to throw an exception if the field is not mapped.

      • geo_distance object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • distance string Required
        • distance_type string

          Values are arc or plane.

        • validation_method string

          Values are coerce, ignore_malformed, or strict.

        • ignore_unmapped boolean

          Set to true to ignore an unmapped field and not match any documents for this query. Set to false to throw an exception if the field is not mapped.

      • geo_polygon object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • validation_method string

          Values are coerce, ignore_malformed, or strict.

      • geo_shape object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • ignore_unmapped boolean

          Set to true to ignore an unmapped field and not match any documents for this query. Set to false to throw an exception if the field is not mapped.

      • has_child object
      • has_parent object
      • ids object
      • intervals object

        Returns documents based on the order and proximity of matching terms.

      • knn object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • field string Required

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • query_vector array[number]
        • query_vector_builder object
        • num_candidates number

          The number of nearest neighbor candidates to consider per shard

        • k number

          The final number of nearest neighbors to return as top hits

        • filter object | array[object]

          Filters for the kNN search query

        • similarity number

          The minimum similarity for a vector to be considered a match

      • match object

        Returns documents that match a provided text, number, date or boolean value. The provided text is analyzed before matching.

      • match_all object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
      • match_bool_prefix object

        Analyzes its input and constructs a bool query from the terms. Each term except the last is used in a term query. The last term is used in a prefix query.

      • match_none object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
      • match_phrase object

        Analyzes the text and creates a phrase query out of the analyzed text.

      • match_phrase_prefix object

        Returns documents that contain the words of a provided text, in the same order as provided. The last term of the provided text is treated as a prefix, matching any words that begin with that term.

      • more_like_this object
      • multi_match object
      • nested object
      • parent_id object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • id string
        • ignore_unmapped boolean

          Indicates whether to ignore an unmapped type and not return any documents instead of an error.

        • type string
      • percolate object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • document object

          The source of the document being percolated.

        • documents array[object]

          An array of sources of the documents being percolated.

        • field string Required

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • id string
        • index string
        • name string

          The suffix used for the _percolator_document_slot field when multiple percolate queries are specified.

        • preference string

          Preference used to fetch document to percolate.

        • routing string
        • version number
      • pinned object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • organic object Required
        • ids array[string]

          Document IDs listed in the order they are to appear in results. Required if docs is not specified.

        • docs array[object]

          Documents listed in the order they are to appear in results. Required if ids is not specified.

      • prefix object

        Returns documents that contain a specific prefix in a provided field.

      • query_string object
      • range object

        Returns documents that contain terms within a provided range.

      • rank_feature object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • field string Required

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • saturation object
          • pivot number

            Configurable pivot value so that the result will be less than 0.5.

        • log object
        • linear object
        • sigmoid object
          • pivot number Required

            Configurable pivot value so that the result will be less than 0.5.

          • exponent number Required

            Configurable Exponent.

      • regexp object

        Returns documents that contain terms matching a regular expression.

      • rule object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • organic object Required
        • ruleset_ids array[string] Required
        • match_criteria object Required
      • script object
      • script_score object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • min_score number

          Documents with a score lower than this floating point number are excluded from the search results.

        • query object Required
        • script object Required
      • semantic object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • field string Required

          The field to query, which must be a semantic_text field type

        • query string Required

          The query text

      • shape object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • ignore_unmapped boolean

          When set to true, the query ignores an unmapped field and will not match any documents.

      • simple_query_string object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • analyzer string

          Analyzer used to convert text in the query string into tokens.

        • analyze_wildcard boolean

          If true, the query attempts to analyze wildcard terms in the query string.

        • auto_generate_synonyms_phrase_query boolean

          If true, the parser creates a match_phrase query for each multi-position token.

        • default_operator string

          Values are and, AND, or, or OR.

        • fields array[string]

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • flags string

          Query flags can be either a single flag or a combination of flags, for example OR|AND|PREFIX.

          Values are NONE, AND, NOT, OR, PREFIX, PHRASE, PRECEDENCE, ESCAPE, WHITESPACE, FUZZY, NEAR, SLOP, or ALL.

        • fuzzy_max_expansions number

          Maximum number of terms to which the query expands for fuzzy matching.

        • fuzzy_prefix_length number

          Number of beginning characters left unchanged for fuzzy matching.

        • fuzzy_transpositions boolean

          If true, edits for fuzzy matching include transpositions of two adjacent characters (for example, ab to ba).

        • lenient boolean

          If true, format-based errors, such as providing a text value for a numeric field, are ignored.

        • minimum_should_match number | string

          The minimum number of terms that should match as integer, percentage or range

        • query string Required

          Query string in the simple query string syntax you wish to parse and use for search.

        • quote_field_suffix string

          Suffix appended to quoted text in the query string.

      • span_containing object
      • span_field_masking object
      • span_first object
      • span_multi object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • match object Required
      • span_near object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • clauses array[object] Required

          Array of one or more other span type queries.

        • in_order boolean

          Controls whether matches are required to be in-order.

        • slop number

          Controls the maximum number of intervening unmatched positions permitted.

      • span_not object
      • span_or object
      • span_term object

        Matches spans containing a term.

      • span_within object
      • sparse_vector object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • field string Required

          Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.

        • query string

          The query text you want to use for search. If inference_id is specified, query must also be specified.

        • prune boolean

          Whether to perform pruning, omitting the non-significant tokens from the query to improve query performance. If prune is true but the pruning_config is not specified, pruning will occur but default values will be used. Default: false

        • pruning_config object
          • tokens_freq_ratio_threshold number

            Tokens whose frequency is more than this threshold times the average frequency of all tokens in the specified field are considered outliers and pruned.

          • tokens_weight_threshold number

            Tokens whose weight is less than this threshold are considered nonsignificant and pruned.

          • only_score_pruned_tokens boolean

            Whether to only score pruned tokens, vs only scoring kept tokens.

        • query_vector object

          Dictionary of precomputed sparse vectors and their associated weights. Only one of inference_id or query_vector may be supplied in a request.
      • term object

        Returns documents that contain an exact term in a provided field. To return a document, the query term must exactly match the queried field's value, including whitespace and capitalization.

      • terms object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
      • terms_set object

        Returns documents that contain a minimum number of exact terms in a provided field. To return a document, a required number of terms must exactly match the field values, including whitespace and capitalization.

      • text_expansion object Deprecated

        Uses a natural language processing model to convert the query text into a list of token-weight pairs which are then used in a query against a sparse vector or rank features field.

      • weighted_tokens object Deprecated

        Supports returning text_expansion query results by sending in precomputed tokens with the query.

      • wildcard object

        Returns documents that contain terms matching a wildcard pattern.

      • wrapper object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • query string Required

          A base64 encoded query. The binary data format can be any of JSON, YAML, CBOR or SMILE encodings

      • type object
        • boost number

          Floating point number used to decrease or increase the relevance scores of the query. Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

        • _name string
        • value string Required
    • runtime_mappings object
  • sync object
    • time object
      • delay string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • field string Required

        Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
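The duration format described for delay and max_age can be checked on the client before the request is sent. The sketch below is illustrative only: `is_duration` and `duration_problems` are hypothetical helpers, whole-number values are assumed, and `frequency` is the field name used in the request example in this document.

```python
import re

# A whole number followed by one of the unit suffixes listed in the
# duration description.
_DURATION_RE = re.compile(r"\d+(?:nanos|micros|ms|s|m|h|d)")


def is_duration(value) -> bool:
    """Return True for "0", "-1", or a number followed by a listed unit."""
    if not isinstance(value, str):
        return False
    return value in ("0", "-1") or _DURATION_RE.fullmatch(value) is not None


def duration_problems(body: dict) -> list:
    """Return the dotted paths of duration fields whose values look malformed."""
    candidates = {
        "frequency": body.get("frequency"),
        "sync.time.delay": body.get("sync", {}).get("time", {}).get("delay"),
        "retention_policy.time.max_age": body.get("retention_policy", {}).get("time", {}).get("max_age"),
    }
    return [path for path, value in candidates.items()
            if value is not None and not is_duration(value)]


body = {"frequency": "5m", "sync": {"time": {"delay": "60s", "field": "order_date"}}}
print(duration_problems(body))  # prints []
```

Fields that are absent are skipped rather than reported, since only the Required markers in the list above say which ones must be present.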

Responses

  • 200 application/json
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.
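A client can confirm success by checking the acknowledged field of the response body. This is a minimal sketch; `is_acknowledged` is a hypothetical helper and the shape of an error payload is not specified in this section, so anything without `"acknowledged": true` is simply treated as not acknowledged.

```python
import json


def is_acknowledged(response_text: str) -> bool:
    """Return True only if the JSON body carries "acknowledged": true."""
    payload = json.loads(response_text)
    return isinstance(payload, dict) and payload.get("acknowledged") is True


print(is_acknowledged('{"acknowledged": true}'))  # prints True
```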

Request examples
curl \
 -X PUT http://api.example.com/_transform/{transform_id} \
 -H "Content-Type: application/json" \
 -d '{"dest":{"index":"kibana_sample_data_ecommerce_transform1","pipeline":"add_timestamp_pipeline"},"sync":{"time":{"delay":"60s","field":"order_date"}},"pivot":{"group_by":{"customer_id":{"terms":{"field":"customer_id","missing_bucket":true}}},"aggregations":{"max_price":{"max":{"field":"taxful_total_price"}}}},"source":{"index":"kibana_sample_data_ecommerce","query":{"term":{"geoip.continent_name":{"value":"Asia"}}}},"frequency":"5m","description":"Maximum priced ecommerce data by customer_id in Asia","retention_policy":{"time":{"field":"order_date","max_age":"30d"}}}'
Request body (pivot transform)
{
  "dest": {
    "index": "kibana_sample_data_ecommerce_transform1",
    "pipeline": "add_timestamp_pipeline"
  },
  "sync": {
    "time": {
      "delay": "60s",
      "field": "order_date"
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id",
          "missing_bucket": true
        }
      }
    },
    "aggregations": {
      "max_price": {
        "max": {
          "field": "taxful_total_price"
        }
      }
    }
  },
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "term": {
        "geoip.continent_name": {
          "value": "Asia"
        }
      }
    }
  },
  "frequency": "5m",
  "description": "Maximum priced ecommerce data by customer_id in Asia",
  "retention_policy": {
    "time": {
      "field": "order_date",
      "max_age": "30d"
    }
  }
}
Request body (latest transform)
{
  "dest": {
    "index": "kibana_sample_data_ecommerce_transform2"
  },
  "sync": {
    "time": {
      "delay": "60s",
      "field": "order_date"
    }
  },
  "latest": {
    "sort": "order_date",
    "unique_key": [
      "customer_id"
    ]
  },
  "source": {
    "index": "kibana_sample_data_ecommerce"
  },
  "frequency": "5m",
  "description": "Latest order for each customer"
}
Response examples (200)
{
  "acknowledged": true
}
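For readers working from Python rather than curl, the latest-transform request can be assembled with the standard library alone. This is a sketch under assumptions: `build_put_transform` is a hypothetical helper, `ecommerce-latest` is a made-up transform ID, and the host is the placeholder from the curl example. The request is built but not sent.

```python
import json
from urllib.request import Request


def build_put_transform(base_url, transform_id, body):
    """Build (but do not send) a PUT request that creates a transform."""
    return Request(
        url=f"{base_url.rstrip('/')}/_transform/{transform_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


latest_body = {
    "dest": {"index": "kibana_sample_data_ecommerce_transform2"},
    "sync": {"time": {"delay": "60s", "field": "order_date"}},
    "latest": {"sort": "order_date", "unique_key": ["customer_id"]},
    "source": {"index": "kibana_sample_data_ecommerce"},
    "frequency": "5m",
    "description": "Latest order for each customer",
}

request = build_put_transform("http://api.example.com", "ecommerce-latest", latest_body)
print(request.get_method(), request.full_url)
# prints PUT http://api.example.com/_transform/ecommerce-latest
```

Passing the request to `urllib.request.urlopen` would send it; against a secured cluster, authentication headers would also be needed, which this sketch leaves out.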