IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
Exported fields
The fields that can be extracted from a document are:
- content
- title
- author
- keywords
- date
- content_type
- content_length
- language
- modified
- format
- identifier
- contributor
- coverage
- modifier
- creator_tool
- publisher
- relation
- rights
- source
- type
- description
- print_date
- metadata_date
- latitude
- longitude
- altitude
- rating
- comments
To extract only certain attachment fields, specify the properties array:
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "properties": [ "content", "title" ]
      }
    }
  ]
}
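When the pipeline definition is generated from client code, it can help to check the requested properties against the field list above before sending the request. The following is a minimal sketch in Python, not part of the plugin: the helper name `build_attachment_pipeline` and the constant `EXPORTED_FIELDS` are hypothetical, and the sketch only builds the request body for `PUT _ingest/pipeline/attachment` without contacting a cluster.

```python
import json

# Field names copied from the "Exported fields" list above.
EXPORTED_FIELDS = {
    "content", "title", "author", "keywords", "date", "content_type",
    "content_length", "language", "modified", "format", "identifier",
    "contributor", "coverage", "modifier", "creator_tool", "publisher",
    "relation", "rights", "source", "type", "description", "print_date",
    "metadata_date", "latitude", "longitude", "altitude", "rating",
    "comments",
}


def build_attachment_pipeline(properties, field="data"):
    """Build a pipeline body limited to the given properties.

    Hypothetical helper: it rejects names that are not in the
    documented field list, so a typo fails before the request is sent.
    """
    unknown = [name for name in properties if name not in EXPORTED_FIELDS]
    if unknown:
        raise ValueError(f"unknown attachment properties: {unknown}")
    return {
        "description": "Extract attachment information",
        "processors": [
            {"attachment": {"field": field, "properties": list(properties)}}
        ],
    }


if __name__ == "__main__":
    # Print the JSON body to send with PUT _ingest/pipeline/attachment.
    body = build_attachment_pipeline(["content", "title"])
    print(json.dumps(body, indent=2))
```

Validating on the client side is a design choice of this sketch; whether the cluster itself rejects an unknown property name depends on the plugin version.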
Extracting contents from binary data is a resource-intensive operation. It is highly recommended to run pipelines that use this processor on a dedicated ingest node.