IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
Limit the number of extracted chars
To prevent extracting too many chars and overloading the node memory, the number of chars used for extraction is limited by default to 100000. You can change this value by setting indexed_chars. Use -1 for no limit, but when you do, make sure your node has enough heap to extract the content of very big documents.
You can also define this limit per document by reading it from a given field of the document. If the document has that field, its value overrides the indexed_chars setting. To name the field to read, define the indexed_chars_field setting.
For example:
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "indexed_chars" : 11,
        "indexed_chars_field" : "max_size"
      }
    }
  ]
}

PUT my_index/_doc/my_id?pipeline=attachment
{
  "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}

GET my_index/_doc/my_id
Returns this:
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 35,
  "_primary_term": 1,
  "_source": {
    "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
    "attachment": {
      "content_type": "application/rtf",
      "language": "sl",
      "content": "Lorem ipsum",
      "content_length": 11
    }
  }
}
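The data value in the request above is the base64 encoding of a small RTF document, which is the form the attachment processor expects for binary content. As a minimal sketch using only the Python standard library, the snippet below decodes that value to show the raw bytes and re-encodes them. The helper name encode_attachment is hypothetical and used here only for illustration.

```python
import base64

# The "data" value used in the example request.
encoded = "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="

# Decode it to see the raw RTF bytes that the processor parses.
raw = base64.b64decode(encoded)
print(raw)  # b'{\\rtf1\\ansi\r\nLorem ipsum dolor sit amet\r\n\\par }'


def encode_attachment(raw_bytes: bytes) -> str:
    """Hypothetical helper: base64-encode file bytes for the "data" field."""
    return base64.b64encode(raw_bytes).decode("ascii")


# Round trip: re-encoding the bytes gives back the original string.
print(encode_attachment(raw) == encoded)  # True
```

The full text of the document is "Lorem ipsum dolor sit amet", so a limit of 11 chars keeps only "Lorem ipsum".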
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "indexed_chars" : 11,
        "indexed_chars_field" : "max_size"
      }
    }
  ]
}

PUT my_index/_doc/my_id_2?pipeline=attachment
{
  "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
  "max_size": 5
}

GET my_index/_doc/my_id_2
Returns this:
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id_2",
  "_version": 1,
  "_seq_no": 40,
  "_primary_term": 1,
  "_source": {
    "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
    "max_size": 5,
    "attachment": {
      "content_type": "application/rtf",
      "language": "ro",
      "content": "Lorem",
      "content_length": 5
    }
  }
}
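The two responses above follow a simple precedence rule: a per-document field named by indexed_chars_field, when present, takes priority over indexed_chars, and -1 means no limit. The sketch below models that rule in plain Python. It is an illustration of the documented behavior, not the plugin's actual implementation, and the function names effective_limit and truncate are hypothetical.

```python
from typing import Any, Mapping, Optional

DEFAULT_INDEXED_CHARS = 100000  # default limit stated in this section


def effective_limit(
    doc: Mapping[str, Any],
    indexed_chars: int = DEFAULT_INDEXED_CHARS,
    indexed_chars_field: Optional[str] = None,
) -> int:
    """Return the char limit that applies to one document (-1 = no limit)."""
    # A per-document field, if configured and present, overrides the setting.
    if indexed_chars_field is not None and indexed_chars_field in doc:
        return int(doc[indexed_chars_field])
    return indexed_chars


def truncate(text: str, limit: int) -> str:
    """Keep at most `limit` chars; a negative limit keeps everything."""
    return text if limit < 0 else text[:limit]


# Plain text of the example document (assumed equal to what gets extracted).
text = "Lorem ipsum dolor sit amet"

# First document: no "max_size" field, so indexed_chars (11) applies.
first = truncate(text, effective_limit({}, 11, "max_size"))

# Second document: "max_size": 5 overrides indexed_chars.
second = truncate(text, effective_limit({"max_size": 5}, 11, "max_size"))

print(first)   # Lorem ipsum
print(second)  # Lorem
```

With indexed_chars set to -1 and no per-document field, truncate returns the full text, which is why the node needs enough heap for very big documents in that configuration.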