Troubleshooting CSV reports
We recommend using CSV reports to export moderate amounts of data only. The feature enables analysis of data in external tools, but it is not intended for bulk export or to back up Elasticsearch data. Report timeouts and incomplete data are likely if you are exporting data where:
- More than 250 MB of data is being exported
- Data is stored on slow storage tiers
- Any shard needed for the search is unavailable
- Network latency between nodes is high
- Cross-cluster search is used
- ES|QL is used and the result row count exceeds the limits of ES|QL queries
To work around the limitations, use filters to create multiple smaller reports, or extract the data you need directly with the Elasticsearch APIs.
For more information on using Elasticsearch APIs directly, see the Scroll API, the Point in time API, ES|QL, or SQL with the CSV response data format. We recommend that you use an official Elastic language client; details for each programming language library that Elastic provides are in the Elasticsearch Client documentation.
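For example, the following minimal Python sketch pulls data through the SQL API with the CSV response format. The endpoint, credentials, index name, and query are placeholders to adapt to your environment:

import requests

ES_URL = "https://localhost:9200"  # placeholder: your Elasticsearch endpoint

# Ask the Elasticsearch SQL API to return the result set as CSV.
# The default fetch size is 1,000 rows, so LIMIT 1000 fits in a single
# response; larger result sets are paged via the Cursor response header.
response = requests.post(
    f"{ES_URL}/_sql",
    params={"format": "csv"},
    json={"query": 'SELECT "@timestamp", message FROM "my-index" ORDER BY "@timestamp" LIMIT 1000'},
    auth=("elastic", "changeme"),  # placeholder credentials
)
response.raise_for_status()

with open("export.csv", "w", encoding="utf-8") as f:
    f.write(response.text)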
Reporting parameters can be adjusted to work around some of these limiting scenarios. Results depend on data size, availability, and latency factors, and are not guaranteed.
The CSV export feature in Kibana makes queries to Elasticsearch and formats the results into CSV. The feature is a general-purpose solution that attempts to serve the widest range of use cases, but things can still go wrong during export: Elasticsearch can stop responding, repeated querying can take so long that authentication tokens time out, or the format of the exported data can be too complex for spreadsheet applications to handle. Such situations are outside of Kibana's control. If your use case becomes complex enough, we recommend creating scripts that query Elasticsearch directly, using a scripting language like Python and the public Elasticsearch APIs.
For advice about common problems, refer to Troubleshooting.
Configuring CSV export to use the scroll API
The Kibana CSV export feature collects all of the data from Elasticsearch by using multiple requests to page over all of the documents. Internally, the feature uses the Point in time API and search_after parameters in the queries to do so.
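As an illustration of this pattern outside of Kibana, here is a sketch that pages through an index with a point in time and search_after using the official Python client. The endpoint, credentials, index name, and sort field are placeholders:

from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", basic_auth=("elastic", "changeme"))  # placeholders

# Open a point in time so paging sees a consistent view of the index.
pit_id = es.open_point_in_time(index="my-index", keep_alive="2m")["id"]

search_after = None
while True:
    page = es.search(
        pit={"id": pit_id, "keep_alive": "2m"},
        size=1000,
        # _shard_doc is a built-in tiebreaker that makes the sort order total.
        sort=[{"@timestamp": "asc"}, {"_shard_doc": "asc"}],
        **({"search_after": search_after} if search_after else {}),
    )
    hits = page["hits"]["hits"]
    if not hits:
        break
    for hit in hits:
        print(hit["_source"])  # write a CSV row here instead
    search_after = hits[-1]["sort"]  # resume after the last document of this page

es.close_point_in_time(id=pit_id)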
There are some limitations related to the point in time API:
- Permissions to read data aliases alone will not work: the permissions are needed on the underlying indices or data streams.
- In cases where data shards are unavailable or time out, the export will be empty rather than returning partial data.
Some users may benefit from using the scroll API, an alternative way to page through the data. This API does not have the limitations of the point in time API; however, it has its own limitations:
- A search is limited to at most 500 shards.
- In cases where the data shards are unavailable or time out, the export may return partial data.
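For reference, this is roughly what scroll-based paging looks like when you call the API yourself with the official Python client (connection details are placeholders):

from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", basic_auth=("elastic", "changeme"))  # placeholders

# Start a scroll; each request keeps the search context alive for 2 minutes.
page = es.search(index="my-index", scroll="2m", size=1000, query={"match_all": {}})
scroll_id = page["_scroll_id"]

while page["hits"]["hits"]:
    for hit in page["hits"]["hits"]:
        print(hit["_source"])  # write a CSV row here instead
    # Fetch the next page and refresh the keep-alive.
    page = es.scroll(scroll_id=scroll_id, scroll="2m")
    scroll_id = page["_scroll_id"]

es.clear_scroll(scroll_id=scroll_id)  # free the search context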
If you prefer the internal implementation of CSV export to use the scroll API, you can configure this in kibana.yml:
xpack.reporting.csv.scroll.strategy: scroll
For more details about CSV export settings, go to CSV settings.
Socket hangups
editA "socket hangup" is a generic type of error meaning that a remote service (in this case Elasticsearch or a proxy in Cloud) closed the connection. Kibana can’t foresee when this might happen and can’t force the remote service to keep the connection open. To work around this situation, consider lowering the size of results that come back in each request or increase the amount of time the remote services will allow to keep the request open. For example:
xpack.reporting.csv.scroll.size: 50
xpack.reporting.csv.scroll.duration: 2m
Such changes aren't guaranteed to solve the issue, but they give the functionality a better chance of working in this use case. Unfortunately, lowering the scroll size requires more requests to Elasticsearch during export, which adds time overhead and could unintentionally create more instances of auth token expiration errors.
Token expiration
To avoid token expirations, use a type of authentication that doesn't expire (such as Basic auth), or run the export using scripts that query Elasticsearch directly. In a custom script, you can refresh the auth token as needed, for example once before each query.
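A minimal sketch of that pattern with the official Python client, assuming a hypothetical get_fresh_token() helper that obtains a new bearer token from your identity provider:

from elasticsearch import Elasticsearch

def get_fresh_token() -> str:
    # Hypothetical helper: fetch a new bearer token from your identity provider.
    raise NotImplementedError

es = Elasticsearch("https://localhost:9200")  # placeholder endpoint

search_after = None
while True:
    # Re-authenticate before each page so a long-running export never
    # outlives a single token's lifetime.
    page = es.options(bearer_auth=get_fresh_token()).search(
        index="my-index",
        size=1000,
        sort=[{"@timestamp": "asc"}],  # add a tiebreaker field for strict ordering in real use
        **({"search_after": search_after} if search_after else {}),
    )
    hits = page["hits"]["hits"]
    if not hits:
        break
    search_after = hits[-1]["sort"]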