Dumping an entire Elasticsearch index to csv with eland
01 Dec 2023
With the 8.11.0 release of the Python package eland, it got a lot easier to dump an Elasticsearch index to csv.
All the code you need is this:
import eland
eland.DataFrame('http://localhost:9200', 'your-index').to_csv('your-csv.csv')
Before the 8.11.0 release, this method already existed in eland, but it needed as much memory as the size of your dataset.
Now it streams the data batch-wise into a csv.
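To give a feel for what batch-wise streaming means, here is a minimal sketch using only the Python standard library. It is not eland's actual implementation: the helper names `iter_batches` and `stream_to_csv` are made up for illustration, and the rows are assumed to arrive as an iterable of flat dictionaries. The point is only that the header is written once and that at most one batch is held in memory at a time.

```python
import csv
from typing import Dict, Iterable, Iterator, List, Sequence


def iter_batches(rows: Iterable, batch_size: int) -> Iterator[List]:
    """Yield lists of at most batch_size items from rows.

    Hypothetical helper: only the current batch is kept in memory.
    """
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def stream_to_csv(
    rows: Iterable[Dict[str, object]],
    path: str,
    fieldnames: Sequence[str],
    batch_size: int = 1000,
) -> int:
    """Write rows to a csv file batch by batch and return the row count.

    Hypothetical helper: the header is written once, then each batch is
    appended to the open file before the next batch is fetched.
    """
    written = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for batch in iter_batches(rows, batch_size):
            writer.writerows(batch)
            written += len(batch)
    return written
```

If `rows` is a generator (for example one that pages through search results), memory use stays roughly proportional to `batch_size` instead of to the size of the whole dataset.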
Most of the code to achieve this was written by @V1NAY8.
I only did some minor edits to get it to pass the pull request review.
While testing this, eland successfully dumped hundreds of gigabytes of data without any issues,
all without having to bother with scroll requests myself.