Tuning Elasticsearch for performant vector search

I noticed that Elasticsearch was a bit slow when using cosineSimilarity on a dense_vector field. Before thinking about it at all, I had rather naively put all my data into only four indices. The largest of these were easily over 200 GB each, mostly occupied by 512-dimensional float vectors that were already indexed for cosine similarity.
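
For reference, the setup looked roughly like the sketch below. The index and field names (my-index, embedding) are made up for illustration, and the query vector is truncated; in practice it has to contain the full 512 dimensions.

```sh
# Create an index with a 512-dimensional dense_vector field.
curl -s -X PUT -H 'Content-Type: application/json' \
  'localhost:9200/my-index' --data-binary @- <<'EOF'
{
  "mappings": {
    "properties": {
      "embedding": { "type": "dense_vector", "dims": 512 }
    }
  }
}
EOF

# Score documents by cosine similarity to a query vector.
# The "+ 1.0" keeps scores non-negative, as Elasticsearch requires.
curl -s -H 'Content-Type: application/json' \
  'localhost:9200/my-index/_search' --data-binary @- <<'EOF'
{
  "query": {
    "script_score": {
      "query": { "match_all": {} },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
        "params": { "query_vector": [0.12, -0.56, 0.33] }
      }
    }
  }
}
EOF
```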

So what made the queries slow? In the background Elasticsearch runs on Lucene, which uses one thread per shard for a query. By default, Elasticsearch creates only one shard per index (this used to be five per index before Elasticsearch 7). In my case that meant my 128-core Elasticsearch cluster was using only four threads for a search!
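
You can check this shard layout yourself; something along these lines (the index name is again made up) lists each shard of an index and the node it is allocated to:

```sh
# List the shards of one index and where they live.
# With default settings there is a single primary shard ("p" in the
# prirep column), so a search of this index runs on a single thread.
curl -s 'localhost:9200/_cat/shards/my-index?v'
```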

Therefore, one of the simplest ways to use more shards (and therefore more cores) is to increase the number of indices. At query time this doesn’t have to be a problem, since you can use index patterns to search over multiple indices at once.
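
As a sketch, assuming the data is split over indices named my-index-000 up to my-index-127, a single request with a wildcard pattern searches all of them at once (the same script_score query as above works unchanged here):

```sh
# Search every index matching the pattern in one request.
# Each matching index contributes its own shard, so more of the
# cluster's cores take part in the same search.
curl -s -H 'Content-Type: application/json' \
  'localhost:9200/my-index-*/_search' --data-binary @- <<'EOF'
{
  "size": 10,
  "query": { "match_all": {} }
}
EOF
```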

However, at some point adding more indices will stop helping. The results from each of the indices need to be merged by Elasticsearch, which will add overhead.

As a simple rule of thumb: take the number of cores in your cluster and split your data into that many indices, while keeping each index somewhere between 1 and 10 GB. This maximizes the number of cores used for a search, while keeping the overhead of merging the results relatively small.
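
To apply the rule you need to know how many cores the cluster actually has; one way to get at that is through the nodes info API (filter_path just trims the response). As a purely hypothetical example, 128 cores and roughly 600 GB of data would work out to 128 indices of about 5 GB each.

```sh
# Show how many processors each node has allocated to Elasticsearch;
# summing these gives the number of indices to aim for.
curl -s 'localhost:9200/_nodes/os?filter_path=nodes.*.os.allocated_processors&pretty'
```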

If your indices end up a lot smaller than 1 GB, you probably don’t need as many indices as cores. If your indices are still a lot larger than 10 GB, and your queries are not quick enough, you might want to increase the core count of your Elasticsearch cluster.

You can estimate the size of your indices in advance by inserting a representative sample of documents. Check the size of that index on disk (with curl localhost:9200/_cat/indices for example), divide it by the number of documents in the sample, and multiply by the total number of documents you want to index. This gives a good idea of the size of the index you will end up with.
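
As a rough sketch of that calculation (the index name and all numbers here are made up):

```sh
# Store size and document count of the sample index, in raw bytes.
curl -s 'localhost:9200/_cat/indices/my-index-sample?v&h=index,docs.count,store.size&bytes=b'

# Extrapolate: bytes per sampled document times the total number of
# documents you plan to index. All numbers are purely illustrative.
awk 'BEGIN {
  sample_bytes = 1500000000;  # store.size of the sample index
  sample_docs  = 500000;      # docs.count of the sample index
  total_docs   = 40000000;    # documents you plan to index in total
  printf "projected size: %.1f GB\n", sample_bytes / sample_docs * total_docs / 1e9
}'
```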

To summarize: while the general Elasticsearch guidance is to aim for shard sizes between 10 and 50 GB, I found that the performance of vector search in particular was better when the indices (each holding a single shard) were between 1 and 10 GB.