vigyasharma commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2466073115
I tried to find some blogs and benchmarks on other library implementations. Astra DB, Vespa, Faiss, and nmslib all seem to support multi-vectors in some form. From what I can tell, Astra DB and Vespa have ColBERT-style multi-vector support in ANN [[1]](https://docs.datastax.com/en/ragstack/examples/colbert.html) [[2]](https://blog.vespa.ai/semantic-search-with-multi-vector-indexing/). Benchmarks indicate that ColBERT outperforms other techniques in quality, but full ColBERT on ANN has higher latency [[3]](https://thenewstack.io/overcoming-the-limits-of-rag-with-colbert/). For large-scale applications, users seem to overquery on ANN with single-vector representations and then rerank the results with ColBERT token vectors [[4]](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/). However, there is also ongoing research on reducing the number of embeddings in ColBERT, such as PLAID, which replaces a bunch of vectors with their centroids [[5]](https://arxiv.org/abs/2205.09707).

...

> I am honestly torn around whats the best path forward for the majority of users in Lucene.

I hear you! And I don't want to add complexity only because we have some body of work in this PR. Thanks for raising the concern, Jim; it led me to some interesting reading.

...

My current thinking is that this is a rapidly evolving field, and it's too early to lean one way or another. Adding this support unlocks experimentation. We might add different, scalable ANN algorithms going forward, and our flat storage format should work with most of them. Meanwhile, there's research on different ways to run late interaction with multiple but fewer vectors. This change will help users experiment with what works at their scale, for their cost/performance/quality requirements.

I'm happy to change my perspective, and would like to hear more opinions. One reason to not add this would be if it makes the single vector setup hard to evolve.
I'd like to understand if (and how) this is happening, and think about how we can address those concerns.

...

1: https://docs.datastax.com/en/ragstack/examples/colbert.html
2: https://blog.vespa.ai/semantic-search-with-multi-vector-indexing/
3: https://thenewstack.io/overcoming-the-limits-of-rag-with-colbert/
4: https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/
5: PLAID - https://arxiv.org/abs/2205.09707

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org