benwtrent opened a new pull request, #12582: URL: https://github.com/apache/lucene/pull/12582
As with most codec changes, this is an eye-popping number of LoC, and the design isn't finished yet. I am opening this as a draft to be open about the work and to discuss further direction.

In initial benchmarking (using non-normalized Cohere embeddings + max-inner product, which is a particularly difficult case for naive quantization), I get 10-20% faster search, 2x faster index building, and ~4x smaller storage used for the search (I am keeping the raw vectors around...we can debate if we want to do that).

Recall@10 with 100 fanout = 0.804
Recall@100 with 200 fanout = 0.9

I am reaching the point where the design needs to be finalized, and I wanted to reach out for feedback. Some design discussion points that I am unsure about:

- Do we want to have a new "flat" vector codec that HNSW (or other complicated vector indexing methods) can use? The drawback here is that the HNSW codec would then rely on another pluggable thing, a "flat" vector index (which just provides mechanisms for reading, writing, and merging vectors in a flat index).
- Should "quantization" just be a thing that is provided to vector codecs? The main consideration here is that future scalar quantization (like int4 or even binary) could easily be added.
- Should the "quantizer" keep the raw vectors around itself? Or rely on some external party to provide them (in this case, I am relying on the HNSW codec)?

Again, this is a draft; I have a ton of comments to fix up, etc. But I wanted early feedback on the design and on what we want to integrate into Lucene.

As a side note, it really seems some of these classes (OffHeap...vectorReader...) should be common between all the vector codecs instead of copied around. It's a ton of code that gets copied with almost no change between codecs :/
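For readers unfamiliar with the idea, here is a minimal sketch of scalar quantization from float32 to a single byte per dimension, which is where a storage reduction of roughly 4x comes from. It is an illustration only, not the code in this PR: the class name `ScalarQuantizerSketch` and its methods are hypothetical, and using the plain global minimum and maximum as the quantization range is an assumption made for simplicity (an actual implementation may choose the bounds differently).

```java
final class ScalarQuantizerSketch {
  private final float min;
  private final float scale; // quantized steps per unit of float range

  // Hypothetical constructor: [min, max] is the float range mapped onto 0..127.
  ScalarQuantizerSketch(float min, float max) {
    if (!(max > min)) {
      throw new IllegalArgumentException("max must be greater than min");
    }
    this.min = min;
    this.scale = 127f / (max - min);
  }

  // Assumption: derive the range from the global min/max of all components.
  static ScalarQuantizerSketch fromVectors(float[][] vectors) {
    float lo = Float.POSITIVE_INFINITY;
    float hi = Float.NEGATIVE_INFINITY;
    for (float[] vector : vectors) {
      for (float component : vector) {
        lo = Math.min(lo, component);
        hi = Math.max(hi, component);
      }
    }
    return new ScalarQuantizerSketch(lo, hi);
  }

  // Clamp each component into the range, then round to the nearest of 128 levels.
  byte[] quantize(float[] vector) {
    byte[] out = new byte[vector.length];
    for (int i = 0; i < vector.length; i++) {
      float clamped = Math.max(min, Math.min(min + 127f / scale, vector[i]));
      out[i] = (byte) Math.round((clamped - min) * scale);
    }
    return out;
  }

  // Approximate inverse: map each level back to a float in the original range.
  float[] dequantize(byte[] quantized) {
    float[] out = new float[quantized.length];
    for (int i = 0; i < quantized.length; i++) {
      out[i] = min + quantized[i] / scale;
    }
    return out;
  }

  public static void main(String[] args) {
    float[][] vectors = {{-1.0f, 0.25f, 1.0f}, {0.5f, -0.5f, 0.0f}};
    ScalarQuantizerSketch quantizer = fromVectors(vectors);
    byte[] q = quantizer.quantize(vectors[0]);
    System.out.println(java.util.Arrays.toString(q));
    System.out.println(java.util.Arrays.toString(quantizer.dequantize(q)));
  }
}
```

In this sketch the rounding error per in-range component is at most half a quantization step, (max - min) / 254, and values outside the range are clamped. Whether the raw float vectors are kept alongside the quantized bytes is exactly the open design question raised above.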