kevindrosendahl commented on issue #12615: URL: https://github.com/apache/lucene/issues/12615#issuecomment-1793186041
Hey @benwtrent and all, just wanted to let you know that I'm experimenting with different index structures for larger-than-memory indexes. I have a working implementation of Vamana based on the existing HNSW implementation (with vectors colocated with the adjacency lists in storage) in [this branch](https://github.com/kevindrosendahl/lucene/tree/vamana2), and I'm currently working on integrating scalar quantization into it. There is also a benchmarking framework [here](https://github.com/kevindrosendahl/java-ann-bench) which can run various Lucene and JVector configurations.

I don't have many numbers to share yet, besides the fact that the Vamana graph implementation in a single segment seems competitive, while in memory, with Lucene HNSW (single segment) and JVector on small data sets. For glove-100-angular from ann-benchmarks (k=10):

```
lucene_hnsw_maxConn:16-beamWidth:100_numCandidates:150
average recall 0.8227400000000001
average duration PT0.000968358S
index size: 529M

jvector_vamana_M:16-beamWidth:100-neighborOverflow:2.0-alpha:1.2_pqFactor:0-numCandidates:100
average recall 0.8242200000000001
average duration PT0.00124232S
index size: 703M

lucene_sandbox-vamana_maxConn:32-beamWidth:100-alpha:1.2-_numCandidates:100
average recall 0.82553
average duration PT0.000940756S
index size: 554M
```

I plan on testing Vamana without PQ, Vamana with PQ (a la DiskANN), as well as SPANN. Happy to collaborate with anyone interested.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org