kaivalnp commented on PR #14178:
URL: https://github.com/apache/lucene/pull/14178#issuecomment-3006409251

   Hi @HaoSunUber!
   
   Faiss supports multiple algorithms, vector transforms, quantizations, etc., 
but for this PR I've primarily compared the full-precision, pure HNSW 
implementations of the Faiss codec and the default Lucene codec.
   
   I mainly ran benchmarks on 300d vectors of the `enwiki` dataset (some recent 
numbers 
[here](https://github.com/apache/lucene/pull/14178/#issuecomment-2954723052)), 
where single-segment search was ~20% faster.
   
   The codec makes it possible to create different indexes (say, scalar 
quantized, HNSW+PQ, etc.) using different factory strings, see 
https://github.com/facebookresearch/faiss/wiki/The-index-factory -- but I 
haven't had a chance to test many others!
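
   As a rough illustration of what such factory strings look like, here is a 
small Python sketch that only assembles the description strings. The helper 
names `hnsw_description` and `pq_encoding` are hypothetical (they are not part 
of Faiss or this codec), and the `HNSW<M>,<encoding>` layout is my reading of 
the index factory wiki linked above, so please check it against your Faiss 
version.

```python
# Sketch only: builds Faiss index-factory description strings.
# The helper names below are hypothetical, not Faiss or Lucene APIs.


def pq_encoding(dim: int, subquantizers: int) -> str:
    """Return a product-quantization encoding such as 'PQ20'.

    PQ splits each vector into equal-sized sub-vectors, so the vector
    dimension must be divisible by the number of subquantizers.
    """
    if subquantizers <= 0 or dim % subquantizers != 0:
        raise ValueError(
            f"dimension {dim} is not divisible by {subquantizers} subquantizers"
        )
    return f"PQ{subquantizers}"


def hnsw_description(m: int = 32, encoding: str = "Flat") -> str:
    """Return an HNSW description: m is the graph degree, encoding the storage."""
    return f"HNSW{m},{encoding}"


if __name__ == "__main__":
    dim = 300  # same dimensionality as the enwiki vectors mentioned above
    print(hnsw_description())                              # full-precision HNSW
    print(hnsw_description(encoding="SQ8"))                # 8-bit scalar quantization
    print(hnsw_description(encoding=pq_encoding(dim, 20))) # HNSW + PQ
```

   Note that for 300d vectors `PQ16` would not work, since 300 is not divisible 
by 16, while 20 subquantizers (15 dimensions each) is fine. In the Faiss Python 
bindings, a string like these can be passed to `faiss.index_factory(dim, 
description)`.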
   
   I've added steps to [install Faiss and make it available to the 
codec](https://github.com/apache/lucene/blob/4b47fb1a3113d22bca6cd8c1664529ef2d7f4877/lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/package-info.java#L36-L47)
 -- after which you can run KNN benchmarks using the 
https://github.com/mikemccand/luceneutil package (it will need minor changes 
[here](https://github.com/mikemccand/luceneutil/blob/779d85551f37d72ef2d328165dd9a91b4bbf1f35/src/main/knn/KnnGraphTester.java#L1290) 
to use the format properly).
   
   Please do post results if you're able to run any benchmarks, or have 
questions or feedback on the codec!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

