kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2621529365
### Usage

The new format can be used by:
- "Describing" the index you want, see https://github.com/facebookresearch/faiss/wiki/The-index-factory
- Setting index parameters, see https://github.com/facebookresearch/faiss/wiki/Index-IO,-cloning-and-hyper-parameter-tuning

For example, creating an HNSW index with `maxConn=32` and `beamWidth=200` is as simple as:

```java
new FaissKnnVectorsFormatProvider("HNSW32", "efConstruction=200");
```

Adding PQ to this index is as simple as:

```java
new FaissKnnVectorsFormatProvider("HNSW32_PQ50", "efConstruction=200");
```

Reordering the final results using exact distances is as simple as:

```java
new FaissKnnVectorsFormatProvider("HNSW32_PQ50,RFlat", "efConstruction=200");
```

...and so on

### Benchmarks

I built this PR using Java 22 and benchmarked it using [`knnPerfTest`](https://github.com/mikemccand/luceneutil/blob/main/src/python/knnPerfTest.py) (needed some small changes to add the sandbox JAR file to the classpath [here](https://github.com/mikemccand/luceneutil/blob/9764dffb3e00fc37a9edb4a55381010d4c60c26c/src/python/benchUtil.py#L1733) and the built Faiss shared library with its dependencies during runtime)

Uses 300d documents and vectors generated using:

```sh
./gradlew vector-300
```

from the [luceneutil](https://github.com/mikemccand/luceneutil) package

Lucene:

```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.811   1.482         200000  100   50      32       200        no         52.65    3798.38       0.00           1             237.77           228.882        228.882
```

Faiss:

```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.809   1.101         200000  100   50      32       200        no         6.06     33030.55      11.05          1             511.22           228.882        228.882
```

Corresponding format used:

```java
// efSearch is set as topK + fanout
new FaissKnnVectorsFormatProvider("HNSW32", "efConstruction=200,efSearch=150");
```

This is a single
segment search with no deletes, but we see \~88% lower indexing time (52.65 s vs. 6.06 s in the `index s` column, not counting the 11.05 s Faiss force merge) and \~26% lower search latency (1.482 ms vs. 1.101 ms)!

The benchmark ran on a fairly powerful machine (`m5.12xlarge`) with the [default](https://github.com/mikemccand/luceneutil/blob/9764dffb3e00fc37a9edb4a55381010d4c60c26c/src/python/knnPerfTest.py#L65-L66) `numMergeWorker` and `numMergeThread` values.

One other thing to note is that the Faiss C_API does not use vectorized (AVX2 / AVX512 / SVE) instructions, so we can squeeze some more performance out of it by building optimized versions of the library.
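
For anyone who wants to re-derive the percentages quoted above, here is a minimal sketch of the arithmetic in plain Java. The class and method names (`BenchmarkMath`, `reductionPercent`, `faissParams`) are hypothetical helpers written for this illustration and are not part of the PR; the inputs are copied from the two benchmark tables. Adding the force-merge time to the Faiss indexing time is just one possible way to read the numbers, not something the benchmark itself reports.

```java
// Hypothetical helper for illustration only; not part of the PR or of Lucene.
final class BenchmarkMath {

    /** Percentage by which {@code candidate} is lower than {@code baseline}. */
    static double reductionPercent(double baseline, double candidate) {
        return 100.0 * (baseline - candidate) / baseline;
    }

    /** Builds a Faiss parameter string with efSearch = topK + fanout, as in the benchmark. */
    static String faissParams(int efConstruction, int topK, int fanout) {
        return "efConstruction=" + efConstruction + ",efSearch=" + (topK + fanout);
    }

    public static void main(String[] args) {
        // "index s" column: Lucene 52.65 s vs. Faiss 6.06 s
        System.out.printf("index time reduction: %.1f%%%n", reductionPercent(52.65, 6.06));

        // Same comparison if the Faiss force merge (11.05 s) is counted as indexing work
        System.out.printf("including force merge: %.1f%%%n", reductionPercent(52.65, 6.06 + 11.05));

        // "latency (ms)" column: Lucene 1.482 ms vs. Faiss 1.101 ms
        System.out.printf("search latency reduction: %.1f%%%n", reductionPercent(1.482, 1.101));

        // topK=100 and fanout=50 give the efSearch value used in the benchmark
        System.out.println(faissParams(200, 100, 50));
    }
}
```

The first and third figures correspond to the \~88% and \~26% numbers above; the second shows how the indexing comparison changes if the force-merge column is included.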