kaivalnp opened a new pull request, #14178: URL: https://github.com/apache/lucene/pull/14178
### Description

Faiss (https://github.com/facebookresearch/faiss) is _"a library for efficient similarity search and clustering of dense vectors"_.

It supports vector transforms (e.g. PCA), indexing algorithms (e.g. IVF, HNSW), quantization techniques (e.g. PQ), search strategies (e.g. 2-step refinement), and different hardware (including GPUs), all through a convenient and flexible _Index Factory_ (https://github.com/facebookresearch/faiss/wiki/The-index-factory).

Proposing to add a wrapper to Lucene (via a new sandboxed `KnnVectorsFormat`) to create and search vector indexes with Faiss.

OpenSearch has a [similar feature](https://github.com/opensearch-project/k-NN/blob/main/jni/src/faiss_wrapper.cpp), but that is implemented using JNI, which has its own overhead (need for "glue" code, separate build systems).

This PR aims to have a pure Java implementation that uses the Panama (https://openjdk.org/projects/panama) Foreign Function Interface (FFI) to interact with the library. Faiss provides a C API intended _"to produce bindings for programming languages with Foreign Function Interface (FFI) support"_.

This PR _does not aim to add Faiss as a dependency of Lucene_. Instead, it requires the user to build the C API (https://github.com/facebookresearch/faiss/blob/main/c_api/INSTALL.md) and put the built shared library (`libfaiss_c.so`), along with all of its dependencies (like OpenBLAS), on the Java library path (set through either the `-Djava.library.path` JVM argument or the `$LD_LIBRARY_PATH` environment variable).

More details and considerations to follow, but opening this PR to get feedback on the need, implementation, and long-term support for such a codec (i.e. can we keep this codec in Lucene?).

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
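The library-path requirement described above can be sanity-checked before the codec is used. The following is a minimal sketch, not part of the PR: the class `NativeLibraryLocator` and its `find` method are hypothetical names introduced here for illustration. It assumes the library's base name is `faiss_c` (so that `System.mapLibraryName` yields `libfaiss_c.so` on Linux) and only scans the directories listed in `java.library.path`; on Linux the JVM typically seeds that property from `$LD_LIBRARY_PATH`.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper (not part of the PR): looks for a native library file
// in a list of directories separated by the platform path separator.
final class NativeLibraryLocator {
    private NativeLibraryLocator() {}

    // Returns the first regular file named after the platform-specific form of
    // libraryName (e.g. "faiss_c" -> "libfaiss_c.so" on Linux), or empty.
    static Optional<Path> find(String libraryName, String searchPath) {
        if (searchPath == null || searchPath.isEmpty()) {
            return Optional.empty();
        }
        String fileName = System.mapLibraryName(libraryName);
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as "a::b"
            }
            Path candidate = Path.of(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: the Faiss C API shared library has the base name "faiss_c".
        Optional<Path> found = find("faiss_c", System.getProperty("java.library.path"));
        System.out.println(found
            .map(p -> "Found Faiss C API library at " + p)
            .orElse("Faiss C API library not found on java.library.path"));
    }
}
```

Running it with `java -Djava.library.path=/path/to/faiss/libs NativeLibraryLocator.java` (the directory is a placeholder) reports whether the shared library is visible to the JVM. It does not verify that transitive dependencies such as OpenBLAS can be resolved, since those are loaded by the operating system's dynamic linker.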
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org