kaivalnp opened a new pull request, #14178:
URL: https://github.com/apache/lucene/pull/14178

   ### Description
   
   Faiss (https://github.com/facebookresearch/faiss) is _"a library for 
efficient similarity search and clustering of dense vectors"_
   
   It supports various features like vector transforms (e.g. PCA), indexing 
algorithms (e.g. IVF, HNSW), quantization techniques (e.g. PQ), search 
strategies (e.g. 2-step refinement), and different hardware (including GPUs) -- all 
through a convenient and flexible _Index Factory_ 
(https://github.com/facebookresearch/faiss/wiki/The-index-factory)
   
   Proposing to add a wrapper to Lucene (via a new sandboxed 
`KnnVectorsFormat`) to create and search vector indexes with Faiss. OpenSearch 
has a [similar 
feature](https://github.com/opensearch-project/k-NN/blob/main/jni/src/faiss_wrapper.cpp),
 but that is implemented using JNI, which has its own overhead (need for "glue" 
code, separate build systems)
   
   This PR aims to have a pure Java implementation using the Panama 
(https://openjdk.org/projects/panama) Foreign Function Interface (FFI) to 
interact with the library. Faiss provides a nice C API _"to produce bindings 
for programming languages with Foreign Function Interface (FFI) support"_
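
To make the FFI approach concrete, here is a minimal sketch of a Panama downcall. It is not code from this PR: it assumes JDK 22 or newer (where `java.lang.foreign` is final) and binds the C standard library's `strlen` so that it runs without Faiss installed. The class and method names (`FfiSketch`, `nativeStrlen`) are hypothetical. A Faiss binding would presumably follow the same pattern, with symbols looked up in `libfaiss_c` instead of the default lookup.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical illustration class, not part of Lucene or Faiss.
class FfiSketch {

  // Calls the C function `size_t strlen(const char*)` without any JNI glue code.
  // Assumes a 64-bit platform, where size_t maps to a Java long.
  static long nativeStrlen(String s) {
    Linker linker = Linker.nativeLinker();

    // Locate the native symbol in the libraries the linker knows by default.
    MemorySegment symbol = linker.defaultLookup().find("strlen").orElseThrow();

    // Describe the C signature: returns a 64-bit integer, takes one pointer.
    MethodHandle strlen =
        linker.downcallHandle(
            symbol, FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

    // The arena owns the off-heap memory and frees it when closed.
    try (Arena arena = Arena.ofConfined()) {
      // Copy the Java string into native memory as a NUL-terminated UTF-8 string.
      MemorySegment cString = arena.allocateFrom(s);
      return (long) strlen.invokeExact(cString);
    } catch (Throwable t) {
      throw new RuntimeException("native call failed", t);
    }
  }

  public static void main(String[] args) {
    System.out.println(nativeStrlen("faiss")); // prints 5
  }
}
```

The same three steps (symbol lookup, function descriptor, downcall handle) replace the hand-written C++ "glue" layer that a JNI wrapper needs.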
   
   This PR _does not aim to add Faiss as a dependency of Lucene_, but requires 
the user to build the C API 
(https://github.com/facebookresearch/faiss/blob/main/c_api/INSTALL.md) and put 
the built shared library (`libfaiss_c.so`) along with all dependencies (like 
OpenBLAS) on the Java library path (set via either the `-Djava.library.path` JVM 
argument or the `$LD_LIBRARY_PATH` environment variable)
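
As a rough way to check this setup before running anything, the sketch below scans a library path for the platform-specific file name of `faiss_c`. It is an illustration only, not part of this PR: `FaissLibraryLocator` and `locate` are hypothetical names, and it only inspects the directories it is given (it does not verify that transitive dependencies such as OpenBLAS can be resolved).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or Faiss.
class FaissLibraryLocator {

  // Returns the first file on searchPath whose name matches the platform's
  // naming convention for libName (e.g. "libfaiss_c.so" on Linux).
  static Optional<Path> locate(String libName, String searchPath) {
    if (searchPath == null || searchPath.isEmpty()) {
      return Optional.empty();
    }
    String fileName = System.mapLibraryName(libName);
    for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
      if (dir.isEmpty()) {
        continue; // skip empty entries such as a trailing separator
      }
      Path candidate = Path.of(dir, fileName);
      if (Files.isRegularFile(candidate)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // java.library.path is what the JVM consults for System.loadLibrary.
    Optional<Path> lib = locate("faiss_c", System.getProperty("java.library.path"));
    System.out.println(
        lib.map(p -> "Found " + p).orElse("faiss_c not found on java.library.path"));
  }
}
```

Running it with the same `-Djava.library.path` value intended for the application shows whether the JVM would be able to see the library at all.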
   
   More details and considerations to follow, but opening this PR to get 
feedback on the need, implementation, and long-term support for such a codec 
(i.e. can we keep this codec in Lucene)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

