ashvardanian opened a new issue, #12502:
URL: https://github.com/apache/lucene/issues/12502

   ### Description
   
   I was recently approached by Lucene and Elastic users facing low 
performance and high memory consumption when running vector search workloads on 
the JVM. Some have also been using native libraries, like our 
[USearch](https://github.com/unum-cloud/usearch), and were curious whether those 
systems could be combined. Hence, here I am, excited to open a discussion 🤗 
   
   cc @jbellis, @benwtrent, @alessandrobenedetti, @msokolov
   
   ---
   
   I have looked into the existing HNSW implementation and the related PR, #10047. 
The integration should be simple, assuming [we already have a JNI binding that passes 
CI and is hosted on 
GitHub](https://github.com/unum-cloud/usearch/packages/1867475). The upsides 
would be:
   
   - performance that is not just on par with FAISS but can exceed it.
   - cross-platform `f16` support and optional automatic downcasting to `i8`.
   - indexes can be memory-mapped from disk without loading into RAM, and are 
about to receive many `io_uring`-based kernel-bypass tricks, similar to what we 
have in [UCall](https://github.com/unum-cloud/ucall).
   
   ---
   
   This may automatically resolve the following issues (in reverse 
chronological order):
   
   - [x] half-precision support: #12403
   - [x] multi-key support: #12313 
   - [x] pluggable metrics, similar to our JIT support in Python: #12219
   - [x] 2K+ dimensional vectors: #11507
   - [x] compact offsets with `uint40_t`: #10884
   - [x] memory consumption: #10177
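On the pluggable-metrics point (#12219): for illustration only, a user-supplied metric can be as simple as a functional interface over two vectors. The `Metric` interface and constants below are hypothetical sketches, not part of Lucene's or USearch's actual API:

```java
public class PluggableMetric {
    // Hypothetical plug point: any (float[], float[]) -> float distance,
    // where smaller means more similar.
    interface Metric {
        float distance(float[] a, float[] b);
    }

    // Cosine distance: 1 - cos(a, b).
    static final Metric COSINE = (a, b) -> {
        float dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return 1f - dot / (float) Math.sqrt((double) na * nb);
    };

    // Negated inner product, so it sorts as a distance.
    static final Metric INNER_PRODUCT = (a, b) -> {
        float dot = 0;
        for (int i = 0; i < a.length; i++) dot += a[i] * b[i];
        return -dot;
    };

    public static void main(String[] args) {
        float[] a = {1, 0}, b = {0, 1}, c = {2, 0};
        System.out.println(COSINE.distance(a, a));        // 0.0 (identical)
        System.out.println(COSINE.distance(a, b));        // 1.0 (orthogonal)
        System.out.println(INNER_PRODUCT.distance(a, c)); // -2.0
    }
}
```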
   
   ---
   
   As far as I understand, it is not common to integrate Lucene with native 
libraries, but it seems justified for such computationally intensive workloads. 
   
   |              | FAISS, `f32` | USearch, `f32` | USearch, `f16` | USearch, `i8`      |
   | :----------- | -----------: | -------------: | -------------: | -----------------: |
   | Batch Insert |       16 K/s |         73 K/s |        100 K/s | 104 K/s **+550%**  |
   | Batch Search |       82 K/s |        103 K/s |        113 K/s | 134 K/s **+63%**   |
   | Bulk Insert  |       76 K/s |        105 K/s |        115 K/s | 202 K/s **+165%**  |
   | Bulk Search  |      118 K/s |        174 K/s |        173 K/s | 304 K/s **+157%**  |
   | Recall @ 10  |          99% |          99.2% |          99.1% | 99.2%              |
   
   > Dataset: a 1M-vector sample of the Deep1B dataset. Hardware: a `c7g.metal` 
AWS instance with 64 cores and DDR5 memory. HNSW was configured with identical 
hyper-parameters: connectivity `M=16`, expansion @ construction 
`efConstruction=128`, and expansion @ search `ef=64`. Batch size is 256. Both 
libraries were compiled for the target architecture.
   
   I am happy to contribute, and looking forward to your comments 🤗


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

