ashvardanian opened a new issue, #12502: URL: https://github.com/apache/lucene/issues/12502
### Description

I was recently approached by Lucene and Elastic users facing low performance and high memory consumption when running vector-search tasks on the JVM. Some have also been using native libraries, like our [USearch](https://github.com/unum-cloud/usearch), and were curious whether those systems can be combined. Hence, here I am, excited to open a discussion 🤗

cc @jbellis, @benwtrent, @alessandrobenedetti, @msokolov

---

I have looked into the existing HNSW implementation and the related PR, #10047. The integration should be simple, assuming [we already have a JNI binding that passes CI and is hosted on GitHub](https://github.com/unum-cloud/usearch/packages/1867475). The upsides would be:

- performance not merely on par with FAISS, but potentially higher;
- cross-platform `f16` support and optional automatic `i8` downcasting;
- indexes that can be memory-mapped from disk without loading into RAM, and that are about to receive many `io_uring`-based kernel-bypass tricks, similar to what we have in [UCall](https://github.com/unum-cloud/ucall).

---

This may automatically resolve the following issues (in reverse chronological order):

- [x] half-precision support: #12403
- [x] multi-key support: #12313
- [x] pluggable metrics, similar to our JIT support in Python: #12219
- [x] 2K+ dimensional vectors: #11507
- [x] compact offsets with `uint40_t`: #10884
- [x] memory consumption: #10177

---

As far as I understand, it is not common to integrate Lucene with native libraries, but it seems justified for such computationally intensive workloads.
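To make the `f16`/`i8` memory argument concrete, here is a back-of-the-envelope estimate of raw vector storage for a 1M-vector sample of Deep1B (96 dimensions). This is an illustrative sketch, not measured data: it counts only the flat vectors, so actual HNSW index sizes would be larger due to per-node link lists.

```python
# Back-of-the-envelope raw vector storage for 1M Deep1B vectors
# (96 dimensions). Graph links add overhead on top of this,
# so these figures are lower bounds on index size.
dims, count = 96, 1_000_000
scalar_bytes = {"f32": 4, "f16": 2, "i8": 1}

for dtype, size in scalar_bytes.items():
    mib = dims * count * size / 2**20
    print(f"{dtype}: {mib:.0f} MiB")
# f32: 366 MiB, f16: 183 MiB, i8: 92 MiB
```

Halving the scalar width halves the working set, which matters both for resident memory and for how much of a memory-mapped index stays in the page cache.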
|              | FAISS, `f32` | USearch, `f32` | USearch, `f16` | USearch, `i8`     |
| :----------- | -----------: | -------------: | -------------: | ----------------: |
| Batch Insert | 16 K/s       | 73 K/s         | 100 K/s        | 104 K/s **+550%** |
| Batch Search | 82 K/s       | 103 K/s        | 113 K/s        | 134 K/s **+63%**  |
| Bulk Insert  | 76 K/s       | 105 K/s        | 115 K/s        | 202 K/s **+165%** |
| Bulk Search  | 118 K/s      | 174 K/s        | 173 K/s        | 304 K/s **+157%** |
| Recall @ 10  | 99%          | 99.2%          | 99.1%          | 99.2%             |

> Dataset: a 1M-vector sample of the Deep1B dataset. Hardware: `c7g.metal` AWS instance with 64 cores and DDR5 memory. HNSW was configured with identical hyper-parameters: connectivity `M=16`, expansion at construction `efConstruction=128`, and expansion at search `ef=64`. Batch size is 256. Both libraries were compiled for the target architecture.

I am happy to contribute, and I look forward to your comments 🤗

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org