kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2628899940
I found one way to reduce index-time RAM usage -- it turns out the [`FlatVectorsWriter`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/hnsw/FlatVectorsWriter.java) already maintains a [list](https://github.com/apache/lucene/blob/faec0f823817ca95f1f103d6b9482d26ee75cc7b/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java#L407) of all vectors on heap before [writing them to disk](https://github.com/apache/lucene/blob/faec0f823817ca95f1f103d6b9482d26ee75cc7b/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java#L170-L177) on flush, so we don't need to wrap it in a [`BufferingKnnVectorsWriter`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/BufferingKnnVectorsWriter.java).

In the flat format, peak RAM usage is ~2x the vector size (once on heap, and once in the buffer allocated just before writing to disk), and steady-state heap usage is ~1x the vector size (all vectors on heap).

In the Faiss format, we can read these vectors, copy them over to the native process, and start indexing them. Peak RAM usage here is ~3x the vector size (once on heap -- reusing the flat format's vectors, once as a copy in the native process, and once inside the native index). Previously we maintained two copies of the vectors on heap (once in the flat format and once in the buffering writer).

We could reduce peak RAM usage further by indexing vectors in batches (limiting the size of the native copy required) -- but this would hurt indexing performance (see the batching sketch at the end of this comment).

At merge time, we now read the disk-backed vectors from the flat format and add them directly to the native process, so no copy on heap is required (see the merge sketch at the end of this comment).

---

Single segment, no merges

Lucene:
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.812         1.389  200000   100      50       32        200         no   146.49       1365.31           0.01             1           236.93        228.882       228.882
```

Faiss:
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.811         1.103  200000   100      50       32        200         no   148.38       1347.90           0.01             1           511.97        228.882       228.882
```

---

Single segment, with merges

Lucene:
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.809         1.366  200000   100      50       32        200         no   103.58       1930.95         116.78             1           236.92        228.882       228.882
```

Faiss:
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.811         1.104  200000   100      50       32        200         no   114.90       1740.64         145.93             1           511.97        228.882       228.882
```

Merges are probably slower because we start from scratch instead of [adding to existing indexes](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/hnsw/IncrementalHnswGraphMerger.java).
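For reference, here's a minimal sketch of the merge-time path described above, assuming the `KnnVectorValues` / `MergedVectorValues` APIs on `main` -- the merged view is disk-backed, so no full copy of the vectors lands on the Java heap. `addToNativeIndex` is a hypothetical placeholder for the FFI call into the native Faiss index, not an API from this PR:

```java
import java.io.IOException;
import org.apache.lucene.codecs.KnnVectorsWriter;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.FloatVectorValues;
import org.apache.lucene.index.KnnVectorValues;
import org.apache.lucene.index.MergeState;
import org.apache.lucene.search.DocIdSetIterator;

// Sketch only -- not the exact code in this PR
void mergeOneFieldSketch(FieldInfo fieldInfo, MergeState mergeState) throws IOException {
  // Disk-backed view over the vectors of all segments being merged: nothing is
  // buffered on the Java heap
  FloatVectorValues merged =
      KnnVectorsWriter.MergedVectorValues.mergeFloatVectorValues(fieldInfo, mergeState);

  KnnVectorValues.DocIndexIterator it = merged.iterator();
  for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
    float[] vector = merged.vectorValue(it.index()); // reads one vector from disk
    addToNativeIndex(it.index(), vector);            // hypothetical native call
  }
}
```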
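And a sketch of the batching idea mentioned above -- it caps the transient native-side copy at `batchSize` vectors, at the cost of more FFI round trips (which is why I'd expect it to hurt indexing throughput). Again, `addBatchToNativeIndex` is a hypothetical placeholder, not the actual binding:

```java
// Sketch only: add vectors to the native index in fixed-size batches, so the
// transient native-side copy is capped at `batchSize` vectors instead of the
// whole dataset
void addInBatches(FloatVectorValues vectors, int batchSize) throws IOException {
  int dim = vectors.dimension();
  float[] batch = new float[batchSize * dim]; // staging buffer for the native copy
  int inBatch = 0;

  KnnVectorValues.DocIndexIterator it = vectors.iterator();
  for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
    System.arraycopy(vectors.vectorValue(it.index()), 0, batch, inBatch * dim, dim);
    if (++inBatch == batchSize) {
      addBatchToNativeIndex(batch, inBatch); // hypothetical native call
      inBatch = 0;
    }
  }
  if (inBatch > 0) {
    addBatchToNativeIndex(batch, inBatch);   // flush the final partial batch
  }
}
```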