Re: [PR] Implement off-heap quantized scoring [lucene]

via GitHub Sun, 29 Jun 2025 03:47:20 -0700


kaivalnp commented on PR #14863:
URL: https://github.com/apache/lucene/pull/14863#issuecomment-3016581312


   FYI, I observed a strange phenomenon: if the query vector is on heap, like:
   ```java
   this.query = MemorySegment.ofArray(targetBytes);
   ```
   
   instead of the current off-heap implementation in this PR:
   ```java
   this.query = Arena.ofAuto().allocateFrom(JAVA_BYTE, targetBytes);
   ```
   
   ...then we see a performance regression:
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
    0.862        3.043   3.034        0.997  100000   100      50       64        250     7 bits     23.25       4301.82           25.29             1          373.70       366.592       73.624       HNSW
    0.545        2.060   2.049        0.995  100000   100      50       64        250     4 bits     22.19       4506.33           17.99             1          338.17       329.971       37.003       HNSW
   ```
   
   Maybe I'm missing something obvious, but I haven't found the root cause yet.
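
   For anyone who wants to poke at this outside of Lucene, here is a minimal standalone sketch of the two query representations compared above. It assumes Java 22+ (where `allocateFrom` is a final API). The class name `QuerySegments` and the scalar `dotProduct` helper are hypothetical stand-ins for illustration, not the scorer from this PR:
   ```java
   import java.lang.foreign.Arena;
   import java.lang.foreign.MemorySegment;
   import java.lang.foreign.ValueLayout;

   public class QuerySegments {
     // Heap-backed segment: a view over the existing byte[], no copy is made.
     public static MemorySegment onHeap(byte[] targetBytes) {
       return MemorySegment.ofArray(targetBytes);
     }

     // Off-heap segment: copies the bytes into native memory owned by an
     // automatic arena, which is released once the segment is unreachable.
     public static MemorySegment offHeap(byte[] targetBytes) {
       return Arena.ofAuto().allocateFrom(ValueLayout.JAVA_BYTE, targetBytes);
     }

     // Hypothetical scalar dot product over two byte segments of equal size.
     // The actual scorer is vectorized; this only exercises both segment kinds.
     public static int dotProduct(MemorySegment a, MemorySegment b) {
       int sum = 0;
       for (long i = 0; i < a.byteSize(); i++) {
         sum += a.get(ValueLayout.JAVA_BYTE, i) * b.get(ValueLayout.JAVA_BYTE, i);
       }
       return sum;
     }

     public static void main(String[] args) {
       byte[] query = {1, 2, 3, 4};
       byte[] doc = {4, 3, 2, 1};
       MemorySegment heapQuery = onHeap(query);
       MemorySegment nativeQuery = offHeap(query);
       MemorySegment nativeDoc = offHeap(doc);

       // isNative() distinguishes the two kinds of segment.
       System.out.println("heap query isNative: " + heapQuery.isNative());
       System.out.println("off-heap query isNative: " + nativeQuery.isNative());

       // Both representations hold the same bytes, so the scores must match.
       System.out.println("heap score: " + dotProduct(heapQuery, nativeDoc));
       System.out.println("off-heap score: " + dotProduct(nativeQuery, nativeDoc));
     }
   }
   ```
   Both variants return the same score, so any difference should come from how the JIT compiles the mixed heap/native access path rather than from the data itself. A proper comparison would need a JMH benchmark on top of this.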


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@lucene.apache.org
