Re: [PR] Fix off-heap byte vector scoring at query time [lucene]

via GitHub Mon, 14 Jul 2025 08:34:59 -0700


kaivalnp commented on PR #14874:
URL: https://github.com/apache/lucene/pull/14874#issuecomment-3070041519


   FYI I was trying to inspect the compiled code for the dot product function 
using `-XX:CompileCommand=print,*PanamaVectorUtilSupport.dotProductBody256` and 
saw that query latency improved when that flag was added (no reindexing in any 
run)
   
   This PR without the flag:
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
    0.962        1.711   1.711        1.000  100000   100      50       64        250         no      0.00      Infinity            0.12             1           77.44       292.969      292.969       HNSW
   ```
   
   This PR with the flag:
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
    0.962        1.171   1.172        1.001  100000   100      50       64        250         no      0.00      Infinity            0.12             1           77.44       292.969      292.969       HNSW
   ```
   
   `main` without the flag:
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
    0.962        2.503   2.502        0.999  100000   100      50       64        250         no      0.00      Infinity            0.12             1           77.44       292.969      292.969       HNSW
   ```
   
   `main` with the flag:
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
    0.962        1.194   1.192        0.999  100000   100      50       64        250         no      0.00      Infinity            0.12             1           77.44       292.969      292.969       HNSW
   ```
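   
   To put the four latency numbers side by side, here is a small sketch that 
computes the relative latency reduction of this PR over `main`, with and 
without the flag. The class and method names are made up for illustration; 
only the latency values come from the runs above.
   
   ```java
   // Hypothetical helper, not part of Lucene or luceneutil.
   final class LatencyComparison {

       /** Fraction by which the candidate latency is lower than the baseline. */
       static double relativeReduction(double baselineMs, double candidateMs) {
           return (baselineMs - candidateMs) / baselineMs;
       }

       public static void main(String[] args) {
           // latency(ms) values copied from the four runs above
           double mainNoFlag = 2.503, prNoFlag = 1.711;
           double mainWithFlag = 1.194, prWithFlag = 1.171;

           System.out.printf("without flag: PR latency is %.1f%% lower than main%n",
               100 * relativeReduction(mainNoFlag, prNoFlag));
           System.out.printf("with flag:    PR latency is %.1f%% lower than main%n",
               100 * relativeReduction(mainWithFlag, prWithFlag));
       }
   }
   ```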
   
   Perhaps the flag is forcing the JVM to compile and optimize the function 
earlier than it otherwise would?
   If so, the numbers with the flag should reflect the latency of a 
long-running application (where the function has been fully optimized by the 
compiler).
   
   If so, I don't see a lot of value in merging this PR (with the flag, the 
improvement over `main` is small: 1.171 ms vs 1.194 ms) -- but I'd love to make 
our benchmarks more robust and report something representative of a 
long-running application! (these non-deterministic compiler optimizations are 
too trappy)
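   
   One common way to approximate steady-state numbers is to run the hot path 
for a while before timing and discard that warmup pass. A minimal sketch of 
the idea follows, assuming a scalar stand-in for the dot product and an 
arbitrary iteration count. `WarmupBench`, `dotProduct` and `avgNanos` are 
hypothetical names, and this is not how the benchmark above is implemented.
   
   ```java
   import java.util.Random;

   // Hypothetical micro-harness, not part of Lucene or luceneutil.
   final class WarmupBench {

       /** Scalar stand-in for the kernel under test (not Lucene's vectorized code). */
       static int dotProduct(byte[] a, byte[] b) {
           int sum = 0;
           for (int i = 0; i < a.length; i++) {
               sum += a[i] * b[i];
           }
           return sum;
       }

       /** Average wall-clock nanoseconds per call over iters calls. */
       static double avgNanos(byte[] a, byte[] b, int iters) {
           long sink = 0; // accumulate results so the JIT cannot drop the calls
           long start = System.nanoTime();
           for (int i = 0; i < iters; i++) {
               sink += dotProduct(a, b);
           }
           long elapsed = System.nanoTime() - start;
           if (sink == Long.MIN_VALUE) {
               System.out.println(sink); // practically never taken; keeps sink live
           }
           return (double) elapsed / iters;
       }

       public static void main(String[] args) {
           byte[] a = new byte[768];
           byte[] b = new byte[768];
           Random random = new Random(0);
           random.nextBytes(a);
           random.nextBytes(b);

           int iters = 200_000; // assumed count; the right value depends on the JVM
           double cold = avgNanos(a, b, iters);   // includes interpreter and JIT time
           double steady = avgNanos(a, b, iters); // closer to long-running behaviour
           System.out.printf("first pass: %.1f ns/op, second pass: %.1f ns/op%n", cold, steady);
       }
   }
   ```
   
   Reporting only the second pass hides the compilation noise but does not 
guarantee the JIT has reached its final tier, so the warmup length would still 
need to be validated per machine and JVM version.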


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

