Re: [PR] Fix off-heap byte vector scoring at query time [lucene]

via GitHub Wed, 09 Jul 2025 11:41:48 -0700


msokolov commented on PR #14874:
URL: https://github.com/apache/lucene/pull/14874#issuecomment-3053634936


   My weird results (w/768d cohere vectors).  In sum, it looks to me as if we 
have some noise immediately after reindexing.  Either this is a measurement 
artifact in `luceneutil` or it is a startup transient related to page caching / 
preloading / madvise / something. But after things settle down this is clearly 
a big improvement and we should merge, while also trying to understand this 
measurement problem -- which is pre-existing and should not block this change.
   
   # run 1
   
   ## baseline
   The first line does a reindex -- and then latency and netCPU *increase* on 
the following runs? This has nothing to do with this PR.
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
    0.987       17.861  17.850        0.999  500000   100      50       32        250         no    25663
    0.987       25.272  25.256        0.999  500000   100      50       32        250         no    25663
    0.987       24.355  24.340        0.999  500000   100      50       32        250         no    25663
   ```
   
   ## candidate
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
    0.987       13.261  13.251        0.999  500000   100      50       32        250         no    25663
    0.987       13.366  13.356        0.999  500000   100      50       32        250         no    25663
   ```
   
   # run 2
   ## candidate (reindex each time)
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
    0.983       57.972  57.959        1.000  500000   100      50       32        250         no    25648
    0.980       55.219  55.205        1.000  500000   100      50       32        250         no    25802
    0.986       93.766  93.752        1.000  500000   100      50       32        250         no    25630
   ```
   ## candidate (no reindex)
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
    0.986       13.170  13.159        0.999  500000   100      50       32        250         no    25630
    0.986       13.411  13.400        0.999  500000   100      50       32        250         no    25630
    0.986       13.535  13.523        0.999  500000   100      50       32        250         no    25630
   ```
   ## baseline (no reindex)
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
    0.986       26.048  26.033        0.999  500000   100      50       32        250         no    25630
    0.986       25.954  25.938        0.999  500000   100      50       32        250         no    25630
   ```
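
For a rough sense of the steady-state difference, here is a minimal sketch that averages the latencies from the run 2 no-reindex tables and takes the ratio. It is not part of `luceneutil`; the helper name `steady_state_speedup` is made up for illustration, and the numbers are simply copied from the tables in this comment.

```python
from statistics import mean

# Latencies (ms) copied from the run 2 no-reindex tables in this comment.
CANDIDATE_MS = [13.170, 13.411, 13.535]
BASELINE_MS = [26.048, 25.954]


def steady_state_speedup(baseline_ms, candidate_ms):
    """Ratio of mean baseline latency to mean candidate latency.

    Hypothetical helper for illustration only; not a luceneutil function.
    """
    return mean(baseline_ms) / mean(candidate_ms)


if __name__ == "__main__":
    speedup = steady_state_speedup(BASELINE_MS, CANDIDATE_MS)
    print(f"baseline mean:  {mean(BASELINE_MS):.3f} ms")
    print(f"candidate mean: {mean(CANDIDATE_MS):.3f} ms")
    print(f"speedup:        {speedup:.2f}x")
```

With only two or three samples per configuration this is a sanity check rather than a statistically meaningful result. The reindex rows are left out on purpose, since they appear to be dominated by the post-reindex transient.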


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

