leng25 opened a new pull request, #15790:
URL: https://github.com/apache/lucene/pull/15790

   ## Summary
   
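   This PR implements the optimization suggested in #15024. The prefix sum loop in `Lucene99HnswVectorsReader` currently seeds the first element and then reads the previous element back from the buffer on every iteration; this change replaces it with a single-pass running-sum variant that avoids those redundant memory reads.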
   
   **Before:**
   ```java
   currentNeighborsBuffer[0] = dataIn.readVInt();
   for (int i = 1; i < arcCount; i++) {
     currentNeighborsBuffer[i] = currentNeighborsBuffer[i - 1] + dataIn.readVInt();
   }
   ```
   
   **After:**
   ```java
   int sum = 0;
   for (int i = 0; i < arcCount; i++) {
     sum += dataIn.readVInt();
     currentNeighborsBuffer[i] = sum;
   }
   ```
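
To check that the two loop shapes decode the same neighbor ids, here is a minimal standalone sketch. It is not the Lucene source: the class `PrefixSumSketch`, both method names, and the `PrimitiveIterator.OfInt` standing in for `dataIn.readVInt()` are hypothetical, and the `arcCount == 0` guard in the first method is an assumption added only so the sketch handles empty input.

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Hypothetical stand-alone sketch; the iterator stands in for dataIn.readVInt().
final class PrefixSumSketch {

  // "Before" shape: seed slot 0, then read the previous slot back from the array each iteration.
  static int[] decodeWithReadBack(PrimitiveIterator.OfInt deltas, int arcCount) {
    int[] buffer = new int[arcCount];
    if (arcCount == 0) {
      return buffer; // guard assumed for this sketch only
    }
    buffer[0] = deltas.nextInt();
    for (int i = 1; i < arcCount; i++) {
      buffer[i] = buffer[i - 1] + deltas.nextInt();
    }
    return buffer;
  }

  // "After" shape: keep the running sum in a local variable and only write to the array.
  static int[] decodeWithAccumulator(PrimitiveIterator.OfInt deltas, int arcCount) {
    int[] buffer = new int[arcCount];
    int sum = 0;
    for (int i = 0; i < arcCount; i++) {
      sum += deltas.nextInt();
      buffer[i] = sum;
    }
    return buffer;
  }

  public static void main(String[] args) {
    int[] deltas = {3, 4, 1, 10}; // made-up delta-encoded neighbor ids
    int[] before = decodeWithReadBack(IntStream.of(deltas).iterator(), deltas.length);
    int[] after = decodeWithAccumulator(IntStream.of(deltas).iterator(), deltas.length);
    System.out.println(Arrays.toString(before)); // [3, 7, 8, 18]
    System.out.println(Arrays.equals(before, after)); // true
  }
}
```

Both methods consume the same number of deltas for any non-negative `arcCount`, so the stream position after decoding is unchanged by the rewrite in this sketch.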
   
   This is a follow-up to #15027 by @yossev, who proposed the same fix. Since that PR went stale (merge conflicts, formatting issues), I'm resubmitting with conflicts resolved, formatting fixed via `./gradlew tidy`, and benchmark results included.
   
   I found this while looking for a good first issue to learn the contribution process. Happy to adjust anything based on feedback!
   
   ## Benchmark Results
   
   Benchmarks were run using the [luceneutil](https://github.com/mikemccand/luceneutil) KNN benchmark (`knnPerfTest.py`).
   
   **Machine:** Intel Core i5-10210U, 8 logical cores, ~15 GB RAM
   **Dataset:** cohere-v3-wikipedia-en 1024d, 400k docs, 10k queries, 8-bit quantized, dot_product
   
   **Baseline:**
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)
    0.977        9.920   9.893        0.997  400000   100     100       64        250     8 bits     7955    486.32        822.50          437.90             1         2015.68
   ```
   
   **Candidate (this PR):**
   ```
   recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)
    0.977        9.861   9.833        0.997  400000   100     100       64        250     8 bits     7955    486.32        822.50          437.90             1         2015.68
   ```
   
   Recall is identical (0.977), and query latency drops from 9.920 ms to 9.861 ms, about 0.6%. These results are from a single run, so a difference this small may fall within normal measurement variance.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

