RamakrishnaChilaka opened a new pull request, #15358:
URL: https://github.com/apache/lucene/pull/15358

   ### Description
   
   This change optimizes `ForUtil` bulk integer encoding by introducing a batched `writeInts()` method in the `DataOutput` class, allowing integers to be written in a single bulk operation.
   It also precomputes masks to reduce branching overhead and improve encoding efficiency.
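
A minimal sketch of the two ideas, not the code from this PR: a mask table computed once, and a single bulk copy of an `int[]` in place of one write call per value. The class `BulkIntWriteSketch`, the simple "lowest bits first" packing layout, the restriction to bit widths that divide 32, and the little-endian byte order are all assumptions made for illustration; `ForUtil` uses its own layout, and the `writeInts` here writes to a byte array rather than to a `DataOutput`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class BulkIntWriteSketch {
  // MASKS[b] keeps the low b bits of an int, for b in 0..32.
  // Built once so the packing loop does a table lookup instead of
  // special-casing b == 32 on every call.
  static final int[] MASKS = new int[33];

  static {
    for (int b = 0; b < 32; b++) {
      MASKS[b] = (1 << b) - 1;
    }
    MASKS[32] = -1; // (1 << 32) evaluates to 1 in Java, so set all bits explicitly
  }

  // Packs values into ints, lowest bits first.
  // Assumption of this sketch: bitsPerValue divides 32 (1, 2, 4, 8, 16, 32).
  static int[] pack(int[] values, int bitsPerValue) {
    int perInt = 32 / bitsPerValue;
    int mask = MASKS[bitsPerValue];
    int[] packed = new int[(values.length + perInt - 1) / perInt];
    for (int i = 0; i < values.length; i++) {
      packed[i / perInt] |= (values[i] & mask) << ((i % perInt) * bitsPerValue);
    }
    return packed;
  }

  // Hypothetical stand-in for a batched writeInts(int[], int, int):
  // one bulk put through an IntBuffer view instead of `length` separate writes.
  static byte[] writeInts(int[] src, int offset, int length) {
    ByteBuffer buf = ByteBuffer.allocate(length * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    buf.asIntBuffer().put(src, offset, length);
    return buf.array();
  }

  public static void main(String[] args) {
    int[] packed = pack(new int[] {1, 2, 3, 4}, 8);
    byte[] out = writeInts(packed, 0, packed.length);
    System.out.println(Arrays.toString(out)); // prints [1, 2, 3, 4]
  }
}
```

With four 8-bit values the packed int is `0x04030201`, so the little-endian bulk write emits the bytes in input order.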
   
   ### Benchmarks
   ```
   baseline
   Benchmark                              (bitsPerValue)   Mode  Cnt   Score   Error   Units
   ForUtilEncodeBulkIntsBenchmark.encode               2  thrpt    5  15.331 ± 0.304  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode               4  thrpt    5  12.394 ± 0.034  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode               8  thrpt    5   8.274 ± 0.053  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              12  thrpt    5   4.321 ± 0.009  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              16  thrpt    5   4.409 ± 0.177  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              20  thrpt    5   2.329 ± 0.009  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              24  thrpt    5   2.187 ± 0.032  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              28  thrpt    5   1.959 ± 0.008  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              32  thrpt    5   2.356 ± 0.005  ops/us


   candidate
   Benchmark                              (bitsPerValue)   Mode  Cnt   Score   Error   Units
   ForUtilEncodeBulkIntsBenchmark.encode               2  thrpt    5  19.555 ± 0.073  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode               4  thrpt    5  18.772 ± 0.046  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode               8  thrpt    5  15.424 ± 0.010  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              12  thrpt    5   7.651 ± 0.014  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              16  thrpt    5   9.577 ± 0.045  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              20  thrpt    5   3.980 ± 0.005  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              24  thrpt    5   3.821 ± 0.010  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              28  thrpt    5   4.001 ± 0.002  ops/us
   ForUtilEncodeBulkIntsBenchmark.encode              32  thrpt    5   5.917 ± 0.009  ops/us
   ```
   
   ### Summary of benchmark results
   Benchmark results show significant encode throughput improvements:
    - +28% at 2 bitsPerValue and +51% at 4 bitsPerValue
    - 1.7×–2.2× speedup for 8–28 bits
    - About 2.5× faster at 32 bits
   
   Overall, encoding throughput improved by ~1.8× on average across all tested 
bit widths.
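
The per-width ratios and the average can be recomputed from the scores in the two tables above. The sketch below does that arithmetic; the class name `SpeedupSketch` is hypothetical, and the average is the plain arithmetic mean of the nine candidate/baseline ratios, which is one reasonable reading of "on average" rather than a confirmed definition.

```java
import java.util.Locale;

final class SpeedupSketch {
  // Scores in ops/us, copied from the baseline and candidate tables.
  static final int[] BITS = {2, 4, 8, 12, 16, 20, 24, 28, 32};
  static final double[] BASELINE = {
    15.331, 12.394, 8.274, 4.321, 4.409, 2.329, 2.187, 1.959, 2.356
  };
  static final double[] CANDIDATE = {
    19.555, 18.772, 15.424, 7.651, 9.577, 3.980, 3.821, 4.001, 5.917
  };

  // Throughput ratio candidate / baseline for the i-th bit width.
  static double speedup(int i) {
    return CANDIDATE[i] / BASELINE[i];
  }

  // Arithmetic mean of the per-width ratios.
  static double meanSpeedup() {
    double sum = 0;
    for (int i = 0; i < BITS.length; i++) {
      sum += speedup(i);
    }
    return sum / BITS.length;
  }

  public static void main(String[] args) {
    for (int i = 0; i < BITS.length; i++) {
      System.out.printf(Locale.ROOT, "%2d bits: %.2fx%n", BITS[i], speedup(i));
    }
    System.out.printf(Locale.ROOT, "mean: %.2fx%n", meanSpeedup());
  }
}
```

Under that definition the mean comes out at roughly 1.8×, with the smallest gain at 2 bits and the largest at 32 bits.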
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

