rmuir commented on PR #14031:
URL: https://github.com/apache/lucene/pull/14031#issuecomment-2513295791

   I applied and tested the same approach with the other 2 functions too.
   Cosine was already underweight: it is only unrolled twice due to the
   complexity of the mathematical formula, but the change keeps the float
   functions consistent. We could tidy up the binary ones in a similar fashion
   as a followup for more consistency, but since the JVM can already unroll the
   integer math, they aren't manually unrolled and I expect they are already
   under the limit. The microbenchmarks seem happy, but I assume the real gains
   will show up in macrobenchmarks, where the inlining can help.
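
   For readers unfamiliar with the term, here is a minimal scalar sketch of what "unrolled twice" means for a float dot product. It is not the code from this PR: the class and method names are hypothetical, and it uses plain arrays rather than any vector API. Manual unrolling matters for floats because float addition is not associative, so the JIT cannot reorder a float reduction on its own, while it is free to do so for integer math.

```java
// Hypothetical illustration only; names are not taken from Lucene.
final class UnrollSketch {
  private UnrollSketch() {}

  /** Plain loop: a single accumulator, so every add depends on the previous one. */
  public static float dotProductSimple(float[] a, float[] b) {
    float acc = 0f;
    for (int i = 0; i < a.length; i++) {
      acc += a[i] * b[i];
    }
    return acc;
  }

  /** Unrolled twice: two independent accumulators, combined at the end. */
  public static float dotProductUnrolled2(float[] a, float[] b) {
    float acc0 = 0f;
    float acc1 = 0f;
    int i = 0;
    int bound = a.length & ~1; // largest even number <= a.length
    for (; i < bound; i += 2) {
      acc0 += a[i] * b[i];
      acc1 += a[i + 1] * b[i + 1];
    }
    // Tail: at most one leftover element when the length is odd.
    for (; i < a.length; i++) {
      acc0 += a[i] * b[i];
    }
    return acc0 + acc1;
  }

  public static void main(String[] args) {
    float[] a = {1f, 2f, 3f, 4f, 5f};
    float[] b = {5f, 4f, 3f, 2f, 1f};
    System.out.println(dotProductSimple(a, b));
    System.out.println(dotProductUnrolled2(a, b));
  }
}
```

   The trade-off is the one the comment describes: each extra accumulator adds bytecode to the method body, and a larger body is less likely to be inlined by the JIT. In general the two variants can also differ in the last bits of the result, since they add the products in a different order.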
   
   ```
   Before:
   Benchmark                                  (size)   Mode  Cnt   Score   Error   Units "body" size
   VectorUtilBenchmark.floatCosineVector        1024  thrpt   75   8.216 ± 0.026  ops/us 345 bytes
   VectorUtilBenchmark.floatDotProductVector    1024  thrpt   75  12.466 ± 0.100  ops/us 355 bytes
   VectorUtilBenchmark.floatSquareVector        1024  thrpt   75  11.986 ± 0.074  ops/us 400 bytes
   
   After:
   Benchmark                                  (size)   Mode  Cnt   Score   Error   Units "body" size
   VectorUtilBenchmark.floatCosineVector        1024  thrpt   75   8.377 ± 0.040  ops/us 320 bytes
   VectorUtilBenchmark.floatDotProductVector    1024  thrpt   75  12.917 ± 0.113  ops/us 302 bytes
   VectorUtilBenchmark.floatSquareVector        1024  thrpt   75  11.965 ± 0.089  ops/us 347 bytes
   ```
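
   To put the numbers above in perspective, the following sketch computes the relative throughput change for each benchmark and checks whether the before/after scores differ by more than their reported errors. The helper names are hypothetical, and treating "Score ± Error" as a simple interval is an assumption made here for a rough reading, not a statistical test.

```java
// Hypothetical helper for reading the table; not part of the benchmark suite.
final class BenchDelta {
  private BenchDelta() {}

  /** Relative change from before to after, in percent. */
  public static double percentChange(double before, double after) {
    return (after - before) / before * 100.0;
  }

  /** True if the intervals [score - err, score + err] do not overlap. */
  public static boolean separated(double before, double errBefore,
                                  double after, double errAfter) {
    double lowHi = Math.min(before + errBefore, after + errAfter);
    double highLo = Math.max(before - errBefore, after - errAfter);
    return highLo > lowHi;
  }

  public static void main(String[] args) {
    // Rows copied from the table: name, before, error, after, error.
    String[] names = {"floatCosineVector", "floatDotProductVector", "floatSquareVector"};
    double[][] rows = {
      {8.216, 0.026, 8.377, 0.040},
      {12.466, 0.100, 12.917, 0.113},
      {11.986, 0.074, 11.965, 0.089},
    };
    for (int i = 0; i < rows.length; i++) {
      double[] r = rows[i];
      System.out.printf("%-24s %+.2f%% separated=%b%n",
          names[i], percentChange(r[0], r[2]), separated(r[0], r[1], r[2], r[3]));
    }
  }
}
```

   Read this way, cosine and dot product improve by roughly 2.0% and 3.6% with non-overlapping intervals, while the square distance change (about -0.2%) is within the reported error. The "body" size drops by 25, 53, and 53 bytes respectively.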


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

