ChrisHegarty commented on PR #12731: URL: https://github.com/apache/lucene/pull/12731#issuecomment-1783869078
Ha! So just removing the overly aggressive unrolling in cosine improves things. The check on FMA is nice; I had similar thoughts (you just beat me to it!), and it inlines nicely. I also agree that we don't want to use FMA on ARM: it performs 10-15% worse on my M2.

Sanity results from my Rocket Lake:

main:
```
VectorUtilBenchmark.floatCosineScalar      1024  thrpt  5   0.845 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector      1024  thrpt  5   8.885 ± 0.005  ops/us
VectorUtilBenchmark.floatDotProductScalar  1024  thrpt  5   3.406 ± 0.018  ops/us
VectorUtilBenchmark.floatDotProductVector  1024  thrpt  5  26.168 ± 0.009  ops/us
VectorUtilBenchmark.floatSquareScalar      1024  thrpt  5   2.549 ± 0.005  ops/us
VectorUtilBenchmark.floatSquareVector      1024  thrpt  5  19.283 ± 0.001  ops/us
```

Robert's branch:
```
VectorUtilBenchmark.floatCosineScalar      1024  thrpt  5   0.845 ± 0.003  ops/us
VectorUtilBenchmark.floatCosineVector      1024  thrpt  5  14.636 ± 0.016  ops/us
VectorUtilBenchmark.floatDotProductScalar  1024  thrpt  5   3.400 ± 0.083  ops/us
VectorUtilBenchmark.floatDotProductVector  1024  thrpt  5  27.265 ± 0.065  ops/us
VectorUtilBenchmark.floatSquareScalar      1024  thrpt  5   2.548 ± 0.012  ops/us
VectorUtilBenchmark.floatSquareVector      1024  thrpt  5  25.529 ± 0.207  ops/us
```
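For readers following along, here is a rough scalar sketch of what a "check on FMA" that inlines well might look like. It is not the code from the PR, which works on the Panama Vector API. The class name `FmaSketch`, the `sketch.useFma` system property, and the helper names are all hypothetical. The idea is only that a `static final` flag lets the JIT fold the branch away, so the helper costs nothing once compiled, and the loop is left without manual unrolling.

```java
// Hypothetical illustration only; names and the property key are assumptions.
final class FmaSketch {
  // Assumed switch: a real implementation would derive this from the CPU
  // architecture and JVM capabilities (e.g. off on ARM). Here it is read
  // once from a system property and defaults to false.
  static final boolean USE_FMA = Boolean.getBoolean("sketch.useFma");

  // Because USE_FMA is a static final constant, the JIT can treat the
  // ternary as a constant branch and inline whichever side is selected.
  static float fma(float a, float b, float c) {
    return USE_FMA ? Math.fma(a, b, c) : a * b + c;
  }

  // Plain scalar cosine similarity with no manual unrolling.
  static float cosine(float[] a, float[] b) {
    float sum = 0f;
    float norm1 = 0f;
    float norm2 = 0f;
    for (int i = 0; i < a.length; i++) {
      sum = fma(a[i], b[i], sum);
      norm1 = fma(a[i], a[i], norm1);
      norm2 = fma(b[i], b[i], norm2);
    }
    return (float) (sum / Math.sqrt((double) norm1 * (double) norm2));
  }
}
```

Running with `-Dsketch.useFma=true` would switch the helper to `Math.fma`; whether that is a win depends on the hardware, as the M2 observation above suggests.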