rmuir commented on PR #12787: URL: https://github.com/apache/lucene/pull/12787#issuecomment-1803343093
When I run `make PATCH_BRANCH=rmuir:microbenchmark_ec2` we just see no differences, but it demonstrates how it works (sorry: no speedups in this branch!).

It spins up/tears down a `lucene-jmh` cloudformation stack with all the instances, so things stay organized in your account. If something goes wrong with the script, just delete the entire stack from the AWS console yourself if you want. For automation or concurrent runs, change the stack name to e.g. an env var holding the job ID; that also gives good separation without the overhead and limits of separate VPCs.

There are a few minutes of overhead spinning up and configuring the machines, but at least it all happens in parallel, and it is mostly dominated by `./gradlew assemble`, which takes the longest. Actually running all the JMH benchmarks takes 30 minutes, it is what it is. That is not the overhead of this script. If you want to run a specific one, just use `JMH_ARGS` (see the README) and you'll get results much faster.

Here is the breakdown of time spent:

|Task|Time|
|-------|-------|
|Run benchmark|1502.06s|
|Assemble Sources|285.18s|
|Reboot machine|33.89s|
|Create cloudformation stack|26.70s|
|Install packages|21.36s|
|Download JDK|20.57s|
|Checkout main|10.46s|
|Checkout patch|9.39s|
|Wait for connection|7.36s|
|Gather facts|2.12s|
|Write Report|2.02s|
|Configure kernel|1.17s|
|Lookup default VPC|0.98s|
|Gather instance details|0.96s|
|Read main results|0.67s|
|Configure JDK|0.59s|
|Configure Gradle|0.46s|
|Read patch results|0.35s|
|Add instances to inventory|0.24s|
|Create combined report|0.21s|

A full benchmark run costs less than $1 USD, I am pretty sure.

Output format is "big ass PR comment" format. I know it's not the best, but I don't feel like parsing json.
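To put the timing breakdown above in proportion, here is a small Python sketch (not part of the script in this PR) that sums the reported task times and shows each task's share. The numbers are copied from the table; the names `TIMINGS` and `summarize` are made up for illustration, and treating the tasks as one sequential timeline is an assumption.

```python
# Task timings copied from the breakdown above, in seconds.
TIMINGS = {
    "Run benchmark": 1502.06,
    "Assemble Sources": 285.18,
    "Reboot machine": 33.89,
    "Create cloudformation stack": 26.70,
    "Install packages": 21.36,
    "Download JDK": 20.57,
    "Checkout main": 10.46,
    "Checkout patch": 9.39,
    "Wait for connection": 7.36,
    "Gather facts": 2.12,
    "Write Report": 2.02,
    "Configure kernel": 1.17,
    "Lookup default VPC": 0.98,
    "Gather instance details": 0.96,
    "Read main results": 0.67,
    "Configure JDK": 0.59,
    "Configure Gradle": 0.46,
    "Read patch results": 0.35,
    "Add instances to inventory": 0.24,
    "Create combined report": 0.21,
}


def summarize(timings):
    """Return (total_seconds, rows), rows sorted by time, largest first.

    Each row is (task, seconds, percent_of_total). Assumption: tasks are
    summed as if they formed a single sequential timeline.
    """
    total = sum(timings.values())
    ordered = sorted(timings.items(), key=lambda kv: kv[1], reverse=True)
    rows = [(task, secs, 100.0 * secs / total) for task, secs in ordered]
    return total, rows


if __name__ == "__main__":
    total, rows = summarize(TIMINGS)
    print(f"sum of task times: {total:.2f}s ({total / 60:.1f} min)")
    for task, secs, pct in rows[:3]:
        print(f"{task:<28} {secs:>8.2f}s {pct:5.1f}%")
```

With these figures the benchmark run itself accounts for roughly 78% of the summed time and `./gradlew assemble` for about 15%, which matches the point that the script's own overhead is small.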
Output run against this branch:

cascadelake: `['0', 'GenuineIntel', 'Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz', '1', 'GenuineIntel', 'Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.932 ± 0.002  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   6.390 ± 0.014  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.506 ± 0.005  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  13.929 ± 0.006  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.920 ± 0.030  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  11.088 ± 0.124  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.639 ± 0.007  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   8.896 ± 0.070  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.388 ± 0.046  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  13.770 ± 0.079  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.625 ± 0.011  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  12.385 ± 0.131  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.932 ± 0.002  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   6.396 ± 0.008  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.505 ± 0.005  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  13.759 ± 0.246  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.928 ± 0.005  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  11.137 ± 0.126  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.638 ± 0.007  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   9.473 ± 0.203  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.385 ± 0.046  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  13.900 ± 0.114  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.629 ± 0.002  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  12.676 ± 0.262  ops/us
```

graviton2: `['0', '1']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.808 ± 0.001  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   1.254 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.386 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   2.309 ± 0.016  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.909 ± 0.002  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   1.861 ± 0.002  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.569 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   5.376 ± 0.054  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   2.072 ± 0.070  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75   6.409 ± 0.172  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   1.752 ± 0.001  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75   6.141 ± 0.038  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.808 ± 0.001  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   1.254 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.386 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   2.327 ± 0.002  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.911 ± 0.004  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   1.862 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.570 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   5.388 ± 0.049  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   2.097 ± 0.036  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75   6.332 ± 0.113  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   1.752 ± 0.001  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75   6.135 ± 0.042  ops/us
```

graviton3: `['0', '1']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.842 ± 0.001  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   4.421 ± 0.009  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.370 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   6.954 ± 0.011  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   2.466 ± 0.026  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   5.848 ± 0.037  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.422 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   6.272 ± 0.025  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.739 ± 0.057  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75   9.828 ± 0.224  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.182 ± 0.045  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75   9.125 ± 0.045  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.842 ± 0.001  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   4.423 ± 0.002  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.370 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   6.960 ± 0.007  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   2.471 ± 0.020  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   5.849 ± 0.011  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.421 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   6.264 ± 0.029  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.755 ± 0.003  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75   9.877 ± 0.250  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.207 ± 0.012  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75   9.113 ± 0.044  ops/us
```

haswell: `['0', 'GenuineIntel', 'Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz', '1', 'GenuineIntel', 'Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.728 ± 0.008  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   3.586 ± 0.008  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.011 ± 0.004  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   7.915 ± 0.011  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.539 ± 0.003  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   6.939 ± 0.009  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.308 ± 0.005  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   7.453 ± 0.067  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   2.154 ± 0.046  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  12.245 ± 0.117  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.351 ± 0.061  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  11.299 ± 0.219  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.730 ± 0.002  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   3.586 ± 0.007  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.012 ± 0.004  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   7.919 ± 0.009  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.539 ± 0.003  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   6.941 ± 0.005  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.305 ± 0.011  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   7.488 ± 0.068  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   2.153 ± 0.047  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  12.292 ± 0.101  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.368 ± 0.001  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  11.387 ± 0.202  ops/us
```

icelake: `['0', 'GenuineIntel', 'Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz', '1', 'GenuineIntel', 'Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.842 ± 0.004  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   7.346 ± 0.008  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.585 ± 0.007  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  16.481 ± 0.030  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.807 ± 0.010  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  14.150 ± 0.077  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.524 ± 0.004  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   9.790 ± 0.354  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.305 ± 0.021  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  13.135 ± 0.152  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.243 ± 0.005  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  11.529 ± 0.291  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.843 ± 0.004  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   7.351 ± 0.014  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.586 ± 0.004  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  16.410 ± 0.071  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.809 ± 0.008  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  14.090 ± 0.067  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.525 ± 0.005  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   9.872 ± 0.346  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.304 ± 0.020  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  13.162 ± 0.151  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.239 ± 0.005  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  11.574 ± 0.345  ops/us
```

sapphirerapids: `['0', 'GenuineIntel', 'Intel(R) Xeon(R) Platinum 8488C', '1', 'GenuineIntel', 'Intel(R) Xeon(R) Platinum 8488C']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   1.134 ± 0.001  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   8.877 ± 0.004  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.826 ± 0.012  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  18.573 ± 0.009  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   2.701 ± 0.050  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  16.520 ± 0.294  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.702 ± 0.004  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75  14.571 ± 0.251  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.715 ± 0.018  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  22.118 ± 0.619  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.933 ± 0.011  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  21.969 ± 0.103  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   1.121 ± 0.001  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   8.738 ± 0.042  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   2.797 ± 0.025  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  18.222 ± 0.009  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   2.663 ± 0.003  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  16.051 ± 0.295  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.680 ± 0.014  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75  14.678 ± 0.255  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.473 ± 0.297  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  18.335 ± 0.435  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.518 ± 0.001  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  16.154 ± 0.268  ops/us
```

zen2: `['0', 'AuthenticAMD', 'AMD EPYC 7R32', '1', 'AuthenticAMD', 'AMD EPYC 7R32']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.497 ± 0.003  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   3.771 ± 0.012  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   1.543 ± 0.009  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   9.977 ± 0.004  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.276 ± 0.002  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   9.034 ± 0.024  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.184 ± 0.007  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   8.380 ± 0.025  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.089 ± 0.022  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  15.269 ± 0.401  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.449 ± 0.031  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  15.326 ± 0.297  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.493 ± 0.003  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   3.769 ± 0.022  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   1.548 ± 0.008  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   9.918 ± 0.038  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.273 ± 0.004  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   9.008 ± 0.026  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.187 ± 0.017  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75   8.411 ± 0.025  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.086 ± 0.037  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  15.775 ± 0.494  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   2.482 ± 0.009  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  14.919 ± 0.081  ops/us
```

zen3: `['0', 'AuthenticAMD', 'AMD EPYC 7R13 Processor', '1', 'AuthenticAMD', 'AMD EPYC 7R13 Processor']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.785 ± 0.004  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   5.453 ± 0.029  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   1.579 ± 0.001  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   9.803 ± 0.010  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.269 ± 0.006  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   9.349 ± 0.012  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.346 ± 0.008  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75  10.494 ± 0.060  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.395 ± 0.019  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  16.544 ± 0.326  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.004 ± 0.002  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  16.070 ± 0.233  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.779 ± 0.010  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   5.463 ± 0.004  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   1.578 ± 0.002  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15   9.790 ± 0.037  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.271 ± 0.002  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15   9.363 ± 0.010  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.347 ± 0.006  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75  10.492 ± 0.033  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.400 ± 0.015  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  16.568 ± 0.405  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.007 ± 0.001  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  16.719 ± 0.441  ops/us
```

zen4: `['0', 'AuthenticAMD', 'AMD EPYC 9R14', '1', 'AuthenticAMD', 'AMD EPYC 9R14']`

main
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.668 ± 0.003  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   8.784 ± 0.094  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   1.856 ± 0.002  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  22.390 ± 0.071  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.542 ± 0.001  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  18.104 ± 0.055  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.763 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75  13.427 ± 0.146  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.579 ± 0.014  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  16.396 ± 0.477  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.561 ± 0.004  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  16.581 ± 0.494  ops/us
```
patch
```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
VectorUtilBenchmark.binaryCosineScalar        1024  thrpt   15   0.669 ± 0.004  ops/us
VectorUtilBenchmark.binaryCosineVector        1024  thrpt   15   8.773 ± 0.092  ops/us
VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt   15   1.855 ± 0.003  ops/us
VectorUtilBenchmark.binaryDotProductVector    1024  thrpt   15  22.408 ± 0.044  ops/us
VectorUtilBenchmark.binarySquareScalar        1024  thrpt   15   1.540 ± 0.001  ops/us
VectorUtilBenchmark.binarySquareVector        1024  thrpt   15  18.165 ± 0.130  ops/us
VectorUtilBenchmark.floatCosineScalar         1024  thrpt   15   1.763 ± 0.001  ops/us
VectorUtilBenchmark.floatCosineVector         1024  thrpt   75  13.461 ± 0.147  ops/us
VectorUtilBenchmark.floatDotProductScalar     1024  thrpt   15   3.580 ± 0.014  ops/us
VectorUtilBenchmark.floatDotProductVector     1024  thrpt   75  15.989 ± 0.319  ops/us
VectorUtilBenchmark.floatSquareScalar         1024  thrpt   15   3.562 ± 0.002  ops/us
VectorUtilBenchmark.floatSquareVector         1024  thrpt   75  16.091 ± 0.477  ops/us
```
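For anyone who wants to compare a `main` block against its `patch` block without reading the rows side by side, the JMH text format can be parsed directly. This is a rough Python sketch, not something the script in this PR does; the names `parse_jmh` and `compare` are hypothetical, and using "do the score ± error ranges overlap" as a noise check is a crude heuristic I am assuming for illustration. The sample rows are two zen2 results from the output above.

```python
import re

# One JMH result row: name, size, mode, count, score, "±", error, units.
# The header row does not match because "(size)" is not a number.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<size>\d+)\s+(?P<mode>\S+)\s+(?P<cnt>\d+)\s+"
    r"(?P<score>[\d.]+)\s+±\s+(?P<error>[\d.]+)\s+(?P<units>\S+)$"
)


def parse_jmh(text):
    """Map benchmark name -> (score, error) for every result row in text."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            results[match["name"]] = (float(match["score"]), float(match["error"]))
    return results


def compare(main_text, patch_text):
    """Return (name, percent_change, ranges_overlap) per shared benchmark."""
    main, patch = parse_jmh(main_text), parse_jmh(patch_text)
    rows = []
    for name, (main_score, main_err) in main.items():
        if name not in patch:
            continue
        patch_score, patch_err = patch[name]
        change = 100.0 * (patch_score - main_score) / main_score
        # Heuristic (assumption): treat overlapping ranges as "no difference".
        overlap = abs(patch_score - main_score) <= main_err + patch_err
        rows.append((name, change, overlap))
    return rows


MAIN = """\
Benchmark (size) Mode Cnt Score Error Units
VectorUtilBenchmark.binaryDotProductVector 1024 thrpt 15 9.977 ± 0.004 ops/us
VectorUtilBenchmark.floatDotProductVector 1024 thrpt 75 15.269 ± 0.401 ops/us
"""

PATCH = """\
Benchmark (size) Mode Cnt Score Error Units
VectorUtilBenchmark.binaryDotProductVector 1024 thrpt 15 9.918 ± 0.038 ops/us
VectorUtilBenchmark.floatDotProductVector 1024 thrpt 75 15.775 ± 0.494 ops/us
"""

if __name__ == "__main__":
    for name, change, overlap in compare(MAIN, PATCH):
        flag = "within error" if overlap else "outside error"
        print(f"{name:<44} {change:+6.2f}%  {flag}")
```

Note that a row landing outside the error range does not by itself mean the patch changed anything: here `binaryDotProductVector` moves by well under 1% with very tight error bars, which can just as easily be run-to-run variation between two instances.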