benwtrent opened a new issue, #12621: URL: https://github.com/apache/lucene/issues/12621
### Description

While testing and digging around, I noticed that our float comparisons are way faster than byte on my MacBook (M1) and pretty much the same as our byte comparisons on a GCP Intel Sapphire Rapids CPU.

This seems counter-intuitive to me. I would expect Panama to be able to do more `byte` operations per cycle than `float`.

My guess is the intrinsics are weird? Panama Vector just doesn't support or detect the required operations?

Here are two benchmark results using @rmuir's helpful vectorbench project:

MacBook (Apple Silicon [128bits], JDK21):
```
FloatDotProductBenchmark.dotProductNew     768  thrpt  5  21.781 ± 0.254  ops/us
FloatDotProductBenchmark.dotProductNew    1024  thrpt  5  15.091 ± 0.217  ops/us
BinaryDotProductBenchmark.dotProductNew    768  thrpt  5   8.041 ± 0.108  ops/us
BinaryDotProductBenchmark.dotProductNew   1024  thrpt  5   6.085 ± 0.133  ops/us
```

GCP (Intel Sapphire Rapids [avx512], JDK21):
```
FloatDotProductBenchmark.dotProductNew     768  thrpt  5  20.169 ± 0.385  ops/us
FloatDotProductBenchmark.dotProductNew    1024  thrpt  5  18.334 ± 0.180  ops/us
BinaryDotProductBenchmark.dotProductNew    768  thrpt  5  19.686 ± 0.050  ops/us
BinaryDotProductBenchmark.dotProductNew   1024  thrpt  5  14.934 ± 0.014  ops/us
```

<details>
<summary>cpu-flags</summary>

```
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 arat avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid cldemote movdiri movdir64b fsrm md_clear
serialize arch_capabilities
```

</details>
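
For reference, here is a rough sketch of the expectation described above. It is not the Lucene or vectorbench code, and `DotProductSketch` and its method names are hypothetical. It counts how many elements of each type fit in one vector register and shows plain scalar versions of the two dot products. One possible factor, which is an assumption and not something measured here, is that byte products have to be widened before they are accumulated, so the byte path may not get the full 16 (128-bit) or 64 (512-bit) lanes per multiply-add.

```java
public class DotProductSketch {

    /** Number of elements of the given width (in bytes) that fit in one vector register. */
    static int lanes(int vectorBits, int elementBytes) {
        return vectorBits / (elementBytes * 8);
    }

    /** Scalar float dot product: the multiply and the accumulate both stay in float. */
    static float dotProduct(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Scalar byte dot product: each product is widened to int before it is accumulated. */
    static int dotProduct(byte[] a, byte[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            // Java promotes both byte operands to int, so 127 * 127 does not overflow.
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // 128 bits matches the Apple Silicon run, 512 bits the avx512 run.
        for (int bits : new int[] {128, 512}) {
            System.out.printf(
                "%d-bit register: %d float lanes, %d byte lanes, %d int lanes%n",
                bits,
                lanes(bits, Float.BYTES),
                lanes(bits, Byte.BYTES),
                lanes(bits, Integer.BYTES));
        }
    }
}
```

If widening is the limiting factor, the effective width for bytes would be closer to the int lane count than the byte lane count, which would put it roughly on par with float. Checking the generated assembly would be needed to confirm that.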