rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1762068277
I compiled the code and ran it easily, just `git clone + make`. You do have to run it as root to get the useful output, so I took a risk on my machine:

```
think:avx-turbo[master]$ sudo ./avx-turbo
CPUID highest leaf    : [16h]
Running as root       : [YES]
MSR reads supported   : [YES]
CPU pinning enabled   : [YES]
CPU supports zeroupper: [YES]
CPU supports AVX2     : [YES]
CPU supports AVX-512F : [NO ]
CPU supports AVX-512VL: [NO ]
CPU supports AVX-512BW: [NO ]
CPU supports AVX-512CD: [NO ]
cpuid = eax = 2, ebx = 208, ecx = 0, edx = 0
cpu: family = 6, model = 78, stepping = 3
tsc_freq = 2496.0 MHz (from cpuid leaf 0x15)
CPU brand string: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
4 available CPUs: [0, 1, 2, 3]
2 physical cores: [0, 1]
Will test up to 2 CPUs
Cores | ID                | Description                      | OVRLP3 | Mops | A/M-ratio | A/M-MHz | M/tsc-ratio
1     | pause_only        | pause instruction                | 1.000 | 2116 | 1.20 | 2993 | 0.98
1     | ucomis_clean      | scalar ucomis (w/ vzeroupper)    | 1.000 | 742 | 1.20 | 2991 | 0.98
1     | ucomis_dirty      | scalar ucomis (no vzeroupper)    | 1.000 | 717 | 1.18 | 2934 | 0.98
1     | scalar_iadd       | Scalar integer adds              | 1.000 | 2993 | 1.20 | 2989 | 0.97
1     | avx128_iadd       | 128-bit integer serial adds      | 1.000 | 2993 | 1.20 | 2990 | 0.98
1     | avx256_iadd       | 256-bit integer serial adds      | 1.000 | 2993 | 1.20 | 2990 | 0.97
1     | avx128_iadd_t     | 128-bit integer parallel adds    | 1.000 | 8973 | 1.19 | 2983 | 0.96
1     | avx256_iadd_t     | 256-bit integer parallel adds    | 1.000 | 8977 | 1.20 | 2995 | 0.99
1     | avx128_xor_zero   | 128-bit zeroing xor              | 1.000 | 11850 | 1.20 | 2995 | 0.99
1     | avx256_xor_zero   | 256-bit zeroing xor              | 1.000 | 11857 | 1.20 | 2995 | 0.99
1     | avx128_mov_sparse | 128-bit reg-reg mov              | 1.000 | 2970 | 1.20 | 2988 | 0.97
1     | avx256_mov_sparse | 256-bit reg-reg mov              | 1.000 | 2993 | 1.20 | 2995 | 0.99
1     | avx128_vshift     | 128-bit variable shift (vpsrlvd) | 1.000 | 2993 | 1.20 | 2994 | 0.99
1     | avx256_vshift     | 256-bit variable shift (vpsrlvd) | 1.000 | 2993 | 1.20 | 2991 | 0.98
1     | avx128_vshift_t   | 128-bit variable shift (vpsrlvd) | 1.000 | 5985 | 1.20 | 2993 | 0.98
1     | avx256_vshift_t   | 256-bit variable shift (vpsrlvd) | 1.000 | 5986 | 1.20 | 2994 | 0.99
1     | avx128_imul       | 128-bit integer muls (vpmuldq)   | 1.000 | 599 | 1.20 | 2992 | 0.98
1     | avx256_imul       | 256-bit integer muls (vpmuldq)   | 1.000 | 599 | 1.20 | 2993 | 0.98
1     | avx128_fma_sparse | 128-bit 64-bit sparse FMAs       | 1.000 | 2993 | 1.20 | 2992 | 0.98
1     | avx256_fma_sparse | 256-bit 64-bit sparse FMAs       | 1.000 | 2993 | 1.20 | 2991 | 0.98
1     | avx128_fma        | 128-bit serial DP FMAs           | 1.000 | 748 | 1.20 | 2986 | 0.98
1     | avx256_fma        | 256-bit serial DP FMAs           | 1.000 | 748 | 1.20 | 2991 | 0.97
1     | avx128_fma_t      | 128-bit parallel DP FMAs         | 1.000 | 5986 | 1.20 | 2993 | 0.98
1     | avx256_fma_t      | 256-bit parallel DP FMAs         | 1.000 | 5986 | 1.20 | 2995 | 0.99

Cores | ID                | Description                      | OVRLP3 | Mops | A/M-ratio | A/M-MHz | M/tsc-ratio
2     | pause_only        | pause instruction                | 1.000 | 1996, 1994 | 1.16, 1.16 | 2884, 2886 | 1.00, 1.00
2     | ucomis_clean      | scalar ucomis (w/ vzeroupper)    | 1.000 | 717, 717 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | ucomis_dirty      | scalar ucomis (no vzeroupper)    | 1.000 | 717, 717 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | scalar_iadd       | Scalar integer adds              | 1.000 | 2865, 2888 | 1.16, 1.16 | 2895, 2896 | 1.00, 1.00
2     | avx128_iadd       | 128-bit integer serial adds      | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_iadd       | 256-bit integer serial adds      | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_iadd_t     | 128-bit integer parallel adds    | 1.000 | 8678, 8679 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_iadd_t     | 256-bit integer parallel adds    | 1.000 | 8680, 8681 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_xor_zero   | 128-bit zeroing xor              | 1.000 | 11456, 11460 | 1.16, 1.16 | 2896, 2895 | 1.00, 1.00
2     | avx256_xor_zero   | 256-bit zeroing xor              | 1.000 | 11457, 11459 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_mov_sparse | 128-bit reg-reg mov              | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_mov_sparse | 256-bit reg-reg mov              | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_vshift     | 128-bit variable shift (vpsrlvd) | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_vshift     | 256-bit variable shift (vpsrlvd) | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_vshift_t   | 128-bit variable shift (vpsrlvd) | 1.000 | 5787, 5787 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_vshift_t   | 256-bit variable shift (vpsrlvd) | 1.000 | 5786, 5787 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_imul       | 128-bit integer muls (vpmuldq)   | 1.000 | 579, 579 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_imul       | 256-bit integer muls (vpmuldq)   | 1.000 | 579, 579 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_fma_sparse | 128-bit 64-bit sparse FMAs       | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_fma_sparse | 256-bit 64-bit sparse FMAs       | 1.000 | 2893, 2893 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_fma        | 128-bit serial DP FMAs           | 1.000 | 723, 723 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx256_fma        | 256-bit serial DP FMAs           | 1.000 | 723, 723 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
2     | avx128_fma_t      | 128-bit parallel DP FMAs         | 1.000 | 5785, 5787 | 1.16, 1.16 | 2896, 2896 | 1.00, 1.00
2     | avx256_fma_t      | 256-bit parallel DP FMAs         | 1.000 | 5786, 5787 | 1.16, 1.16 | 2895, 2895 | 1.00, 1.00
```
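For anyone who wants to compare the `A/M-MHz` column across tests (for example, to see whether the 256-bit tests ran at a different clock than the scalar ones), here is a rough Python sketch. It is not part of avx-turbo: `parse_rows` and `mhz_relative_to` are hypothetical helper names, the parser assumes the pipe-separated layout shown in the output above, and the sample rows are copied from the 1-core table.

```python
# Sample rows copied from the 1-core table in the avx-turbo output above.
SAMPLE = """\
Cores | ID | Description | OVRLP3 | Mops | A/M-ratio | A/M-MHz | M/tsc-ratio
1 | scalar_iadd | Scalar integer adds | 1.000 | 2993 | 1.20 | 2989 | 0.97
1 | avx256_iadd | 256-bit integer serial adds | 1.000 | 2993 | 1.20 | 2990 | 0.97
1 | avx256_fma_t | 256-bit parallel DP FMAs | 1.000 | 5986 | 1.20 | 2995 | 0.99
"""


def parse_rows(text):
    """Parse pipe-separated result rows; header and non-table lines are skipped."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # A result row has exactly 8 columns; the header starts with "Cores".
        if len(cells) != 8 or cells[0] == "Cores":
            continue
        rows.append({
            "cores": int(cells[0]),
            "id": cells[1],
            # Multi-core rows list one comma-separated value per core.
            "mhz": [int(v) for v in cells[6].split(",")],
        })
    return rows


def mhz_relative_to(rows, baseline_id):
    """Return each test's mean A/M-MHz divided by the baseline test's mean."""
    base = next(r for r in rows if r["id"] == baseline_id)
    base_mhz = sum(base["mhz"]) / len(base["mhz"])
    return {r["id"]: (sum(r["mhz"]) / len(r["mhz"])) / base_mhz for r in rows}


if __name__ == "__main__":
    ratios = mhz_relative_to(parse_rows(SAMPLE), "scalar_iadd")
    for test_id, ratio in ratios.items():
        print(f"{test_id:14s} {ratio:.3f}")
```

A ratio close to 1.000 means that test ran at about the same measured frequency as the chosen baseline. When the output contains both the 1-core and 2-core tables, filter the rows by `cores` before calling `mhz_relative_to`, since each test ID appears once per table.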