[Bug target/115025] [14/15 regression] prime computation performance regression, x86, between gcc-14 and gcc-13 on skylake platform

haochen.jiang at intel dot com via Gcc-bugs Thu, 16 May 2024 01:54:21 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115025


--- Comment #5 from Haochen Jiang <haochen.jiang at intel dot com> ---
My guess is that the difference comes from the primality-test loop:

        for (i = 5; i < max; i += 6)
                if ((n % i == 0) || (n % (i + 2) == 0))
                        return 0;

GCC 13 peels the first iteration, (n % 5 == 0) || (n % 7 == 0), out of the
loop. With the divisors known as constants, it can use imul+cmp instead of div.

On current trunk, however, that iteration still uses div, which is slower.
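
To illustrate what imul+cmp buys for the peeled iteration, here is a rough
hand-written sketch. It assumes n is a 32-bit unsigned value, which may not
match the type in the original test case, and the helper names
(divisible_by_5, divisible_by_7, has_factor_peeled) are made up for this
example. It is not the code GCC emits, only the same general idea: for an odd
constant d, n is divisible by d exactly when n times the inverse of d modulo
2^32 is at most UINT32_MAX / d.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from the bug report.
   Assumes n is a 32-bit unsigned value. */

/* 0xCCCCCCCD is the multiplicative inverse of 5 modulo 2^32. */
static int divisible_by_5(uint32_t n)
{
    return (uint32_t)(n * UINT32_C(0xCCCCCCCD)) <= UINT32_MAX / 5;
}

/* 0xB6DB6DB7 is the multiplicative inverse of 7 modulo 2^32. */
static int divisible_by_7(uint32_t n)
{
    return (uint32_t)(n * UINT32_C(0xB6DB6DB7)) <= UINT32_MAX / 7;
}

/* The loop from the report with its first iteration (i = 5) peeled by hand.
   Returns 1 if some i or i + 2 in the scanned range divides n, else 0. */
static int has_factor_peeled(uint32_t n, uint32_t max)
{
    if (5 < max) {
        /* Peeled iteration: constant divisors, so multiply + compare. */
        if (divisible_by_5(n) || divisible_by_7(n))
            return 1;
        /* Remaining iterations: i is variable, so a real division is used. */
        for (uint32_t i = 11; i < max; i += 6)
            if ((n % i == 0) || (n % (i + 2) == 0))
                return 1;
    }
    return 0;
}
```

The remaining iterations still need div because i is not a compile-time
constant there, so only the first iteration benefits from this.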

BTW, there is also a codegen regression that does not affect performance: on
current trunk, the sqrt basic blocks are not merged. This increases code size
but has no perf impact.
