[Bug target/56309] conditional moves instead of compare and branch result in almost 2x slower code

ysrumyan at gmail dot com Thu, 06 Aug 2015 02:32:29 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309


--- Comment #33 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
With the current compiler there is no performance difference between the
by-ref and by-val test cases, but if we turn off the if-convert transformation
we get a ~2X speed-up on an Intel(R) Xeon(R) CPU X5670 @ 2.93GHz:

 ./t1.exe
Took 11.55 seconds total.
 ./t1.noifcvt.exe                            
Took 6.51 seconds total.

The test will be attached.
The slowdown is caused by skewed conditional branch probabilities in the loop:

    for (auto rhs_it = rbegin; rhs_it != rend; ++rhs_it) {
            tmp = x*(*rhs_it) + data[i] + carry;
            if (tmp >= imax) {
                    carry = tmp >> numbits;
                    tmp &= imax - 1;
            } else {
                    carry = 0;
            }
            data[i++] = tmp;
    }
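
For readers unfamiliar with if-conversion, the sketch below contrasts the
branchy loop body with a hand-written select form of the kind the transformation
produces. It is only an illustration: the helper names (Step, step_branch,
step_select), the 64-bit unsigned type and the value numbits = 31 are
assumptions, since the attached test is not shown here.

```cpp
#include <cstdint>

// Assumed parameters; the report does not state the real types or values.
constexpr unsigned numbits = 31;
constexpr std::uint64_t imax = std::uint64_t{1} << numbits;

// Hypothetical helper type: the stored digit and the carry for the next step.
struct Step {
    std::uint64_t digit;
    std::uint64_t carry;
};

// Branchy form, mirroring the if/else in the loop body above.
inline Step step_branch(std::uint64_t tmp) {
    std::uint64_t carry;
    if (tmp >= imax) {
        carry = tmp >> numbits;
        tmp &= imax - 1;
    } else {
        carry = 0;
    }
    return {tmp, carry};
}

// If-converted form: both arms are computed unconditionally and the results
// are chosen with selects, which a compiler may lower to conditional moves.
inline Step step_select(std::uint64_t tmp) {
    const bool overflow = tmp >= imax;
    const std::uint64_t carry_taken = tmp >> numbits;
    const std::uint64_t digit_taken = tmp & (imax - 1);
    return {overflow ? digit_taken : tmp,
            overflow ? carry_taken : std::uint64_t{0}};
}
```

Both functions return the same values; they differ only in whether the choice
is a control dependence (branch) or a data dependence (select).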

Only 2.5% of the conditional branches are not taken, since imax represents
MAX_INT32, and the profile estimation phase needs to be fixed to assign an
unlikely probability to integral comparisons with huge constants.
To cope with this issue we may implement Jakub's approach and design an oracle
for if-conversion profitability, which simply computes the region (loop) costs
for the if-converted and non-if-converted regions (the cost of all acyclic
paths). With such an approach we can see that, for the fixed profile, hammock
predication is not profitable, and if vectorization is not successful the loop
must be restored to its original form.
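
A minimal sketch of what such a profitability oracle could compute is given
below. It is not GCC code: the struct, the function names and the
misprediction term are assumptions made for illustration, and any cycle counts
passed in are hypothetical inputs rather than measurements.

```cpp
// Hypothetical per-hammock costs, in abstract cycles.
struct HammockCosts {
    double then_cost;        // cost of the "then" arm
    double else_cost;        // cost of the "else" arm
    double mispredict_cost;  // penalty paid on a mispredicted branch
    double select_cost;      // extra cost of the conditional moves
};

// Branchy region: probability-weighted sum over the two acyclic paths,
// plus the expected misprediction penalty.
double branch_cost(const HammockCosts& c, double p_then, double p_mispredict) {
    return p_then * c.then_cost + (1.0 - p_then) * c.else_cost +
           p_mispredict * c.mispredict_cost;
}

// If-converted region: both arms always execute, followed by the selects.
double ifcvt_cost(const HammockCosts& c) {
    return c.then_cost + c.else_cost + c.select_cost;
}

// The oracle: if-convert only when the predicated form is cheaper.
bool ifcvt_profitable(const HammockCosts& c, double p_then,
                      double p_mispredict) {
    return ifcvt_cost(c) < branch_cost(c, p_then, p_mispredict);
}
```

With the made-up costs {2, 1, 15, 2} and a skewed profile (p_then = 0.975,
p_mispredict = 0.025) the branchy region costs 2.35 against 5 for the
if-converted one, so the oracle rejects if-conversion; with an unpredictable
50/50 branch the same costs give 9 against 5 and it accepts.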