Re: [PUSHED 2/2] ifcvt: Move noce_try_cond_zero_arith last

Jeffrey Law Fri, 26 Dec 2025 11:18:06 -0800


On 12/23/2025 11:54 PM, Andrew Pinski wrote:

On Tue, Dec 23, 2025 at 8:33 PM Jeffrey Law <[email protected]> wrote:


On 12/23/2025 6:44 PM, Andrew Pinski wrote:

I noticed that on x86_64 and aarch64, noce_try_cond_zero_arith
would produce worse code than noce_try_cmove_arith.
So we should do noce_try_cond_zero_arith last instead
of before noce_try_cmove_arith.

Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
Also checked to make sure riscv testcases still work.

gcc/ChangeLog:

       * ifcvt.cc (noce_process_if_block): Move noce_try_cond_zero_arith
       last.

Please no.   We very much want to use condzero_arith rather than cmove
based things -- that would be pretty bad in general for RISC-V.  We
really should dive into why the code isn't as good as we'd like on other
platforms.

I looked and I noticed noce_try_cmove_arith fails for riscv for most
(all?) of the testcases I tried.

We may not have good coverage here. But it was a huge source of
performance issues with gcc-15 -- way too many generalized conditional
moves that should have been condzero style sequences.

I noticed in some cases noce_convert_multiple_sets happened even
before noce_try_cond_zero_arith had a chance to do its thing.
The code generation from noce_convert_multiple_sets is worse for riscv
for sure and noce_convert_multiple_sets happens even before
noce_try_cond_zero_arith and noce_try_cmove_arith could even happen.

Note looking into cases where noce_try_cond_zero_arith fails (and
noce_try_cmove_arith also fails) on riscv I find that the check:
```
   if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx))
     return false;
```
is too restrictive but that is a different story.

But that's capturing the key concept. Namely that we want to target a
conditional zero idiom rather than a conditional move idiom. RISC-V has
instructions for the former. The latter requires a pair of conditional
zeros with opposite polarity on the test *and* an additional instruction
to select one of the outputs from the pair of conditional zeros.
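
To make the distinction concrete, here is a minimal C sketch that models the two Zicond instructions and the two idioms. The function names are made up for illustration; none of this is code from ifcvt.cc, and unsigned types are used only to keep the arithmetic well defined.

```c
#include <stdint.h>

/* Model of czero.eqz rd,rs1,rs2: rd = (rs2 == 0) ? 0 : rs1. */
static inline uint64_t czero_eqz(uint64_t val, uint64_t cond)
{
    return cond == 0 ? 0 : val;
}

/* Model of czero.nez rd,rs1,rs2: rd = (rs2 != 0) ? 0 : rs1. */
static inline uint64_t czero_nez(uint64_t val, uint64_t cond)
{
    return cond != 0 ? 0 : val;
}

/* Conditional zero idiom: a + (cond ? b : 0).
   One czero plus the arithmetic operation itself. */
uint64_t cond_add_czero(uint64_t a, uint64_t b, uint64_t cond)
{
    return a + czero_eqz(b, cond);
}

/* Generalized conditional move: cond ? x : y.
   Two czeros of opposite polarity plus one instruction to merge them. */
uint64_t select_via_czero(uint64_t cond, uint64_t x, uint64_t y)
{
    uint64_t t1 = czero_eqz(x, cond);   /* x if cond != 0, else 0 */
    uint64_t t2 = czero_nez(y, cond);   /* y if cond == 0, else 0 */
    return t1 | t2;                     /* exactly one side is non-zeroed */
}
```

The select costs three operations before any arithmetic is done, which is why the condzero form is the one worth targeting when the operation has a usable identity value.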

Now on to the question about other targets.
The question here is whether `a OP= (cond ? 0 : b)` is better than
`a = (cond ? a : a OP b)`.
LLVM seems to always use `a OP= (cond ? 0 : b)` (except for & where
they do `a &= cond ? -1 : b`).
I think both are OK for aarch64; for x86_64, I saw a note that doing
the conditional move before the operation is better on some
micro-architectures.

The forms with explicit zeros are definitely preferred for RISC-V as
those correspond to czero instructions. That is precisely the form that
condzero_arith is targeting.

Let's take a conditional shift by 6 (since that's important for one of
the spec2017 benchmarks, I forget which).


Good code for RISC-V would look like:

    li t0,6
    czero.eqz t0,t0,<condreg>
    sll dest,src,t0


Contrast that with a conditional move sequence, which will look like:

    slli tmp1,src,6
    czero.eqz tmp1,tmp1,<condreg>
    czero.nez tmp2,src,<condreg>
    add dest,tmp1,tmp2


Or worse yet, branching...
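
As a sanity check on the two straight-line sequences, here is a rough C model of each, one statement per instruction. The function names are hypothetical, and the czero.eqz source operand in the second sequence is taken to be tmp1.

```c
#include <stdint.h>

/* czero form: zero the shift amount when the condition is false. */
uint64_t shift6_czero(uint64_t src, uint64_t cond)
{
    uint64_t amount = 6;                 /* li        t0,6           */
    amount = cond == 0 ? 0 : amount;     /* czero.eqz t0,t0,cond     */
    return src << amount;                /* sll       dest,src,t0    */
}

/* cmove form: compute both arms, zero one of them, then merge. */
uint64_t shift6_cmove(uint64_t src, uint64_t cond)
{
    uint64_t tmp1 = src << 6;            /* slli      tmp1,src,6     */
    tmp1 = cond == 0 ? 0 : tmp1;         /* czero.eqz tmp1,tmp1,cond */
    uint64_t tmp2 = cond != 0 ? 0 : src; /* czero.nez tmp2,src,cond  */
    return tmp1 + tmp2;                  /* add       dest,tmp1,tmp2 */
}
```

Both compute `cond ? src << 6 : src`; the first needs three instructions and the second four.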

BUT the `&` case is worse without this patch.
testcase:
```
long f(long a, long b, long c)
{
   return a ? b : b & c;
}
```

In GCC 15 (and with this patch) GCC produces on targets with cmov
(aarch64, x86_64 is similar):
```
         and     x2, x1, x2
         cmp     x0, 0
         csel    x0, x2, x1, eq
```

Without we get:
```
         cmp     x0, 0
         csel    x0, x1, xzr, ne
         and     x1, x1, x2
         orr     x0, x1, x0
```
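
For reference, a small C sketch of what the two aarch64 sequences compute, next to the all-ones form mentioned above for LLVM. The function names are illustrative only, and uint64_t stands in for the testcase's long.

```c
#include <stdint.h>

/* The testcase as written: a ? b : b & c. */
uint64_t and_reference(uint64_t a, uint64_t b, uint64_t c)
{
    return a ? b : (b & c);
}

/* Three-instruction sequence: and, cmp, csel. */
uint64_t and_cmove_form(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t x2 = b & c;                 /* and  x2, x1, x2      */
    return a == 0 ? x2 : b;              /* csel x0, x2, x1, eq  */
}

/* Four-instruction sequence: zero is not the identity for &,
   so an extra orr is needed to put b back when a != 0. */
uint64_t and_condzero_form(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t x0 = a != 0 ? b : 0;        /* csel x0, x1, xzr, ne */
    uint64_t x1 = b & c;                 /* and  x1, x1, x2      */
    return x1 | x0;                      /* orr  x0, x1, x0      */
}

/* All-ones form: select the identity value of & instead of zero. */
uint64_t and_allones_form(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t mask = a ? UINT64_MAX : c;
    return b & mask;
}
```

All four agree on every input; the difference is only in how many instructions each shape costs on a given target.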

So what we have is targets which want two different approaches to the
basic code generation strategy. Often we'd look to tackle this with a
cost function. We could do that, but it'd mean one target is going to
have to have combiner patterns (or simplify-rtx adjustments) for the
case where the less efficient sequence works, but could be improved.
Those are going to be *fugly* -- been there and you can see the evidence
in zicond.md IIRC (assuming I upstreamed that).

We can't really key on an optab as the RISC-V port claims to support a
generalized conditional move via an expander that handles the
generalized case, generating the appropriate code to handle the limited
conditions as well as canonicalization of operands. Having that pattern
isn't ideal, but it really helps as a fallback path for ifcvt
transformations.

I guess we could synthesize the two styles once, cost them, then use
that result to guide expansions going forward. i.e., a prefer_czero vs
prefer_cmove kind of property, then test that in the czero_arith path,
punting to the cmove_arith path if there's no benefit to the czero form
(or active harm as we see above).
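
Very roughly, and purely as a standalone illustration (the enum, the function and the integer costs are all hypothetical, not existing GCC interfaces), the "cost once, then remember" idea might look like:

```c
/* Hypothetical property recording which style a target prefers. */
enum select_style {
    STYLE_UNKNOWN,
    STYLE_PREFER_CZERO,
    STYLE_PREFER_CMOVE
};

/* Cached decision; STYLE_UNKNOWN until the first query. */
static enum select_style cached_style = STYLE_UNKNOWN;

/* The caller is assumed to have synthesized both sequences and costed
   them.  Only the first call's costs matter; later calls reuse the
   cached answer.  Ties go to the czero form. */
enum select_style choose_style(int czero_cost, int cmove_cost)
{
    if (cached_style == STYLE_UNKNOWN)
        cached_style = czero_cost <= cmove_cost ? STYLE_PREFER_CZERO
                                                : STYLE_PREFER_CMOVE;
    return cached_style;
}
```

A real version would presumably want the answer per operation rather than one global flag, since the `&` case above costs differently from the shift case.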


Other ideas?


jeff
