On Tue, Dec 23, 2025 at 10:54 PM Andrew Pinski
<[email protected]> wrote:
>
> On Tue, Dec 23, 2025 at 8:33 PM Jeffrey Law <[email protected]> wrote:
> >
> >
> > On 12/23/2025 6:44 PM, Andrew Pinski wrote:
> > > I noticed that on x86_64 and aarch64, noce_try_cond_zero_arith
> > > would produce worse code than noce_try_cmove_arith.
> > > So we should do noce_try_cond_zero_arith last instead
> > > of before noce_try_cmove_arith.
> > >
> > > Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
> > > Also checked to make sure riscv testcases still work.
> > >
> > > gcc/ChangeLog:
> > >
> > > * ifcvt.cc (noce_process_if_block): Move noce_try_cond_zero_arith
> > > last.
> >
> > Please no. We very much want to use condzero_arith rather than cmove
> > based things -- that would be pretty bad in general for RISC-V. We
> > really should dive into why the code isn't as good as we'd like on other
> > patforms.
>
> I looked and I noticed noce_try_cmove_arith fails for riscv for most
> (all?) of the testcases I tried.
> I noticed in some cases noce_convert_multiple_sets happened even
> before noce_try_cond_zero_arith had a chance to do its thing.
> The code generation from noce_convert_multiple_sets is worse for riscv
> for sure and noce_convert_multiple_sets happens even before
> noce_try_cond_zero_arith and noce_try_cmove_arith could even happen.
>
> Note looking into cases where noce_try_cond_zero_arith fails (and
> noce_try_cmove_arith also fails) on riscv I find that the check:
> ```
> if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx))
> return false;
> ```
> is too restrictive but that is a different story.
>
> Now on to the question about other targets.
> The question here is `a OP= (cond ? 0 : b)` better than `a = (cond ?
> a : a OP b)`.
> LLVM seems to always do as `a OP= (cond ? 0 : b)`. (except for & where
> they do `a &= cond ? -1 : b`).
> I think both for aarch64 are ok, for x86_64, I saw a notice that doing
> the conditional move before the operation is better on some
> micro-architectures
>
> BUT the `&` case is worse without this patch.
> testcase:
> ```
> long f(long a, long b, long c)
> {
> return a ? b : b & c;
> }
> ```
>
> In GCC 15 (and with this patch) GCC produces on targets with cmov
> (aarch64, x86_64 is similar):
> ```
> and x2, x1, x2
> cmp x0, 0
> csel x0, x2, x1, eq
> ```
>
> Without we get:
> ```
> cmp x0, 0
> csel x0, x1, xzr, ne
> and x1, x1, x2
> orr x0, x1, x0
> ```
>
> Note LLVM produces for aarch64:
> ```
> cmp x0, #0
> csinv x8, x2, xzr, eq
> and x0, x8, x1
> ```
> and for x86_64:
> ```
> xor eax, eax
> neg rdi
> sbb rax, rax
> or rax, rdx
> and rax, rsi
> ```
For aarch64 (and similarly on x86_64) we get this in combine:
```
Failed to match this instruction:
(set (reg/i:DI 0 x0)
(ior:DI (and:DI (reg/v:DI 103 [ bD.4607 ])
(reg:DI 111 [ cD.4608 ]))
(if_then_else:DI (ne (reg:CC 66 cc)
(const_int 0 [0]))
(reg/v:DI 103 [ bD.4607 ])
(const_int 0 [0]))))
```
Which in theory could simplify to:
```
(set (reg/i:DI 0 x0)
(if_then_else:DI (ne (reg:CC 66 cc)
(const_int 0 [0]))
(reg/v:DI 103 [ bD.4607 ])
(and:DI (reg/v:DI 103 [ bD.4607 ])
(reg:DI 111 [ cD.4608 ]))))
```
Which is similar to what Daniel Barboza is working on for PR122608
(but at the rtl level rather than the gimple level).
Thanks,
Andrew
>
> Thanks,
> Andrew Pinski
>
> >
> >
> > jeff
> >