On Tue, Dec 23, 2025 at 10:54 PM Andrew Pinski
<[email protected]> wrote:
>
> On Tue, Dec 23, 2025 at 8:33 PM Jeffrey Law <[email protected]> wrote:
> >
> >
> > On 12/23/2025 6:44 PM, Andrew Pinski wrote:
> > > I noticed that on x86_64 and aarch64, noce_try_cond_zero_arith
> > > would produce worse code than noce_try_cmove_arith.
> > > So we should do noce_try_cond_zero_arith last instead
> > > of before noce_try_cmove_arith.
> > >
> > > Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
> > > Also checked to make sure riscv testcases still work.
> > >
> > > gcc/ChangeLog:
> > >
> > >       * ifcvt.cc (noce_process_if_block): Move noce_try_cond_zero_arith
> > >       last.
> >
> > Please no.   We very much want to use condzero_arith rather than cmove
> > based things -- that would be pretty bad in general for RISC-V.  We
> > really should dive into why the code isn't as good as we'd like on other
> > patforms.
>
> I looked and I noticed noce_try_cmove_arith fails for riscv for most
> (all?) of the testcases I tried.
> I noticed in some cases noce_convert_multiple_sets happened even
> before noce_try_cond_zero_arith had a chance to do its thing.
> The code generation from noce_convert_multiple_sets is worse for riscv
> for sure and noce_convert_multiple_sets happens even before
> noce_try_cond_zero_arith and noce_try_cmove_arith could even happen.
>
> Note looking into cases where noce_try_cond_zero_arith fails (and
> noce_try_cmove_arith also fails) on riscv I find that the check:
> ```
>   if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx))
>     return false;
> ```
> is too restrictive but that is a different story.
>
> Now on to the question about other targets.
> The question here is `a OP= (cond ? 0 : b)` better than  `a = (cond ?
> a : a OP b)`.
> LLVM seems to always do as `a OP= (cond ? 0 : b)`. (except for & where
> they do `a &= cond ? -1 : b`).
> I think both for aarch64 are ok, for x86_64, I saw a notice that doing
> the conditional move before the operation is better on some
> micro-architectures
>
> BUT the `&` case is worse without this patch.
> testcase:
> ```
> long f(long a, long b, long c)
> {
>   return a ? b : b & c;
> }
> ```
>
> In GCC 15 (and with this patch) GCC produces on targets with cmov
> (aarch64, x86_64 is similar):
> ```
>         and     x2, x1, x2
>         cmp     x0, 0
>         csel    x0, x2, x1, eq
> ```
>
> Without we get:
> ```
>         cmp     x0, 0
>         csel    x0, x1, xzr, ne
>         and     x1, x1, x2
>         orr     x0, x1, x0
> ```
>
> Note LLVM produces for aarch64:
> ```
>         cmp     x0, #0
>         csinv   x8, x2, xzr, eq
>         and     x0, x8, x1
> ```
> and for x86_64:
> ```
>         xor     eax, eax
>         neg     rdi
>         sbb     rax, rax
>         or      rax, rdx
>         and     rax, rsi
> ```

For aarch64 (and similarly on x86_64) we get this in combine:
```
Failed to match this instruction:
(set (reg/i:DI 0 x0)
    (ior:DI (and:DI (reg/v:DI 103 [ bD.4607 ])
            (reg:DI 111 [ cD.4608 ]))
        (if_then_else:DI (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (reg/v:DI 103 [ bD.4607 ])
            (const_int 0 [0]))))
```
Which in theory could simplify to:
```
(set (reg/i:DI 0 x0)
     (if_then_else:DI (ne (reg:CC 66 cc)
                (const_int 0 [0]))
        (reg/v:DI 103 [ bD.4607 ])
        (and:DI (reg/v:DI 103 [ bD.4607 ])
                (reg:DI 111 [ cD.4608 ]))))
```
Which is similar to what Daniel Barboza is working on for PR122608
(but at the rtl level rather than the gimple level).

Thanks,
Andrew

>
> Thanks,
> Andrew Pinski
>
> >
> >
> > jeff
> >

Reply via email to