Re: [PUSHED 2/2] ifcvt: Move noce_try_cond_zero_arith last

Jeffrey Law Fri, 26 Dec 2025 12:51:25 -0800


On 12/26/2025 1:32 PM, Andrew Pinski wrote:

On Fri, Dec 26, 2025 at 11:17 AM Jeffrey Law <[email protected]> wrote:


On 12/23/2025 11:54 PM, Andrew Pinski wrote:

On Tue, Dec 23, 2025 at 8:33 PM Jeffrey Law <[email protected]> wrote:

On 12/23/2025 6:44 PM, Andrew Pinski wrote:

I noticed that on x86_64 and aarch64, noce_try_cond_zero_arith
would produce worse code than noce_try_cmove_arith.
So we should do noce_try_cond_zero_arith last instead
of before noce_try_cmove_arith.

Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
Also checked to make sure riscv testcases still work.

gcc/ChangeLog:

        * ifcvt.cc (noce_process_if_block): Move noce_try_cond_zero_arith
        last.

Please no.   We very much want to use condzero_arith rather than cmove
based things -- that would be pretty bad in general for RISC-V.  We
really should dive into why the code isn't as good as we'd like on other
patforms.

I looked and I noticed noce_try_cmove_arith fails for riscv for most
(all?) of the testcases I tried.

We may not have good coverage here.   But it was a huge source of
performance issues with gcc-15 -- way too many generalized conditional
moves that should have been condzero style sequences.

I noticed in some cases noce_convert_multiple_sets happened even
before noce_try_cond_zero_arith had a chance to do its thing.
The code generation from noce_convert_multiple_sets is worse for riscv
for sure and noce_convert_multiple_sets happens even before
noce_try_cond_zero_arith and noce_try_cmove_arith could even happen.

Note looking into cases where noce_try_cond_zero_arith fails (and
noce_try_cmove_arith also fails) on riscv I find that the check:
```
    if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx))
      return false;
```
is too restrictive but that is a different story.

But that's capturing the key concept.  Namely that we want to target a
conditional zero idiom rather than a conditional move idiom.  RISC-V has
instructions for the former.  The latter requires a pair of conditional
zeros with opposite polarity on the test *and* an additional instruction
to select one of the outputs from the pair of conditional zeros.

I understand there are riscv instructions which handle `a!=0?b:c` but
the reason why I say it was too restrive was
because I noticed if we had:
```
long
test_ADD_ceqz_x (long x, long z, long c)
{
   if (c < 10)
     x = x + z;

   return x;
}
```
This would not cause a czero.eqz to be used.
I assumed we wanted here:
```
         slti    a2,a2,10
         czero.eqz       a2,a1,a2
         add     a0,a0,a2
         ret
```
But we get the branch version.
If I remove the check for REG/0, then I get the above.

Yea. I'll need to look at why that gets converted with our internaltree, but not with the trunk. Internally I get:


        li      a5,9
        sgt     a2,a2,a5
        czero.nez       a2,a1,a2
        add     a0,a0,a2

Not perfect, but no branch ;-)

To get it if-converted upstream we need to crank up the branch cost andwe get a pesky conditional move based sequence:


        li      a5,9
        sgt     a2,a2,a5
        add     a1,a0,a1
        czero.nez       a1,a1,a2
        czero.eqz       a0,a0,a2
        add     a0,a0,a1

The combination of the two would tend to suggest it's mess of code wehave internally for condzero_arith that's capturing more cases. Not ahuge surprise given the focus we put on that last spring. As you said,it's a separate issue and we definitely want to address.

One idea is for the `AND` case, try to see if the conditional move
will handle `cond?a:-1` first. That would fixup the one case which
seems like the only really bad code generation.
I think having the other operation before or after the cmove should be
ok either way and would fix up a different regression.

That's basically what I was thinking to test once from a costingstandpoint and caching the result. Essentially having condzero_arithask the target (via generating RTL and querying how the target handlesit, either from a sequence length, cost, whatever) what's preferred. Iwas going to use the AND case to drive the decision since that seems tobe where the pain point shows up, and use the result of the AND as aproxy for all the cases where we want to avoid condzero_arith andinstead utilize condmove_arith.


Jeff

Re: [PUSHED 2/2] ifcvt: Move noce_try_cond_zero_arith last

Reply via email to