On Wed, May 31, 2023 at 6:34 AM Andrew Pinski via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > After r14-1014-gc5df248509b489364c573e8, GCC started to emit > directly a zero_extract for `(t1&0x8)!=0`. This introduced > a small regression where ifcvt would not do the ifconversion > as there is now a paradoxical subreg in the dest which > was being rejected. Since paradoxical subreg set the whole > register, we can treat it as the same as a reg in the two places. > > OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.
OK I guess. I vaguely remember SUBREG_PROMOTED_UNSIGNED_P applies to non-paradoxical subregs but I might be swapping things - maybe you remember better and whether that would cause any issues here? Thanks, Richard. > gcc/ChangeLog: > > PR rtl-optimization/110042 > * ifcvt.cc (bbs_ok_for_cmove_arith): Allow paradoxical subregs. > (bb_valid_for_noce_process_p): Strip the subreg for the SET_DEST. > > gcc/testsuite/ChangeLog: > > PR rtl-optimization/110042 > * gcc.target/aarch64/csel_bfx_2.c: New test. > --- > gcc/ifcvt.cc | 14 ++++++---- > gcc/testsuite/gcc.target/aarch64/csel_bfx_2.c | 27 +++++++++++++++++++ > 2 files changed, 36 insertions(+), 5 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/csel_bfx_2.c > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > index 868eda93251..0b180b4568f 100644 > --- a/gcc/ifcvt.cc > +++ b/gcc/ifcvt.cc > @@ -2022,7 +2022,7 @@ bbs_ok_for_cmove_arith (basic_block bb_a, basic_block > bb_b, rtx to_rename) > } > > /* Make sure this is a REG and not some instance > - of ZERO_EXTRACT or SUBREG or other dangerous stuff. > + of ZERO_EXTRACT or non-paradoxical SUBREG or other dangerous stuff. > If we have a memory destination then we have a pair of simple > basic blocks performing an operation of the form [addr] = c ? a : b. > bb_valid_for_noce_process_p will have ensured that these are > @@ -2030,7 +2030,8 @@ bbs_ok_for_cmove_arith (basic_block bb_a, basic_block > bb_b, rtx to_rename) > to be renamed. Assert that the callers set this up properly. */ > if (MEM_P (SET_DEST (sset_b))) > gcc_assert (rtx_equal_p (SET_DEST (sset_b), to_rename)); > - else if (!REG_P (SET_DEST (sset_b))) > + else if (!REG_P (SET_DEST (sset_b)) > + && !paradoxical_subreg_p (SET_DEST (sset_b))) > { > BITMAP_FREE (bba_sets); > return false; > @@ -3136,14 +3137,17 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx > cond, > > rtx sset = single_set (insn); > gcc_assert (sset); > + rtx dest = SET_DEST (sset); > + if (SUBREG_P (dest)) > + dest = SUBREG_REG (dest); > > if (contains_mem_rtx_p (SET_SRC (sset)) > - || !REG_P (SET_DEST (sset)) > - || reg_overlap_mentioned_p (SET_DEST (sset), cond)) > + || !REG_P (dest) > + || reg_overlap_mentioned_p (dest, cond)) > goto free_bitmap_and_fail; > > potential_cost += pattern_cost (sset, speed_p); > - bitmap_set_bit (test_bb_temps, REGNO (SET_DEST (sset))); > + bitmap_set_bit (test_bb_temps, REGNO (dest)); > } > } > > diff --git a/gcc/testsuite/gcc.target/aarch64/csel_bfx_2.c > b/gcc/testsuite/gcc.target/aarch64/csel_bfx_2.c > new file mode 100644 > index 00000000000..c3b8a6f45cc > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/csel_bfx_2.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +unsigned > +f1(int t, int t1) > +{ > + int tt = 0; > + if(t) > + tt = (t1&0x8)!=0; > + return tt; > +} > +struct f > +{ > + unsigned t:3; > + unsigned t1:4; > +}; > +unsigned > +f2(int t, struct f y) > +{ > + int tt = 0; > + if(t) > + tt = y.t1; > + return tt; > +} > +/* Both f1 and f2 should produce a csel and not a cbz on the argument. */ > +/* { dg-final { scan-assembler-times "csel\t" 2 } } */ > +/* { dg-final { scan-assembler-times "ubfx\t" 2 } } */ > +/* { dg-final { scan-assembler-not "cbz\t" } } */ > -- > 2.31.1 >