On Tue, 1 Apr 2025, Jakub Jelinek wrote:

> On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> > > Looking at this some more today, I think we should special case
> > > set_noop_p because that can be put into i2 (except for the JUMP_P
> > > violations), currently both modified_between_p (pc_rtx, i2, i3)
> > > and reg_used_between_p (pc_rtx, i2, i3) returns false.
> > > I'll post a patch incrementally for that (but that feels like
> > > new optimization, so probably not something that should be backported).
> > 
> > Trying to review this I noticed that both the comment in combine suggests
> > that memory set is important and modified_between_p handles MEM_P
> > 'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
> > reg_used_between_p only handles REG_P 'reg'.  Also shouldn't
> > you use reg_referenced_p?  At least that function seems to handle
> > SET_SRC/SET_DEST separately?
> > 
> > Can we constrain SET_DEST (set1/set0) to a REG_P in combine?  Why
> > does the comment talk about memory?
> 
> I was worried about making too risky changes this late in stage4
> (and especially also for backports).  Most of this code is 1992-ish.
> I think many of the functions are just misnamed, the reg_ in there doesn't
> match what those functions do (bet they initially supported just REGs
> and later on support for other kinds of expressions was added, but haven't
> done git archeology to prove that).
> 
> What we know for sure is:
>            && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
>            && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
>            && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
>            && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
> that is checked earlier in the condition.
> Then it calls
>            && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
>                                   XVECEXP (newpat, 0, 0))
>            && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
>                                   XVECEXP (newpat, 0, 1))
> While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
> which is also misnamed, that function handles just fine all of
> REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
> STRICT_LOW_PART, PC and even some further cases.
> So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
> of REG, PC (at least the REG and PC cases are triggered on the testcase)
> and quite possibly also MEM (SUBREG of MEM not, see below).
> 
> Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
> function for constants just returns false, for PC returns true, for REG
> returns reg_set_between_p, for MEM recurses on the address, for
> MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
> whether the memory could have been modified in between, for all other
> rtxes recurses on the subrtxes.  This part didn't change in my patch.
> 
> I've only changed those
> -         && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
> +         && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
> where the former has been described above and clearly handles all of
> REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
> 
> The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
> instruction in between i2 and i3.  So, there is clearly a difference
> in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
> returns unconditionally true even if there are no instructions in between,
> but reg_used_between_p if there are no non-debug insns in between returns
> false.  Sorry for missing that, guess I should check for that (with the
> exception of the noop moves which are often (set (pc) (pc)) and handled
> by the incremental patch).  In fact not just that, reg_used_between_p
> will only return true for PC if it is mentioned anywhere in the insns
> in between.
> Anyway, except for that, for REG it calls refers_to_regno_p
> and so should find any occurrences of any of the REG or parts of it for hard
> registers, for MEM returns true if it sees any MEMs in insns in between
> (conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
> (so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
> SUBREG_REG, PC I've already described.
> 
> Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
> already the initial
>            && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
>                                   XVECEXP (newpat, 0, 0))
>            && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
>                                   XVECEXP (newpat, 0, 1))
> calls would have failed --enable-checking=rtl or would have misbehaved, so
> I think there is no need to check for it further.
> 
> To your question why I don't use reg_referenced_p, that is because
> reg_referenced_p is something to call on one insn pattern, while
> reg_used_between_p is pretty much that on all insns in between two
> instructions (excluding the boundaries).
> 
> So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
> checks to preserve former behavior, like in the following version.

OK.

Richard.

> 2025-04-01  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR rtl-optimization/119291
>       * combine.cc (try_combine): For splitting of PARALLEL with
>       2 independent SETs into i2 and i3 sets check reg_used_between_p
>       of the SET_DESTs rather than just modified_between_p.
> 
>       * gcc.c-torture/execute/pr119291.c: New test.
> 
> --- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
> +++ gcc/combine.cc    2025-04-01 12:24:25.956711734 +0200
> @@ -4012,18 +4012,19 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
>        rtx set1 = XVECEXP (newpat, 0, 1);
>  
>        /* Normally, it doesn't matter which of the two is done first, but
> -      one which uses any regs/memory set in between i2 and i3 can't
> -      be first.  The PARALLEL might also have been pre-existing in i3,
> -      so we need to make sure that we won't wrongly hoist a SET to i2
> -      that would conflict with a death note present in there, or would
> -      have its dest modified between i2 and i3.  */
> +      one which uses any regs/memory set or used in between i2 and i3
> +      can't be first.  The PARALLEL might also have been pre-existing
> +      in i3, so we need to make sure that we won't wrongly hoist a SET
> +      to i2 that would conflict with a death note present in there, or
> +      would have its dest modified or used between i2 and i3.  */
>        if (!modified_between_p (SET_SRC (set1), i2, i3)
>         && !(REG_P (SET_DEST (set1))
>              && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
>         && !(GET_CODE (SET_DEST (set1)) == SUBREG
>              && find_reg_note (i2, REG_DEAD,
>                                SUBREG_REG (SET_DEST (set1))))
> -       && !modified_between_p (SET_DEST (set1), i2, i3)
> +       && SET_DEST (set1) != pc_rtx
> +       && !reg_used_between_p (SET_DEST (set1), i2, i3)
>         /* If I3 is a jump, ensure that set0 is a jump so that
>            we do not create invalid RTL.  */
>         && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
> @@ -4038,7 +4039,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
>              && !(GET_CODE (SET_DEST (set0)) == SUBREG
>                   && find_reg_note (i2, REG_DEAD,
>                                     SUBREG_REG (SET_DEST (set0))))
> -            && !modified_between_p (SET_DEST (set0), i2, i3)
> +            && SET_DEST (set0) != pc_rtx
> +            && !reg_used_between_p (SET_DEST (set0), i2, i3)
>              /* If I3 is a jump, ensure that set1 is a jump so that
>                 we do not create invalid RTL.  */
>              && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
> --- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 
> 09:48:01.917407084 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr119291.c    2025-03-27 
> 09:47:48.020598094 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/119291 */
> +
> +int a;
> +long c;
> +
> +__attribute__((noipa)) void
> +foo (int x)
> +{
> +  if (x != 0)
> +    __builtin_abort ();
> +  a = 42;
> +}
> +
> +int
> +main ()
> +{
> +  int e = 1;
> +lab:
> +  if (a < 2)
> +    {
> +      int b = e;
> +      _Bool d = a != 0;
> +      _Bool f = b != 0;
> +      unsigned long g = -(d & f);
> +      unsigned long h = c & g;
> +      unsigned long i = ~c;
> +      e = -(i & h);
> +      c = e != 0;
> +      a = ~e + b;
> +      foo (e);
> +      goto lab;
> +    }
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to