On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> > Looking at this some more today, I think we should special case
> > set_noop_p because that can be put into i2 (except for the JUMP_P
> > violations), currently both modified_between_p (pc_rtx, i2, i3)
> > and reg_used_between_p (pc_rtx, i2, i3) returns false.
> > I'll post a patch incrementally for that (but that feels like
> > new optimization, so probably not something that should be backported).
> 
> Trying to review this I noticed that both the comment in combine suggests
> that memory set is important and modified_between_p handles MEM_P
> 'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
> reg_used_between_p only handles REG_P 'reg'.  Also shouldn't
> you use reg_referenced_p?  At least that function seems to handle
> SET_SRC/SET_DEST separately?
> 
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine?  Why
> does the comment talk about memory?

I was worried about making too risky changes this late in stage4
(and especially also for backports).  Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't
match what those functions do (bet they initially supported just REGs
and later on support for other kinds of expressions was added, but haven't
done git archeology to prove that).

What we know for sure is:
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
that is checked earlier in the condition.
Then it calls
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
                                  XVECEXP (newpat, 0, 0))
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
                                  XVECEXP (newpat, 0, 1))
While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
which is also misnamed, that function handles just fine all of
REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
STRICT_LOW_PART, PC and even some further cases.
So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
of REG, PC (at least the REG and PC cases are triggered on the testcase)
and quite possibly also MEM (SUBREG of MEM not, see below).

Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
function for constants just returns false, for PC returns true, for REG
returns reg_set_between_p, for MEM recurses on the address, for
MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
whether the memory could have been modified in between, for all other
rtxes recurses on the subrtxes.  This part didn't change in my patch.

I've only changed those
-         && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+         && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
where the former has been described above and clearly handles all of
REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.

The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
instruction in between i2 and i3.  So, there is clearly a difference
in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
returns unconditionally true even if there are no instructions in between,
but reg_used_between_p if there are no non-debug insns in between returns
false.  Sorry for missing that, guess I should check for that (with the
exception of the noop moves which are often (set (pc) (pc)) and handled
by the incremental patch).  In fact not just that, reg_used_between_p
will only return true for PC if it is mentioned anywhere in the insns
in between.
Anyway, except for that, for REG it calls refers_to_regno_p
and so should find any occurrences of any of the REG or parts of it for hard
registers, for MEM returns true if it sees any MEMs in insns in between
(conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
(so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
SUBREG_REG, PC I've already described.

Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
already the initial
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
                                  XVECEXP (newpat, 0, 0))
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
                                  XVECEXP (newpat, 0, 1))
calls would have failed --enable-checking=rtl or would have misbehaved, so
I think there is no need to check for it further.

To your question why I don't use reg_referenced_p, that is because
reg_referenced_p is something to call on one insn pattern, while
reg_used_between_p is pretty much that on all insns in between two
instructions (excluding the boundaries).

So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
checks to preserve former behavior, like in the following version.

2025-04-01  Jakub Jelinek  <ja...@redhat.com>

        PR rtl-optimization/119291
        * combine.cc (try_combine): For splitting of PARALLEL with
        2 independent SETs into i2 and i3 sets check reg_used_between_p
        of the SET_DESTs rather than just modified_between_p.

        * gcc.c-torture/execute/pr119291.c: New test.

--- gcc/combine.cc.jj   2025-03-25 09:34:33.469102343 +0100
+++ gcc/combine.cc      2025-04-01 12:24:25.956711734 +0200
@@ -4012,18 +4012,19 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
       rtx set1 = XVECEXP (newpat, 0, 1);
 
       /* Normally, it doesn't matter which of the two is done first, but
-        one which uses any regs/memory set in between i2 and i3 can't
-        be first.  The PARALLEL might also have been pre-existing in i3,
-        so we need to make sure that we won't wrongly hoist a SET to i2
-        that would conflict with a death note present in there, or would
-        have its dest modified between i2 and i3.  */
+        one which uses any regs/memory set or used in between i2 and i3
+        can't be first.  The PARALLEL might also have been pre-existing
+        in i3, so we need to make sure that we won't wrongly hoist a SET
+        to i2 that would conflict with a death note present in there, or
+        would have its dest modified or used between i2 and i3.  */
       if (!modified_between_p (SET_SRC (set1), i2, i3)
          && !(REG_P (SET_DEST (set1))
               && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
          && !(GET_CODE (SET_DEST (set1)) == SUBREG
               && find_reg_note (i2, REG_DEAD,
                                 SUBREG_REG (SET_DEST (set1))))
-         && !modified_between_p (SET_DEST (set1), i2, i3)
+         && SET_DEST (set1) != pc_rtx
+         && !reg_used_between_p (SET_DEST (set1), i2, i3)
          /* If I3 is a jump, ensure that set0 is a jump so that
             we do not create invalid RTL.  */
          && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
@@ -4038,7 +4039,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
               && !(GET_CODE (SET_DEST (set0)) == SUBREG
                    && find_reg_note (i2, REG_DEAD,
                                      SUBREG_REG (SET_DEST (set0))))
-              && !modified_between_p (SET_DEST (set0), i2, i3)
+              && SET_DEST (set0) != pc_rtx
+              && !reg_used_between_p (SET_DEST (set0), i2, i3)
               /* If I3 is a jump, ensure that set1 is a jump so that
                  we do not create invalid RTL.  */
               && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
--- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj   2025-03-27 
09:48:01.917407084 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr119291.c      2025-03-27 
09:47:48.020598094 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/119291 */
+
+int a;
+long c;
+
+__attribute__((noipa)) void
+foo (int x)
+{
+  if (x != 0)
+    __builtin_abort ();
+  a = 42;
+}
+
+int
+main ()
+{
+  int e = 1;
+lab:
+  if (a < 2)
+    {
+      int b = e;
+      _Bool d = a != 0;
+      _Bool f = b != 0;
+      unsigned long g = -(d & f);
+      unsigned long h = c & g;
+      unsigned long i = ~c;
+      e = -(i & h);
+      c = e != 0;
+      a = ~e + b;
+      foo (e);
+      goto lab;
+    }
+}

        Jakub

Reply via email to