On Fri, 28 Mar 2025, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
> We have from earlier combinations
> (insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
>         (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
>      (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
>         (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
>      (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
>         (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (parallel [
>             (set (reg:CCZ 17 flags)
>                 (compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
>                     (const_int 0 [0])))
>             (set (reg/v:SI 116 [ e ])
>                 (neg:SI (reg:SI 104 [ _7 ])))
>         ]) "pr119291.c":26:13 977 {*negsi_2}
>      (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
>         (nil)))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
>         (ne:DI (reg:CCZ 17 flags)
>             (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
>      (expr_list:REG_DEAD (reg:CCZ 17 flags)
>         (nil)))
> and try_combine is called on i3 25 and i2 22 (second time)
> and reach the hunk being patched with simplified i3
> (insn 25 24 26 4 (parallel [
>             (set (pc)
>                 (pc))
>             (set (reg/v:SI 116 [ e ])
>                 (const_int 0 [0]))
>         ]) "pr119291.c":28:13 977 {*negsi_2}
>      (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
>         (nil)))
> and
> (insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
>         (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
>      (nil))
> Now, the try_combine code there attempts to split two independent
> sets in newpat by moving one of them to i2.
> And among other tests it checks
> !modified_between_p (SET_DEST (set1), i2, i3)
> which is certainly needed, if there would be say
> (set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
> in between i2 and i3, we couldn't do that, as that set would overwrite
> the value set by set1 we want to move to the i2 position.
> But in this case pseudo 116 isn't set in between i2 and i3, but used
> (and additionally there is a REG_DEAD note for it).
> 
> This is equally bad for the move, because while the i3 insn
> and later will see the pseudo value that we set, the insn in between
> which uses the value will see a different value from the one that
> it should see.
> 
> As we don't check for that, in the end try_combine succeeds and
> changes the IL to:
> (insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
>         (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
>      (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
>         (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
>      (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
>         (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (set (pc)
>         (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
>      (nil))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
>         (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
>      (nil))
> (note, the i3 got turned into a nop and try_combine also modified insn 27).
> 
> The following patch replaces the modified_between_p
> tests with reg_used_between_p, my understanding is that
> modified_between_p is a subset of reg_used_between_p, so one
> doesn't need both.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux,
> powerpc64le-linux and s390x-linux, ok for trunk?
> 
> Looking at this some more today, I think we should special case
> set_noop_p because that can be put into i2 (except for the JUMP_P
> violations), currently both modified_between_p (pc_rtx, i2, i3)
> and reg_used_between_p (pc_rtx, i2, i3) returns false.
> I'll post a patch incrementally for that (but that feels like
> new optimization, so probably not something that should be backported).

Trying to review this I noticed that both the comment in combine suggests
that memory set is important and modified_between_p handles MEM_P
'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
reg_used_between_p only handles REG_P 'reg'.  Also shouldn't
you use reg_referenced_p?  At least that function seems to handle
SET_SRC/SET_DEST separately?

Can we constrain SET_DEST (set1/set0) to a REG_P in combine?  Why
does the comment talk about memory?

> 2025-03-28  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR rtl-optimization/119291
>       * combine.cc (try_combine): For splitting of PARALLEL with
>       2 independent SETs into i2 and i3 sets check reg_used_between_p
>       of the SET_DESTs rather than just modified_between_p.
> 
>       * gcc.c-torture/execute/pr119291.c: New test.
> 
> --- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
> +++ gcc/combine.cc    2025-03-27 09:50:15.768567383 +0100
> @@ -4012,18 +4012,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
>        rtx set1 = XVECEXP (newpat, 0, 1);
>  
>        /* Normally, it doesn't matter which of the two is done first, but
> -      one which uses any regs/memory set in between i2 and i3 can't
> -      be first.  The PARALLEL might also have been pre-existing in i3,
> -      so we need to make sure that we won't wrongly hoist a SET to i2
> -      that would conflict with a death note present in there, or would
> -      have its dest modified between i2 and i3.  */
> +      one which uses any regs/memory set or used in between i2 and i3
> +      can't be first.  The PARALLEL might also have been pre-existing
> +      in i3, so we need to make sure that we won't wrongly hoist a SET
> +      to i2 that would conflict with a death note present in there, or
> +      would have its dest modified or used between i2 and i3.  */
>        if (!modified_between_p (SET_SRC (set1), i2, i3)
>         && !(REG_P (SET_DEST (set1))
>              && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
>         && !(GET_CODE (SET_DEST (set1)) == SUBREG
>              && find_reg_note (i2, REG_DEAD,
>                                SUBREG_REG (SET_DEST (set1))))
> -       && !modified_between_p (SET_DEST (set1), i2, i3)
> +       && !reg_used_between_p (SET_DEST (set1), i2, i3)
>         /* If I3 is a jump, ensure that set0 is a jump so that
>            we do not create invalid RTL.  */
>         && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
> @@ -4038,7 +4038,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
>              && !(GET_CODE (SET_DEST (set0)) == SUBREG
>                   && find_reg_note (i2, REG_DEAD,
>                                     SUBREG_REG (SET_DEST (set0))))
> -            && !modified_between_p (SET_DEST (set0), i2, i3)
> +            && !reg_used_between_p (SET_DEST (set0), i2, i3)
>              /* If I3 is a jump, ensure that set1 is a jump so that
>                 we do not create invalid RTL.  */
>              && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
> --- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 
> 09:48:01.917407084 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr119291.c    2025-03-27 
> 09:47:48.020598094 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/119291 */
> +
> +int a;
> +long c;
> +
> +__attribute__((noipa)) void
> +foo (int x)
> +{
> +  if (x != 0)
> +    __builtin_abort ();
> +  a = 42;
> +}
> +
> +int
> +main ()
> +{
> +  int e = 1;
> +lab:
> +  if (a < 2)
> +    {
> +      int b = e;
> +      _Bool d = a != 0;
> +      _Bool f = b != 0;
> +      unsigned long g = -(d & f);
> +      unsigned long h = c & g;
> +      unsigned long i = ~c;
> +      e = -(i & h);
> +      c = e != 0;
> +      a = ~e + b;
> +      foo (e);
> +      goto lab;
> +    }
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to