https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119982

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |rtl-optimization
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=119997

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The change is that we now hoist

  _1 = &v_6(D)->a;
  _2 = &v_6(D)->b;

out of the loop, in anticipation that PRE would have done the same, which
it does not because of how we handle &v_6(D)->a in VN.  This restores
an optimization broken earlier (I filed PR119997 for this).  The testcase
was added just because of that "fix" without further analysis.

The simple analysis is that this is TER at play and the initial RTL
expansion without the hoisting is

    7: r103:DI=[r102:DI]
    8: [scratch]=unspec[[scratch]] 17
    9: r100:DI=[r102:DI+0x8]
   10: [scratch]=unspec[[scratch]] 17

while with the hoisting we get (TER does not work across BBs)

    6: r98:DI=r102:DI
    7: {r99:DI=r102:DI+0x8;clobber flags:CC;}
   13: L13:
    8: NOTE_INSN_BASIC_BLOCK 4
    9: r103:DI=[r98:DI]
   10: [scratch]=unspec[[scratch]] 17
   11: r100:DI=[r99:DI]
   12: [scratch]=unspec[[scratch]] 17

and nothing in the RTL pipeline seems to "fix" this - forwprop or
now late_combine come to mind (RTL combine doesn't work across
BBs either).

So basically this bug is a duplicate of PR109362 itself: the effect of
PR119997 on the testcase has been fixed by r16-190-g6901d56fea2132,
so the latent issue is here again.

Adjusted testcase which was never "fixed":

struct S { long a, b; };

int
foo (struct S *v)
{
  long *x = &v->a;
  long *y = &v->b;
  while (1)
    {
      __atomic_load_n (x, __ATOMIC_ACQUIRE);
      if (__atomic_load_n (y, __ATOMIC_ACQUIRE))
        return 1;
    }
}
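
For contrast, a sketch of the variant where the address computations are
written at the point of use inside the loop rather than in locals ahead of
it.  The function name foo_inline_addr is hypothetical and this is not the
testcase from either PR; whether the addresses end up in the same basic
block as the loads at expansion time (so that TER can fold them into the
memory operands) still depends on what the GIMPLE passes hoist, as
described above.

```c
struct S { long a, b; };

/* Hypothetical variant: &v->a and &v->b are formed at the use site,
   inside the loop body, instead of being assigned to locals before it.  */
int
foo_inline_addr (struct S *v)
{
  while (1)
    {
      /* Result intentionally discarded, as in the testcase above.  */
      __atomic_load_n (&v->a, __ATOMIC_ACQUIRE);
      if (__atomic_load_n (&v->b, __ATOMIC_ACQUIRE))
        return 1;
    }
}
```

Comparing the expand dumps (-fdump-rtl-expand) for both forms should show
whether the loads use [r+0x8] style addressing or a separately computed
address register.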
