https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119982

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |rtl-optimization
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119997

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The change is that we now hoist

  _1 = &v_6(D)->a;
  _2 = &v_6(D)->b;

out of the loop in anticipation that PRE would have done the same which it
does not because of how we handle &v_6(D)->a in VN.  This restores an
optimization broken earlier (I filed PR119997) for this.  The testcase
was added just because of that "fix" without further analysis.

The simple analysis is that this is TER at play and the initial RTL
expansion without the hoisting is

    7: r103:DI=[r102:DI]
    8: [scratch]=unspec[[scratch]] 17
    9: r100:DI=[r102:DI+0x8]
   10: [scratch]=unspec[[scratch]] 17

while otherwise we get (TER does not work across BBs)

    6: r98:DI=r102:DI
    7: {r99:DI=r102:DI+0x8;clobber flags:CC;}
   13: L13:
    8: NOTE_INSN_BASIC_BLOCK 4
    9: r103:DI=[r98:DI]
   10: [scratch]=unspec[[scratch]] 17
   11: r100:DI=[r99:DI]
   12: [scratch]=unspec[[scratch]] 17

and nothing in the RTL pipeline seems to "fix" this - forwprop or now
late_combine comes to my mind (RTL combine doesn't work across BBs either).

So basically this bug is a duplicate of PR109362 itself, the effect of
PR119997 on the testcase has been fixed by r16-190-g6901d56fea2132, so
the latent issue is here again.

Adjusted testcase which was never "fixed":

struct S { long a, b; };
int
foo (struct S *v)
{
  long *x = &v->a;
  long *y = &v->b;
  while (1)
    {
      __atomic_load_n (x, __ATOMIC_ACQUIRE);
      if (__atomic_load_n (y, __ATOMIC_ACQUIRE))
        return 1;
    }
}
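
For anyone wanting to compare the two shapes side by side, here is a minimal
sketch that puts the adjusted testcase next to a variant where the addresses
are written inside the loop.  The names foo_in_loop, foo_hoisted and
check_variants are hypothetical and not from the bug report, and the in-loop
variant is only an assumption about what a source-level "unhoisted" form might
look like.  Whether the compiler actually keeps the address computation in the
loop depends on the GCC version and on the passes discussed above.

```c
#include <assert.h>

struct S { long a, b; };

/* Hypothetical variant: the addresses are written inside the loop body.
   Whether they stay there after GIMPLE optimization is up to the compiler.  */
int
foo_in_loop (struct S *v)
{
  while (1)
    {
      __atomic_load_n (&v->a, __ATOMIC_ACQUIRE);
      if (__atomic_load_n (&v->b, __ATOMIC_ACQUIRE))
        return 1;
    }
}

/* Same shape as the adjusted testcase: the addresses are formed before
   the loop, i.e. in a different basic block than the loads using them.  */
int
foo_hoisted (struct S *v)
{
  long *x = &v->a;
  long *y = &v->b;
  while (1)
    {
      __atomic_load_n (x, __ATOMIC_ACQUIRE);
      if (__atomic_load_n (y, __ATOMIC_ACQUIRE))
        return 1;
    }
}

/* Sanity check that both variants behave the same.  The b field must be
   nonzero, otherwise neither loop terminates.  */
void
check_variants (void)
{
  struct S s = { 0, 1 };
  assert (foo_in_loop (&s) == 1);
  assert (foo_hoisted (&s) == 1);
}
```

Compiling this with something like gcc -O2 -S and comparing the two loop
bodies is one way to see whether the +8 offset stays folded into the load's
address (as in [r102:DI+0x8] above) or ends up in a separate register set
before the loop.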