https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187
--- Comment #28 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #26)
> git bisect points to commit r11-9688 resolving the issue.  Before that
> commit the ivopts pass generates:
>
>   ivtmp.761_217 = (unsigned int) &au;
>   _222 = &bu + 4;
>   ivtmp.767_220 = (unsigned int) _222;
>   _225 = (unsigned int) &au;
>   _228 = _225 + 16;
>
>   <bb 9> [local count: 858993457]:
>   # prephitmp_136 = PHI <pretmp_120(10), 1073741824(8)>
>   # prephitmp_32 = PHI <pretmp_18(10), 2147483648(8)>
>   # ivtmp.761_278 = PHI <ivtmp.761_216(10), ivtmp.761_217(8)>
>   # ivtmp.767_218 = PHI <ivtmp.767_219(10), ivtmp.767_220(8)>
>   _16 = prephitmp_32 ^ prephitmp_136;
>   _223 = (void *) ivtmp.761_278;
>   MEM[(unsigned int *)_223] = _16;
>   ivtmp.761_216 = ivtmp.761_278 + 4;
>   if (ivtmp.761_216 != _228)
>     goto <bb 10>; [75.00%]
>   else
>     goto <bb 11>; [25.00%]
>
>   <bb 10> [local count: 644245086]:
>   _230 = (void *) ivtmp.761_216;
>   pretmp_120 = MEM[(unsigned int *)_230];
>   _229 = (void *) ivtmp.767_218;
>   pretmp_18 = MEM[(unsigned int *)_229];
>   ivtmp.767_219 = ivtmp.767_218 + 4;
>   goto <bb 9>; [100.00%]
>
> And once that patch is applied we get:
>
>   ivtmp.761_217 = (unsigned int) &au;
>   ivtmp.766_220 = (unsigned int) &bu;
>   _223 = (unsigned int) &au;
>   _225 = _223 + 16;
>
>   <bb 9> [local count: 858993457]:
>   # prephitmp_136 = PHI <pretmp_120(10), 1073741824(8)>
>   # prephitmp_32 = PHI <pretmp_18(10), 2147483648(8)>
>   # ivtmp.761_278 = PHI <ivtmp.761_216(10), ivtmp.761_217(8)>
>   # ivtmp.766_218 = PHI <ivtmp.766_219(10), ivtmp.766_220(8)>
>   _16 = prephitmp_32 ^ prephitmp_136;
>   _222 = (void *) ivtmp.761_278;
>   MEM[(unsigned int *)_222] = _16;
>   ivtmp.761_216 = ivtmp.761_278 + 4;
>   if (ivtmp.761_216 != _225)
>     goto <bb 10>; [75.00%]
>   else
>     goto <bb 11>; [25.00%]
>
> The main difference being that in the 'bad' code we start with &bu + 4,
> while in the good code we start with &bu.
>
> I'm afraid I don't know enough about this code to take this further.  Richi?

There's no functional difference, you omitted BB10 after the patch which for
me looks like

  <bb 10> [local count: 644245086]:
  # PT = { D.22767 }
  _228 = (voidD.73 *) ivtmp.741_281;
  [t.ii:2167:17] pretmp_155 = MEM[(unsigned intD.11 *)_228];
  [t.ii:2167:26] ivtmp.746_28 = ivtmp.746_299 + 4;
  # PT = { D.22768 }
  _227 = (voidD.73 *) ivtmp.746_28;
  [t.ii:2167:26] pretmp_183 = MEM[(unsigned intD.11 *)_227];
  goto <bb 9>; [100.00%]

so we changed from post-increment to pre-increment of 4 - the accesses
happen to the same memory location.  I'm dumping with -alias-uid-lineno
and alias info looks fine to me here.

It might very well be that the change above triggers a bug elsewhere.  Does
reverting the "fixing" revision make the issue appear on trunk as well?

The code at RTL expansion time looks reasonable (also from an aliasing POV),
if -fno-strict-aliasing fixes it, does -fno-schedule-insns{,2} also?
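
To make the post-increment versus pre-increment point concrete, here is a
minimal C sketch.  It is not taken from the testcase: the array contents and
the helper names (walk_post_increment, walk_pre_increment, same_addresses) are
hypothetical, and it assumes the step equals sizeof (unsigned int), which is 4
on the target in the dumps.  It only models the address arithmetic of the
second IV, not the actual ivopts output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the array the second IV walks (bu in the dump). */
static unsigned int bu[4] = { 10, 20, 30, 40 };

/* "Before" shape: the IV starts at &bu + 4, the load uses the current
   value and the IV is bumped afterwards (post-increment).  */
static void walk_post_increment(uintptr_t out[], size_t n)
{
    uintptr_t iv = (uintptr_t)bu + sizeof(unsigned int);
    for (size_t i = 0; i < n; i++) {
        out[i] = iv;                    /* address the load would use */
        iv += sizeof(unsigned int);
    }
}

/* "After" shape: the IV starts at &bu, is bumped first and the load uses
   the incremented value (pre-increment).  */
static void walk_pre_increment(uintptr_t out[], size_t n)
{
    uintptr_t iv = (uintptr_t)bu;
    for (size_t i = 0; i < n; i++) {
        iv += sizeof(unsigned int);
        out[i] = iv;                    /* address the load would use */
    }
}

/* Returns 1 when both walks visit the same addresses in the same order.  */
static int same_addresses(size_t n)
{
    uintptr_t a[3], b[3];
    assert(n <= 3);                     /* scratch buffers hold 3 entries */
    walk_post_increment(a, n);
    walk_pre_increment(b, n);
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

With n = 3 (the number of times <bb 10> runs for a 16-byte store loop), both
walks yield the addresses of bu[1], bu[2] and bu[3], which is why the two
ivopts results are functionally equivalent.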