https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112281
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---

So the aggregate vs part of aggregate access is to confuse the cost modeling
(prevent merging of the partitions due to shared memory references) only.
With a GIMPLE testcase and commenting out the cost model in the source the
following inner loop fails the same way:

  __BB(4,loop_header(2),guessed_local(955630224)):
  _20 = __PHI (__BB4: _2, __BB3: 0);
  _22 = d[_11].a;
  b.a = _22;
  _23 = b.a;
  d[_21].a = _23;
  // ^ the two partitions v
  d[_11].a = 0;
  _2 = _20 + 1;
  if (_2 <= 1)
    goto __BB4(guessed(119453778));
  else
    goto __BB5(guessed(14763950));

we're still computing

  distance_vector: 1 0
  direction_vector: + =

(gdb) p $14->reversed_p
$17 = true

for the dependence of d[_21].a = _23; vs. d[_11].a = 0; (for
_22 = d[_11].a; vs. d[_21].a = _23; we compute the same but not reversed)

The same testcase is also broken with just forward evolving indices:

struct { int : 8; int a; } b, d[4] = {{5}, {0}, {0}, {0}};
int c, e;
int main() {
  for (c = 0; c < 2; c++)
    for (e = 0; e < 2; e++) {
      d[c + 1] = b = d[c];
      d[c].a = 0;
    }
  if (b.a != 0)
    __builtin_abort();
  return 0;
}

And despite that I had to revert the patch the issue _is_ the conflict in
the inner loop where we have the zero distance. We're re-ordering

  d[c + 1] = b = d[c];
  d[c].a = 0;             (A)
  d[c + 1] = b = d[c];    (B)
  d[c].a = 0;

as

  d[c + 1] = b = d[c];
  d[c + 1] = b = d[c];
  d[c].a = 0;
  d[c].a = 0;

breaking the dependence between (A) d[c].a = 0; and (B) b = d[c].a;

Note there's a dependence in both directions in the unrolled form.