https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334
--- Comment #42 from Richard Biener <rguenth at gcc dot gnu.org> ---
With the proposed patch (with a tiny fix) I get:
(compute_affine_dependence
stmt_a: _23 = MEM[(real(kind=8)[0:D.2444] *)&x clique 1 base 4][_20];
stmt_b: _112 = MEM[(real(kind=8)[0:D.2444] *)&x clique 1 base 4][_111];
(analyze_overlapping_iterations
(chrec_a = {pretmp_822 + 2, +, 1}_6)
(chrec_b = {pretmp_850 + 1, +, 1}_6)
(analyze_siv_subscript
siv test failed: unimplemented)
(overlap_iterations_a = not known)
(overlap_iterations_b = not known))
) -> dependence analysis failed
but cases like the following work:
(compute_affine_dependence
stmt_a: _23 = MEM[(real(kind=8)[0:D.2444] *)&x clique 1 base 4][_20];
stmt_b: _110 = MEM[(real(kind=8)[4] *)&x + 58071104B clique 1 base 3][3];
) -> no dependence
and finally
mgrid.f:191:0: note: LOOP VECTORIZED
mgrid.f:184:0: note: vectorized 1 loops in function.
and speed is back!
unpatched:
> /usr/bin/time ./a.out < mgrid.in > /dev/null
66.72user 0.05system 1:06.76elapsed 100%CPU (0avgtext+0avgdata 57804maxresident)k
0inputs+8360outputs (0major+14505minor)pagefaults 0swaps
patched:
> /usr/bin/time ./a.out < mgrid.in > /dev/null
61.90user 0.13system 1:02.16elapsed 99%CPU (0avgtext+0avgdata 57804maxresident)k
1472inputs+0outputs (9major+14496minor)pagefaults 0swaps
not sure if that is full recovery (but it's roughly 7% in user time here and thus noticeable).
I'll post the updated patch which still lacks the correctness part in the
inliner.
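
For reference, a rough sketch of why the first query above ends in "siv test
failed: unimplemented": both access functions are affine with the same step,
but their bases involve different symbols (pretmp_822 vs. pretmp_850), so the
difference of the bases is not a compile-time constant. The C below is only a
toy model of a strong-SIV check under that assumption; the struct, the enum
and siv_test() are hypothetical names for illustration and are not the code in
tree-data-ref.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of an affine access function {sym + cst, +, step}.
   sym identifies a loop-invariant symbol; 0 means "no symbolic part".  */
struct affine_chrec {
    int  sym;   /* id of the symbolic part of the base */
    long cst;   /* constant offset added to the symbol */
    long step;  /* stride per loop iteration */
};

enum dep_result { DEP_NONE, DEP_AT_DISTANCE, DEP_UNKNOWN };

/* Toy strong-SIV test for iterations 0 .. niter-1 of a single loop.
   Only handles equal strides and bases that differ by a constant.  */
enum dep_result siv_test(struct affine_chrec a, struct affine_chrec b,
                         long niter)
{
    assert(a.step != 0 && b.step != 0);

    /* Different symbols: base difference is not a known constant.  */
    if (a.sym != b.sym)
        return DEP_UNKNOWN;

    /* Different strides are not modelled in this sketch.  */
    if (a.step != b.step)
        return DEP_UNKNOWN;

    long diff = b.cst - a.cst;

    /* The accesses can only meet if the stride divides the offset.  */
    if (diff % a.step != 0)
        return DEP_NONE;

    /* They overlap only if the distance fits in the iteration space.  */
    long dist = labs(diff / a.step);
    return dist < niter ? DEP_AT_DISTANCE : DEP_NONE;
}
```

With the two chrecs from the dump modelled as {822, 2, 1} and {850, 1, 1},
this sketch returns DEP_UNKNOWN; if both used the same symbol it would report
a dependence at distance 1.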