https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334

--- Comment #42 from Richard Biener <rguenth at gcc dot gnu.org> ---
With the proposed patch (with a tiny fix) I get:

(compute_affine_dependence
  stmt_a: _23 = MEM[(real(kind=8)[0:D.2444] *)&x clique 1 base 4][_20];
  stmt_b: _112 = MEM[(real(kind=8)[0:D.2444] *)&x clique 1 base 4][_111];
(analyze_overlapping_iterations
  (chrec_a = {pretmp_822 + 2, +, 1}_6)
  (chrec_b = {pretmp_850 + 1, +, 1}_6)
(analyze_siv_subscript
  siv test failed: unimplemented)
  (overlap_iterations_a = not known)
  (overlap_iterations_b = not known))
) -> dependence analysis failed
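
For context, the two chrecs above have the same step (1) but different symbolic initial values (pretmp_822 + 2 and pretmp_850 + 1). The following is a minimal sketch of the textbook strong SIV test for constant initial values, not GCC's implementation; the function name and the trip_count parameter are hypothetical. It illustrates why a compile-time constant difference between the initial values is needed: when that difference is symbolic, as it appears to be here, the test has nothing to divide.

```python
from typing import Optional


def strong_siv_distance(base_a: int, base_b: int, step: int,
                        trip_count: int) -> Optional[int]:
    """Textbook strong SIV test for subscripts base_a + step*i and
    base_b + step*j with 0 <= i, j < trip_count.

    Returns the dependence distance i - j if the two subscripts can be
    equal within the iteration space, or None if they never overlap.
    Hypothetical helper for illustration only.
    """
    if step == 0:
        raise ValueError("strong SIV requires a non-zero step")
    # base_a + step*i == base_b + step*j  =>  i - j == (base_b - base_a) / step
    diff = base_b - base_a
    if diff % step != 0:
        return None          # no integer solution: independent
    distance = diff // step
    if abs(distance) >= trip_count:
        return None          # solution lies outside the loop bounds
    return distance


# If pretmp_822 and pretmp_850 happened to be the same value p, the
# subscripts would be p + 2 + i and p + 1 + j, giving a distance of -1.
print(strong_siv_distance(2, 1, 1, 100))
```

With unrelated symbolic values for pretmp_822 and pretmp_850 the quantity base_b - base_a is not known at compile time, so a test of this shape cannot decide either way.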

but cases like the following work:

(compute_affine_dependence
  stmt_a: _23 = MEM[(real(kind=8)[0:D.2444] *)&x clique 1 base 4][_20];
  stmt_b: _110 = MEM[(real(kind=8)[4] *)&x + 58071104B clique 1 base 3][3];
) -> no dependence
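
The dump does not say why this pair is disambiguated, but the two references carry the same clique (1) and different bases (4 and 3). As a rough illustration, here is a simplified model of the clique/base rule, assuming that references in the same non-zero clique with different bases are taken not to alias; the MemRef class and the function name are hypothetical and not part of GCC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemRef:
    """Hypothetical stand-in for the 'clique N base M' annotation in the dump."""
    clique: int
    base: int


def may_alias_by_clique(a: MemRef, b: MemRef) -> bool:
    """Simplified model: same non-zero clique plus different base means
    the two references are assumed not to alias. Anything else is
    conservatively treated as possibly aliasing."""
    if a.clique != 0 and a.clique == b.clique and a.base != b.base:
        return False
    return True


stmt_a = MemRef(clique=1, base=4)   # MEM[... &x clique 1 base 4][_20]
stmt_b = MemRef(clique=1, base=3)   # MEM[... &x + 58071104B clique 1 base 3][3]
print(may_alias_by_clique(stmt_a, stmt_b))
```

Under this model the first pair in this comment (both clique 1 base 4) gets no help from the annotation, which is consistent with it falling through to the subscript analysis.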

and finally

mgrid.f:191:0: note: LOOP VECTORIZED

mgrid.f:184:0: note: vectorized 1 loops in function.

and speed is back!

unpatched:

> /usr/bin/time ./a.out < mgrid.in > /dev/null
66.72user 0.05system 1:06.76elapsed 100%CPU (0avgtext+0avgdata 57804maxresident)k
0inputs+8360outputs (0major+14505minor)pagefaults 0swaps

patched:

> /usr/bin/time ./a.out < mgrid.in > /dev/null
61.90user 0.13system 1:02.16elapsed 99%CPU (0avgtext+0avgdata 57804maxresident)k
1472inputs+0outputs (9major+14496minor)pagefaults 0swaps

not sure if that is a full recovery (but it's roughly 7% in user time and thus noticeable).
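
The relative improvement can be worked out from the two timing lines above. A minimal sketch of the arithmetic, using only the user and elapsed figures reported in this comment (the helper name is hypothetical):

```python
def relative_improvement(before: float, after: float) -> float:
    """Fraction of run time saved going from `before` to `after` seconds."""
    return (before - after) / before


# User CPU time: 66.72 s unpatched, 61.90 s patched.
user_gain = relative_improvement(66.72, 61.90)
# Elapsed time: 1:06.76 = 66.76 s unpatched, 1:02.16 = 62.16 s patched.
elapsed_gain = relative_improvement(66.76, 62.16)

print(f"user: {user_gain:.1%}, elapsed: {elapsed_gain:.1%}")
# prints "user: 7.2%, elapsed: 6.9%"
```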

I'll post the updated patch which still lacks the correctness part in the
inliner.
