https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112331
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|middle-end: Fail            |Fail vectorization after
                   |vectorization               |loop interchange
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, the "issue" is that we perform loop interchange on this benchmark
loop, and the vectorizer doesn't like the zero step in what is then the
innermost loop. It's not a practical example; nobody would write such an
outer loop in practice. There's a missed optimization in that we fail to
elide the loop that ends up innermost.
The solution is to insert a use of 'a' after the inner loop, like TSVC
benchmarks usually have:
real_t s111(struct args_t * func_args)
{
    // linear dependence testing
    // no dependence - vectorizable

    initialise_arrays(__func__);

    for (int nl = 0; nl < 2*iterations; nl++) {
        for (int i = 1; i < LEN_1D; i += 2) {
            a[i] = a[i - 1] + b[i];
        }
        dummy(a, b, c, d, e, aa, bb, cc, 0.);
    }

    return calc_checksum(__func__);
}
then it just works(TM).
WONTFIX (in the vectorizer). In "theory" the interchanged loop could be
vectorized by outer loop vectorization. But as said, IMHO it is a waste of
time to cheat on badly written benchmarks.