https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112331

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|middle-end: Fail            |Fail vectorization after
                   |vectorization               |loop interchange
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, the "issue" is that we are performing loop interchange on this benchmark
loop and the vectorizer doesn't like the zero-step accesses in what is then the
innermost loop.

It's not a practical example; nobody would write such an outer loop in practice.

There's a missed optimization in that we fail to elide the then inner loop.
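
To illustrate the three points above, here is a minimal sketch, not GCC's
actual transformation output. It assumes the testcase is the s111 kernel
without any use of 'a' between repetitions. LEN_1D, ITERATIONS, init() and
forms_agree() are hypothetical names and values chosen for the example. It
shows the loop as written, the interchanged form where every access has step
zero with respect to the now-innermost 'nl' loop, and the form with that
invariant loop elided.

```c
#include <assert.h>
#include <string.h>

#define LEN_1D 64        /* hypothetical size, much smaller than in TSVC */
#define ITERATIONS 10    /* hypothetical repeat count */

typedef float real_t;

/* Fill a[] and b[] with small integers so float arithmetic stays exact. */
static void init(real_t *a, real_t *b)
{
    for (int i = 0; i < LEN_1D; i++) {
        a[i] = (real_t)i;
        b[i] = (real_t)(2 * i + 1);
    }
}

/* As written, with no use of 'a' between repetitions: nl outer, i inner. */
static void original_order(real_t *a, const real_t *b)
{
    for (int nl = 0; nl < 2 * ITERATIONS; nl++)
        for (int i = 1; i < LEN_1D; i += 2)
            a[i] = a[i - 1] + b[i];
}

/* After interchange: nl is innermost, and a[i], a[i - 1] and b[i]
   all have step 0 with respect to nl. */
static void interchanged(real_t *a, const real_t *b)
{
    for (int i = 1; i < LEN_1D; i += 2)
        for (int nl = 0; nl < 2 * ITERATIONS; nl++)
            a[i] = a[i - 1] + b[i];
}

/* With the invariant inner loop elided: a single pass over i. */
static void elided(real_t *a, const real_t *b)
{
    for (int i = 1; i < LEN_1D; i += 2)
        a[i] = a[i - 1] + b[i];
}

/* Returns 1 when all three forms leave the same contents in a[]. */
int forms_agree(void)
{
    real_t a0[LEN_1D], a1[LEN_1D], a2[LEN_1D], b[LEN_1D];

    /* Eliding the nl loop assumes it runs at least once. */
    assert(2 * ITERATIONS > 0);

    init(a0, b);
    init(a1, b);
    init(a2, b);
    original_order(a0, b);
    interchanged(a1, b);
    elided(a2, b);
    return memcmp(a0, a1, sizeof a0) == 0
        && memcmp(a0, a2, sizeof a0) == 0;
}
```

The forms agree because only odd elements of 'a' are written and only even
elements are read, so repeating the statement does not change the result.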

The solution is to insert a use of 'a' after the inner loop, like TSVC
benchmarks usually have:

real_t s111(struct args_t * func_args)
{
//    linear dependence testing
//    no dependence - vectorizable

    initialise_arrays(__func__);

    for (int nl = 0; nl < 2*iterations; nl++) {
        for (int i = 1; i < LEN_1D; i += 2) {
            a[i] = a[i - 1] + b[i];
        }
        dummy(a, b, c, d, e, aa, bb, cc, 0.);
    }

    return calc_checksum(__func__);
}

then it just works(TM).

WONTFIX (in the vectorizer).  In "theory" the interchanged loop could be
vectorized by outer loop vectorization.  But as said, IMHO a waste of time
to cheat badly written benchmarks.
