https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-01-20
            Version|unknown                     |7.0
             Blocks|                            |53947
            Summary|Missed vectorization with   |Missed BB vectorization
                   |identical formulas          |with strided/scalar stores
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The basic-block vectorizer does not yet consider strided/scalar stores as a
source in its search for vectorization opportunities, so it gives up very
early. Basically it searches for groups of stores that can be vectorized with
a vector store and then looks at how many of the feeding stmts it can include.
Handling this particular case is hard in the current scheme (or rather
expensive).

Confirmed.

"Fixing" the testcase to

void scalar(const double *restrict a, const double *restrict b,
            double x, double *ar, double *br)
{
  double ra, rb;
  int i;

  ra = a[0] + a[1]/x - 1.0/(a[0]-a[1]);
  rb = b[0] + b[1]/x - 1.0/(b[0]-b[1]);

  ar[0] = ra;
  ar[1] = rb;
}

fails as well with

t.c:12:1: note: Build SLP for _1 = *a_14(D);
t.c:12:1: note: Build SLP for _7 = *b_17(D);
t.c:12:1: note: Build SLP failed: different interleaving chains in one node _7 = *b_17(D);
t.c:12:1: note: Re-trying with swapped operands of stmts 1
t.c:12:1: note: Build SLP for _1 = *a_14(D);
t.c:12:1: note: Build SLP for _9 = _8 / x_15(D);
t.c:12:1: note: Build SLP failed: different operation in stmt _9 = _8 / x_15(D);
t.c:12:1: note: original stmt _1 = *a_14(D);

but we could handle this with "construction from scalars"; we just get
confused by the first mismatch and optimistically try to swap operands.
As said above, the SLP finding algorithm is very much too greedy (with too
many accumulated hacks).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
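
For illustration only: the "construction from scalars" idea mentioned in comment #1 can be sketched by hand with the GCC/Clang vector_size extension. This is not what the vectorizer emits and not a proposed patch. The names v2df and by_hand are hypothetical, and the sketch assumes a target with 16-byte double vectors. Each lane of a vector is filled from a separate scalar load (lane 0 from a[], lane 1 from b[]), the formula is evaluated once on both lanes, and the result is written with a single two-element store.

```c
#include <string.h>

/* Two-lane double vector; vector_size is a GCC/Clang extension. */
typedef double v2df __attribute__((vector_size(16)));

/* The "fixed" testcase from comment #1 (unused local dropped). */
void scalar(const double *restrict a, const double *restrict b,
            double x, double *ar, double *br)
{
  double ra, rb;
  (void) br;                      /* unused, kept for the signature */

  ra = a[0] + a[1]/x - 1.0/(a[0]-a[1]);
  rb = b[0] + b[1]/x - 1.0/(b[0]-b[1]);

  ar[0] = ra;
  ar[1] = rb;
}

/* Hand-written equivalent (hypothetical name): the vector operands are
   constructed from scalars because a[] and b[] are unrelated arrays,
   so no single vector load can supply both lanes. */
void by_hand(const double *restrict a, const double *restrict b,
             double x, double *ar, double *br)
{
  (void) br;                      /* unused, kept for the signature */

  v2df e0  = { a[0], b[0] };      /* construction from scalars */
  v2df e1  = { a[1], b[1] };
  v2df vx  = { x, x };            /* x broadcast to both lanes */
  v2df one = { 1.0, 1.0 };

  /* Same operations in the same order as the scalar formula. */
  v2df r = e0 + e1 / vx - one / (e0 - e1);

  /* One two-element store to ar[0..1]; memcpy avoids assuming that
     ar is 16-byte aligned. */
  memcpy(ar, &r, sizeof r);
}
```

Both functions apply the same operations in the same order to each lane, so for the same inputs they are expected to store the same values in ar[0] and ar[1].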