https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70482
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-04-01
          Component|tree-optimization           |target
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, vectorization _does_ happen - it just happens in an awkward way
(we just vectorize the store). We vectorize all of it with -mprefer-avx128.
Note that the vectorizer thinks vectorizing it in the awkward way is
profitable:
1: note: Cost model analysis:
  Vector inside of basic block cost: 1
  Vector prologue cost: 5
  Vector epilogue cost: 0
  Scalar cost of basic block: 8
If it weren't profitable, it would try vectorizing with a smaller vector size.
I think it under-estimates the vector construction cost here (the prologue
cost). From i386.c:
      case vec_construct:
        elements = TYPE_VECTOR_SUBPARTS (vectype);
        return ix86_cost->vec_stmt_cost * (elements / 2 + 1);
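
To make the arithmetic concrete, here is a minimal sketch of that formula and
of the comparison against the scalar cost. It is not the GCC source: the
function names are hypothetical, vec_stmt_cost is passed in as a plain int
(a value of 1 is an assumption, as is an 8-element vector), and treating the
vector cost as the plain sum of the three figures in the dump is also an
assumption about how the comparison is made.

```c
/* Illustrative stand-in for the vec_construct case quoted above.
   Both parameters are plain ints here; in GCC they come from
   ix86_cost and TYPE_VECTOR_SUBPARTS (vectype). */
static int vec_construct_cost(int vec_stmt_cost, int elements)
{
    return vec_stmt_cost * (elements / 2 + 1);
}

/* Hypothetical helper: total vector cost taken as the sum of the
   inside, prologue and epilogue figures from the dump (assumption). */
static int vector_total_cost(int inside, int prologue, int epilogue)
{
    return inside + prologue + epilogue;
}
```

Under those assumptions, vec_construct_cost(1, 8) gives 5, which would match
the prologue cost in the dump, and vector_total_cost(1, 5, 0) gives 6, which
is below the scalar cost of 8.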
But in the assembler I see 8 vector instructions plus the store. vec_construct
is supposed to handle the case of building up a vector from element registers.
Note the same cost is used for simple splats. A more detailed analysis is
possible in the ix86_add_stmt_cost hook, but it might be "somewhat" awkward to
extract enough info from the stmt_info the vectorizer passes down (which
stmt_info is passed down might also be somewhat random, not sure).
Note the cost model is disabled in the vect.exp testsuite.