https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70482

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-04-01
          Component|tree-optimization           |target
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, vectorization _does_ happen - it just happens in an awkward way (we just
vectorize the store).  We vectorize all of it with -mprefer-avx128.

Note that the vectorizer thinks vectorizing it in the awkward way is
profitable:

1: note: Cost model analysis:
  Vector inside of basic block cost: 1
  Vector prologue cost: 5
  Vector epilogue cost: 0
  Scalar cost of basic block: 8

if it weren't it would try vectorizing with smaller vector size.

I think it under-estimates vector construction cost here (prologue cost).
From i386.c:

      case vec_construct:
        elements = TYPE_VECTOR_SUBPARTS (vectype);
        return ix86_cost->vec_stmt_cost * (elements / 2 + 1);

But in the assembler I see 8 vector instructions plus the store.

vec_construct is supposed to handle the case of building up a vector from
element registers.  Note the same is used for simple splats...

detailed analysis is possible in the ix86_add_stmt_cost hook - but it might
be "somewhat" awkward to extract enough info from the stmt_info the vectorizer
passes down... (which stmt_info is passed down might also be somewhat random,
not sure).

Note the cost model is disabled in the vect.exp testsuite.
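
To make the arithmetic behind the under-estimate concrete, here is a small standalone sketch of the quoted vec_construct formula.  It is not GCC code: the function names are hypothetical, and vec_stmt_cost = 1 with elements = 8 are assumptions, picked only because they reproduce the "Vector prologue cost: 5" line in the dump above.

```c
#include <assert.h>

/* Mirrors the vec_construct formula quoted from i386.c:
     vec_stmt_cost * (elements / 2 + 1)
   The division is integer division, as in the original expression.  */
int
vec_construct_cost (int vec_stmt_cost, int elements)
{
  assert (elements > 0);
  return vec_stmt_cost * (elements / 2 + 1);
}

/* Hypothetical helper: number of instructions seen in the assembler
   that the modeled cost does not account for.  */
int
cost_shortfall (int modeled_cost, int observed_insns)
{
  return observed_insns - modeled_cost;
}
```

Under those assumptions, vec_construct_cost (1, 8) gives 5, while the assembler shows 8 vector instructions for the construction, so cost_shortfall (5, 8) is 3.  With a 4-element vector (as one might get with -mprefer-avx128, again an assumption about the element count) the same formula gives 3.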