Hi,

The following reduced testcase shows the issue:

struct f { float a[8]; };
void set(struct f *a, float b)
{
  int i = 0;
  for(i=0;i<8;i++)
    a->a[i] = b;
}
--- CUT ---

Currently we vectorize this loop, when really unrolling it would perform better on SPU, Altivec (PPC) and SSE (x86). On Cell's PPC, vectorizing also causes a load-hit-store (LHS), as we have to transfer b between the floating point registers and the vector registers, which goes through memory.

--
Summary: Vectorizer is causing code bloat and worse performance than unrolling would for a loop in SPEC 2k's eon
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pinskia at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37579
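
For reference, a rough sketch of what "unrolling" means for the testcase in this report: the eight iterations written out as eight scalar stores, so b never has to be moved into a vector register. The name set_unrolled is made up for this illustration and is not part of the report or of GCC; the sketch only shows the shape of the code the report would prefer and makes no claim about what any particular GCC version emits.

```c
struct f { float a[8]; };

/* The loop from the testcase, unchanged in behavior. */
void set(struct f *a, float b)
{
    int i;
    for (i = 0; i < 8; i++)
        a->a[i] = b;
}

/* Hypothetical hand-unrolled equivalent (name invented for this sketch):
   eight scalar float stores, no vector register involved. */
void set_unrolled(struct f *a, float b)
{
    a->a[0] = b;
    a->a[1] = b;
    a->a[2] = b;
    a->a[3] = b;
    a->a[4] = b;
    a->a[5] = b;
    a->a[6] = b;
    a->a[7] = b;
}
```

Both functions store the same values; one way to compare the generated code for set is to build it at -O3 with and without -fno-tree-vectorize.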