Hi,
The following reduced testcase shows the issue:
struct f
{
  float a[8];
};
void set(struct f *a, float b)
{
  int i = 0;
  for(i=0;i<8;i++)
    a->a[i] = b;
}
--- CUT ---
Currently we vectorize this loop, when fully unrolling it would really perform
better on SPU, Altivec (PPC) and SSE (x86).
On Cell's PPC, the vectorized code also causes a load-hit-store (LHS), as we
have to transfer b between the floating point registers and the vector
registers, which goes through memory.
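
For illustration, the fully unrolled form argued for above could look roughly
like the sketch below. The name set_unrolled is hypothetical and not part of
the testcase; this only shows the eight scalar stores at the source level and
is not meant to represent the code GCC actually emits.

```c
struct f
{
  float a[8];
};

/* Hypothetical hand-unrolled equivalent of set() from the testcase:
   eight scalar stores of b, so b never has to be moved into a
   vector register.  */
void set_unrolled(struct f *a, float b)
{
  a->a[0] = b;
  a->a[1] = b;
  a->a[2] = b;
  a->a[3] = b;
  a->a[4] = b;
  a->a[5] = b;
  a->a[6] = b;
  a->a[7] = b;
}
```

Building the original testcase with and without -fno-tree-vectorize is one way
to compare the vectorized code against the scalar code.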
--
Summary: Vectorizer is causing code bloat and worse performance
than unrolling would for a loop in SPEC 2k's eon
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pinskia at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37579