The following loop does not get vectorized
on powerpc64-linux, r130275, GCC 4.3.0:
#define M 10
struct S
{
  float x;
  float y;
} pS[100];
float a[1000];
float b[1000];
void
foo (int n)
{
  int i, j;
  for (i = 0; i < n; i++)
    {
      pS[i].x = 0;
      pS[i].y = 0;
      for (j = 0; j < M; j++)
        {
          pS[i].x += (a[i]+b[i]);
          pS[i].y += (a[i]-b[i]);
        }
    }
}
Here is a snippet from the vectorizer dump file:
u3.c:17: note: dependence distance modulo vf == 0 between pS[i_37].x and pS[i_37].x
u3.c:17: note: dependence distance = 0.
u3.c:17: note: accesses have the same alignment.
u3.c:17: note: dependence distance modulo vf == 0 between pS[i_37].y and pS[i_37].y
u3.c:17: note: === vect_analyze_data_ref_accesses ===
u3.c:17: note: Detected interleaving of size 2
u3.c:17: note: not vectorized: complicated access pattern.
u3.c:17: note: bad data access.(get_loop_exit_condition
...
base_address: &pS
offset from base address: (<unnamed-signed:32>) ((unsigned int) i_37 * 8)
constant offset from base address: 0
step: 0
aligned to: 8
base_object: pS[0].x
symbol tag: pS
FAILED as dr address is invariant
u3.c:22: note: not vectorized: unhandled data-ref
u3.c:22: note: bad data references.
u3.c:14: note: vectorized 0 loops in function.
[Zdenek's patch that extends lim
(http://gcc.gnu.org/ml/gcc-patches/2007-01/msg02331.html) can perform store
motion and thus help the vectorizer, but AFAICT it is not applicable to
current mainline.]
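For illustration, here is a hand-written sketch of the form that store motion
would be expected to produce for the test case above. The accumulations into
pS[i].x and pS[i].y, whose addresses are invariant in the inner loop, are
kept in scalars and stored once after the inner loop. The names foo_sm, x and
y are made up for this sketch and do not come from any patch; whether a given
GCC revision then vectorizes this form has not been checked here.

```c
#define M 10

struct S
{
  float x;
  float y;
} pS[100];

float a[1000];
float b[1000];

/* Hypothetical hand-transformed variant of foo: the stores to pS[i].x and
   pS[i].y are moved out of the inner loop, so the inner loop no longer
   contains a data reference with an invariant address.  */
void
foo_sm (int n)
{
  int i, j;

  for (i = 0; i < n; i++)
    {
      float x = 0;
      float y = 0;

      for (j = 0; j < M; j++)
        {
          x += (a[i] + b[i]);
          y += (a[i] - b[i]);
        }

      /* Single store per field after the inner loop.  */
      pS[i].x = x;
      pS[i].y = y;
    }
}
```

Since a, b and pS are distinct global arrays, the stores cannot alias the
loads, and the order of the floating-point additions is unchanged, so this
variant should compute the same values as the original foo.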
--
Summary: missed optimization with store motion (vectorizer)
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: eres at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34195