The following loop does not get vectorized on powerpc64-linux, r130275, GCC 4.3.0:

#define M 10

struct S
{
  float x;
  float y;
} pS[100];

float a[1000];
float b[1000];

void
foo (int n)
{
  int i, j;

  for (i = 0; i < n; i++)
    {
      pS[i].x = 0;
      pS[i].y = 0;
      for (j = 0; j < M; j++)
        {
          pS[i].x += (a[i]+b[i]);
          pS[i].y += (a[i]-b[i]);
        }
    }
}

Here is a snippet from the vectorizer dump file:

u3.c:17: note: dependence distance modulo vf == 0 between pS[i_37].x and pS[i_37].x
u3.c:17: note: dependence distance = 0.
u3.c:17: note: accesses have the same alignment.
u3.c:17: note: dependence distance modulo vf == 0 between pS[i_37].y and pS[i_37].y
u3.c:17: note: === vect_analyze_data_ref_accesses ===
u3.c:17: note: Detected interleaving of size 2
u3.c:17: note: not vectorized: complicated access pattern.
u3.c:17: note: bad data access.
(get_loop_exit_condition
...
        base_address: &pS
        offset from base address: (<unnamed-signed:32>) ((unsigned int) i_37 * 8)
        constant offset from base address: 0
        step: 0
        aligned to: 8
        base_object: pS[0].x
        symbol tag: pS
FAILED as dr address is invariant
u3.c:22: note: not vectorized: unhandled data-ref
u3.c:22: note: bad data references.
u3.c:14: note: vectorized 0 loops in function.

[Zdenek's patch, which extends lim, can do the store motion here and thus
help the vectorizer:
http://gcc.gnu.org/ml/gcc-patches/2007-01/msg02331.html
but AFAICT it is not applicable to current mainline.]

--
           Summary: missed optimization with store motion (vectorizer)
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: eres at il dot ibm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34195
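
For illustration, the store motion referred to above corresponds roughly to the hand transformation sketched below. The address of pS[i].x and pS[i].y is invariant in the inner loop (step: 0 in the dump), so the accumulation can be kept in scalar temporaries and stored once after the inner loop. The name foo_sm and the temporaries x and y are hypothetical and do not come from the original testcase or from the patch. This is only a sketch of the intended effect; the dump above does not show whether the resulting loop is then vectorized.

```c
#define M 10

struct S
{
  float x;
  float y;
} pS[100];

float a[1000];
float b[1000];

/* Original testcase: pS[i].x and pS[i].y are loaded and stored on
   every iteration of the inner loop, at an address that does not
   depend on j.  */
void
foo (int n)
{
  int i, j;

  for (i = 0; i < n; i++)
    {
      pS[i].x = 0;
      pS[i].y = 0;
      for (j = 0; j < M; j++)
        {
          pS[i].x += (a[i]+b[i]);
          pS[i].y += (a[i]-b[i]);
        }
    }
}

/* Hypothetical hand-transformed version (not part of the report):
   the invariant memory references are replaced by the scalar
   temporaries x and y, and each store is sunk to after the inner
   loop.  */
void
foo_sm (int n)
{
  int i, j;

  for (i = 0; i < n; i++)
    {
      float x = 0;
      float y = 0;

      for (j = 0; j < M; j++)
        {
          x += (a[i]+b[i]);
          y += (a[i]-b[i]);
        }
      pS[i].x = x;
      pS[i].y = y;
    }
}
```

With this form the inner loop no longer contains a store to an invariant address, which is the data-ref the dump reports as unhandled.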