Hello,

In late 2008 hardwarebug.org published a post about a wrong-code bug (http://hardwarebug.org/2008/11/28/codesourcery-fails-again/#more-83). The post included the following test case, which was vectorized incorrectly.
-----------------------
extern unsigned char dst[512] __attribute__((aligned(8)));
extern unsigned char src[512] __attribute__((aligned(8)));

void array_shift(void)
{
    int i;
    for (i = 0; i < 512; i++)
        dst[i] = src[i] >> 7;
}
-------------------

With GCC trunk of today, vectorization doesn't happen at all:

t.c:8: note: type of def: 3.
t.c:8: note: vect_is_simple_use: operand 7
t.c:8: note: vector/scalar shift/rotate found.
t.c:8: note: not worthwhile without SIMD support.
t.c:8: note: not vectorized: relevant stmt not supported: D.1963_4 = D.1962_3 >> 7;
t.c:8: note: bad operation or unsupported loop bound.
t.c:5: note: vectorized 0 loops in function.

Compiler is trunk r156595 for arm-elf.
Compiler options: "-mcpu=cortex-a8 -mfpu=neon -O3 -fdump-tree-vect-details"

--
           Summary: Missed vectorization on ARM NEON
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: steven at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43001
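
For anyone who wants to confirm that a given build of the test case computes the right values, whether or not the loop is vectorized, here is a minimal sketch of a scalar reference check. It is only an illustration. The arrays are defined locally instead of being declared extern so that the file links on its own, and fill_src() and count_mismatches() are hypothetical helpers that are not part of the original test case.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the reported test case, but defined here (not extern)
   so the file links on its own. This is an assumption of the sketch. */
unsigned char dst[512] __attribute__((aligned(8)));
unsigned char src[512] __attribute__((aligned(8)));

/* The loop from the report, unchanged. */
void array_shift(void)
{
    int i;
    for (i = 0; i < 512; i++)
        dst[i] = src[i] >> 7;
}

/* Hypothetical helper: fill src so that every byte value 0..255
   appears twice. */
void fill_src(void)
{
    int i;
    for (i = 0; i < 512; i++)
        src[i] = (unsigned char)(i & 0xff);
}

/* Hypothetical helper: count elements of dst that differ from the
   expected result. For an unsigned char, x >> 7 is 1 when the top
   bit is set and 0 otherwise, so the check tests that bit directly
   instead of repeating the shift. */
int count_mismatches(void)
{
    int i, bad = 0;
    for (i = 0; i < 512; i++) {
        unsigned char expect = (src[i] & 0x80) ? 1 : 0;
        if (dst[i] != expect)
            bad++;
    }
    return bad;
}
```

Calling fill_src(), then array_shift(), then count_mismatches() from a small main() should give 0 for a correct build. A nonzero count would point at wrong code for the shift, not at a missed optimization.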