http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636
             Bug #: 53636
           Summary: SLP may create invalid unaligned memory accesses
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: uweig...@gcc.gnu.org

The following test case:

    void test (unsigned char *dst)
    {
      short tmp[11 * 8], *tptr;
      int i;

      fill (tmp);

      tptr = tmp;
      for (i = 0; i < 8; i++)
        {
          dst[0] = (-tptr[0] + 9 * tptr[0 + 1] + 9 * tptr[0 + 2] - tptr[0 + 3]) >> 7;
          dst[1] = (-tptr[1] + 9 * tptr[1 + 1] + 9 * tptr[1 + 2] - tptr[1 + 3]) >> 7;
          dst[2] = (-tptr[2] + 9 * tptr[2 + 1] + 9 * tptr[2 + 2] - tptr[2 + 3]) >> 7;
          dst[3] = (-tptr[3] + 9 * tptr[3 + 1] + 9 * tptr[3 + 2] - tptr[3 + 3]) >> 7;
          dst[4] = (-tptr[4] + 9 * tptr[4 + 1] + 9 * tptr[4 + 2] - tptr[4 + 3]) >> 7;
          dst[5] = (-tptr[5] + 9 * tptr[5 + 1] + 9 * tptr[5 + 2] - tptr[5 + 3]) >> 7;
          dst[6] = (-tptr[6] + 9 * tptr[6 + 1] + 9 * tptr[6 + 2] - tptr[6 + 3]) >> 7;
          dst[7] = (-tptr[7] + 9 * tptr[7 + 1] + 9 * tptr[7 + 2] - tptr[7 + 3]) >> 7;

          dst += 8;
          tptr += 11;
        }
    }

when built on ARM with

    -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -O -ftree-vectorize

creates code that uses a VLDR instruction to access unaligned memory,
which causes a Bus error at runtime.

The problem seems to be that the check in vect_compute_data_ref_alignment
is not enough for SLP. Even though SLP only considers a basic block, the
data-ref analysis still looks at innermost loops to compute scalar
evolutions. This results in concluding that the access "tptr[0]" is based
on "tmp", which is aligned to 8 bytes, using a step of 22 bytes.

The alignment check now only verifies that the *base* is aligned. This is
OK if we're actually vectorizing the loop. But in the SLP case, we really
need to verify instead that the access is aligned on *every* iteration
through the loop ...
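
To make the arithmetic concrete, here is a minimal sketch of the difference between a base-only check and a check that holds on every iteration. It is not GCC code: the helper names are hypothetical, and the 8-byte access alignment requirement is an assumption made for illustration (the report only states that "tmp" is aligned to 8 bytes and that the step is 22 bytes).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the base-only test. It looks solely at the
   alignment of the base object and ignores the step. */
static bool base_is_aligned(size_t base_align, size_t vec_align)
{
    return base_align % vec_align == 0;
}

/* Hypothetical helper: a stricter test for the SLP case. The access
   base + i * step is aligned for every i only if the base is aligned
   and the step is itself a multiple of the required alignment. */
static bool aligned_every_iteration(size_t base_align, size_t step_bytes,
                                    size_t vec_align)
{
    return base_align % vec_align == 0 && step_bytes % vec_align == 0;
}

/* Count the iterations whose access offset (i * step_bytes from an
   aligned base) is not a multiple of vec_align. */
static int count_misaligned(size_t step_bytes, size_t vec_align, int iters)
{
    int misaligned = 0;
    for (int i = 0; i < iters; i++) {
        size_t offset = (size_t)i * step_bytes;
        if (offset % vec_align != 0)
            misaligned++;
    }
    return misaligned;
}
```

With the values from the test case (base alignment 8, step 22, 8 iterations) and the assumed 8-byte requirement, base_is_aligned accepts the access while aligned_every_iteration rejects it. The offsets modulo 8 are 0, 6, 4, 2, 0, 6, 4, 2, so only iterations 0 and 4 are aligned.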