The vectorizer is too restrictive in how it decides how many iterations to peel from a loop in order to align a memory reference in that loop. It considers only the first (potentially) misaligned store it encounters in the loop. For this reason the testcases vect-multitypes-1.c, vect-multitypes-4.c and vect-iv-4.c don't get vectorized. For example (using a vector size of 16 bytes), in vect-multitypes-1.c we have:

    short sa[N], sb[N];
    int ia[N], ib[N];
    for (i = 0; i < n; i++)
      {
        ia[i+3] = ib[i];
        sa[i+3] = sb[i];
      }

The current peeling-for-alignment scheme considers only the 'ia[i+3]' access for peeling, and therefore examines only the option of using a peeling factor = (4-3)%4 = 1. This does not align the access 'sa[i+3]', for which we need to peel 5 iterations. As a result the loop doesn't get vectorized (because we currently can't handle misaligned stores unless we align them by peeling).

However, if we had considered the 'sa[i+3]' access for peeling as well, we would have examined the option of using a peeling factor = (8-3)%8 = 5, which would align both accesses (after 5 iterations the stores start at ia[8] and sa[8], i.e. at byte offsets 32 and 16 from the array bases), and would allow us to vectorize the loop.

So the vectorizer needs to be extended to consider more than one peeling factor.

--
           Summary: missed vectorization due to too strict
                    peeling-for-alignment policy
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dorit at il dot ibm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31946
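
The peeling-factor arithmetic in the example above can be checked with a small standalone C sketch. This is only an illustration of the idea of trying more than one candidate factor. It is not the vectorizer's code, and the names peel_factor, is_aligned_after, common_peel and struct access are hypothetical. It assumes every array base is itself aligned to the 16-byte vector size.

```c
#include <assert.h>

#define VECTOR_SIZE 16  /* bytes, as in the report */

/* Hypothetical description of an access x[i + offset]. */
struct access { int elem_size; int offset; };

/* Iterations to peel so that x[i + offset] becomes vector-aligned,
   assuming the array base is aligned to VECTOR_SIZE. */
static int peel_factor(int elem_size, int offset)
{
    assert(elem_size > 0 && VECTOR_SIZE % elem_size == 0);
    int nelems = VECTOR_SIZE / elem_size;   /* elements per vector */
    return (nelems - offset % nelems) % nelems;
}

/* Nonzero if x[i + offset] is aligned once `peel` iterations are peeled. */
static int is_aligned_after(int peel, int elem_size, int offset)
{
    return ((peel + offset) * elem_size) % VECTOR_SIZE == 0;
}

/* Try each access's own peeling factor as a candidate and return the
   first one that aligns every access, or -1 if none does. */
static int common_peel(const struct access *acc, int n)
{
    for (int c = 0; c < n; c++) {
        int peel = peel_factor(acc[c].elem_size, acc[c].offset);
        int ok = 1;
        for (int k = 0; k < n; k++)
            if (!is_aligned_after(peel, acc[k].elem_size, acc[k].offset))
                ok = 0;
        if (ok)
            return peel;
    }
    return -1;
}
```

With the two stores from vect-multitypes-1.c described as { {4, 3}, {2, 3} } (4-byte int and 2-byte short, both at index i+3), the candidate taken from the int access is 1 and is rejected because it leaves the short store misaligned. The candidate taken from the short access is 5 and aligns both.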