The vectorizer is too restrictive in how it chooses the number of iterations
to peel from a loop in order to align a memory reference in that loop. It
considers only the first (potentially) misaligned store it encounters in the
loop. For this reason the testcases vect-multitypes-1.c, vect-multitypes-4.c
and vect-iv-4.c don't get vectorized. For example (using a vector size of 16
bytes), in vect-multitypes-1.c we have:

short sa[N], sb[N];
int ia[N], ib[N];
for (i = 0; i < n; i++) {
  ia[i+3] = ib[i];
  sa[i+3] = sb[i];
}

The current peeling-for-alignment scheme considers only the 'ia[i+3]' access
for peeling, and therefore examines only a peeling factor of (4-3)%4 = 1.
This does not align the access 'sa[i+3]', for which we need to peel 5
iterations. As a result the loop doesn't get vectorized (because we currently
can't handle misaligned stores unless we align them by peeling). However, if
we had considered the 'sa[i+3]' access as well for peeling, we would have
examined a peeling factor of (8-3)%8 = 5, which aligns both accesses (after
5 peeled iterations both stores start at element 8, i.e. at byte offsets 32
and 16 from their array bases) and would allow us to vectorize the loop. So
the vectorizer needs to be extended to consider more than one peeling factor.
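
The arithmetic above can be sketched as a small standalone C program
fragment. This is only an illustration of the proposed policy, not the
vectorizer's actual code: 'struct access', 'peel_factor',
'aligned_after_peel' and 'find_common_peel' are hypothetical names, and the
sketch assumes that every array base is aligned to the vector size and that
each access has the form base[i + offset].

```c
#include <assert.h>
#include <stddef.h>

#define VECTOR_SIZE 16  /* bytes, as in the example above */

/* Hypothetical description of one access of the form base[i + offset].
   Assumption: 'base' itself is aligned to VECTOR_SIZE.  */
struct access {
  size_t elem_size;  /* size of the element type, in bytes */
  size_t offset;     /* constant element offset added to i */
};

/* Number of scalar iterations to peel so that this access starts on a
   vector boundary.  */
static size_t peel_factor(struct access a)
{
  size_t nelems = VECTOR_SIZE / a.elem_size;  /* elements per vector */
  assert(nelems > 0);
  return (nelems - a.offset % nelems) % nelems;
}

/* Nonzero if the access is aligned once 'npeel' iterations are peeled.  */
static int aligned_after_peel(struct access a, size_t npeel)
{
  return ((a.offset + npeel) * a.elem_size) % VECTOR_SIZE == 0;
}

/* Try the peeling factor suggested by each access in turn, instead of only
   the first one.  Return the first factor that aligns all accesses, or -1
   if none of the candidates does.  */
static long find_common_peel(const struct access *accs, size_t n)
{
  for (size_t c = 0; c < n; c++) {
    size_t npeel = peel_factor(accs[c]);
    size_t k;
    for (k = 0; k < n; k++)
      if (!aligned_after_peel(accs[k], npeel))
        break;
    if (k == n)
      return (long) npeel;
  }
  return -1;
}
```

With the two stores from the example described as {4, 3} for 'ia[i+3]' and
{2, 3} for 'sa[i+3]' (assuming a 4-byte int and a 2-byte short), the first
candidate, 1, is rejected because it leaves 'sa[i+3]' misaligned, and the
second candidate, 5, is accepted.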


-- 
           Summary: missed vectorization due to too strict
                    peeling-for-alignment policy
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dorit at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31946
