http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58280

--- Comment #5 from Freddie Witherden <freddie at witherden dot org> ---
Thank you for this information.  As an alternative, would it be worth
considering a pragma along the lines of:

#pragma gcc aligned(32)

which would convey that "in the first iteration of the loop which follows, all
relevant variables can be taken as having 32-byte alignment."  This would
provide quite a nice way of allowing loops like the above to be fully
vectorized and would also avoid the need for explicit calls to
__builtin_assume_aligned.

ICC has a similar directive, but it applies only to the base pointers.  So it
would assume that "a" is aligned but not "a + i*ldim".
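
For comparison, here is a minimal sketch of what the explicit calls look like
today.  The function name, its parameters and the row-major layout are
hypothetical and chosen only for illustration; the sketch assumes that "a" is
32-byte aligned and that ldim is a multiple of 8 floats, so that every row
start "a + i*ldim" is 32-byte aligned as well.

```c
#include <stddef.h>

/* Hypothetical kernel: scale the first ncols entries of each row of a
   matrix stored with leading dimension ldim (counted in floats).
   Assumption: a is 32-byte aligned and ldim is a multiple of 8, so each
   row pointer a + i*ldim is also 32-byte aligned. */
void scale_rows(float *a, size_t nrows, size_t ncols, size_t ldim, float s)
{
    for (size_t i = 0; i < nrows; i++) {
        /* State the alignment of the row pointer, not just of the base. */
        float *row = __builtin_assume_aligned(a + i * ldim, 32);

        for (size_t j = 0; j < ncols; j++)
            row[j] *= s;
    }
}
```

The alignment hint has to be restated on the derived pointer in every outer
iteration, which is the boilerplate a loop-level pragma would remove.  If the
assumption on ldim does not hold, the hint is simply wrong and the behaviour
is undefined.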
