https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102543
--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 13 Oct 2021, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102543
>
> --- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
> I'm curious why we need peeling for unaligned access, because unaligned
> access instructions should also be available for aligned addresses.
> Can't we just mark the mem_ref as unaligned (although this is fake, just
> to generate unaligned instructions for the back end only)?

The costing is not for movaps vs. movups but for movups on aligned vs.
unaligned storage, so to make the access fast the costing tells us that
the access has to actually be unaligned.

Anyhow, the vectorizer does not consider actively misaligning accesses
when all of them are known to be aligned. What happens is that if there
is at least one unaligned access, it evaluates the cost of aligning that
access vs. aligning the other accesses, and the bug makes it appear that
aligning a single access is cheaper than aligning multiple accesses
(even when those are already aligned and thus would require no peeling
at all).