https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102543
            Bug ID: 102543
           Summary: -march=cascadelake performs odd alignment peeling
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

For gcc.dg/torture/pr65270-1.c we choose to misalign an aligned store + load
combo for runtime aligning a single load because we have (skylake_cost):

  {6, 6, 6, 10, 20},    /* cost of loading SSE register
                           in 32bit, 64bit, 128bit, 256bit and 512bit */
  {8, 8, 8, 12, 24},    /* cost of storing SSE register
                           in 32bit, 64bit, 128bit, 256bit and 512bit */
  {6, 6, 6, 10, 20},    /* cost of unaligned loads.  */
  {8, 8, 8, 8, 16},     /* cost of unaligned stores.  */

which means that an unaligned store is cheaper than an aligned store for %ymm
and even more so for %zmm!??
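
To make the asymmetry concrete, here is a minimal sketch that just compares the four rows quoted above. The array and function names are made up for illustration and are not the field names GCC uses in its cost structures; only the numbers come from this report.

```c
#include <assert.h>

/* Rows copied from the skylake_cost excerpt in this report.
   Index 0..4 corresponds to 32, 64, 128, 256 and 512 bit.
   All identifiers here are hypothetical, not GCC's own.  */
enum { N_WIDTHS = 5 };
static const int aligned_load[N_WIDTHS]    = {6, 6, 6, 10, 20};
static const int aligned_store[N_WIDTHS]   = {8, 8, 8, 12, 24};
static const int unaligned_load[N_WIDTHS]  = {6, 6, 6, 10, 20};
static const int unaligned_store[N_WIDTHS] = {8, 8, 8, 8, 16};

/* How much cheaper the unaligned store is than the aligned one.
   A positive result means the table rewards misaligning the store.  */
int store_discount(int i)
{
  assert(i >= 0 && i < N_WIDTHS);
  return aligned_store[i] - unaligned_store[i];
}

/* Same comparison for loads.  */
int load_discount(int i)
{
  assert(i >= 0 && i < N_WIDTHS);
  return aligned_load[i] - unaligned_load[i];
}

/* Number of widths at which an unaligned store is costed below
   an aligned store.  */
int count_cheaper_unaligned_stores(void)
{
  int n = 0;
  for (int i = 0; i < N_WIDTHS; i++)
    if (store_discount(i) > 0)
      n++;
  return n;
}
```

With these numbers, store_discount is 4 at 256 bit (%ymm) and 8 at 512 bit (%zmm), while load_discount is 0 at every width. So a cost comparison based only on these rows sees a gain from making the store unaligned and nothing lost on the load side.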