https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102543

            Bug ID: 102543
           Summary: -march=cascadelake performs odd alignment peeling
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

For gcc.dg/torture/pr65270-1.c we peel for alignment so that a single load is
runtime-aligned, at the price of misaligning an aligned store + load pair,
because we have (skylake_cost):

  {6, 6, 6, 10, 20},                    /* cost of loading SSE register
                                           in 32bit, 64bit, 128bit, 256bit and
                                           512bit */
  {8, 8, 8, 12, 24},                    /* cost of storing SSE register
                                           in 32bit, 64bit, 128bit, 256bit and
                                           512bit */
  {6, 6, 6, 10, 20},                    /* cost of unaligned loads.  */
  {8, 8, 8, 8, 16},                     /* cost of unaligned stores.  */

which means that an unaligned store is cheaper than an aligned store for
%ymm (8 vs. 12) and even more so for %zmm (16 vs. 24)!??
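
To make the inversion explicit, here is a minimal sketch that copies the four
rows quoted above into plain arrays and computes the per-width cost of going
from an aligned to an unaligned access. The array and function names are
hypothetical and are not the field names GCC uses; only the numbers come from
the excerpt.

```c
#include <assert.h>

/* Index order follows the comments in the excerpt:
   0 = 32bit, 1 = 64bit, 2 = 128bit, 3 = 256bit (%ymm), 4 = 512bit (%zmm).  */
enum { N_WIDTHS = 5 };

/* Values copied from the skylake_cost rows quoted above.
   The array names are made up for this illustration.  */
static const int aligned_load_cost[N_WIDTHS]    = {6, 6, 6, 10, 20};
static const int aligned_store_cost[N_WIDTHS]   = {8, 8, 8, 12, 24};
static const int unaligned_load_cost[N_WIDTHS]  = {6, 6, 6, 10, 20};
static const int unaligned_store_cost[N_WIDTHS] = {8, 8, 8, 8, 16};

/* Extra cost of an unaligned load over an aligned one at width index W.  */
int load_misalign_penalty (int w)
{
  assert (w >= 0 && w < N_WIDTHS);
  return unaligned_load_cost[w] - aligned_load_cost[w];
}

/* Extra cost of an unaligned store over an aligned one at width index W.
   A negative result means the table rates the unaligned store as cheaper.  */
int store_misalign_penalty (int w)
{
  assert (w >= 0 && w < N_WIDTHS);
  return unaligned_store_cost[w] - aligned_store_cost[w];
}
```

With these numbers the load penalty is zero at every width, while the store
penalty is negative for indices 3 and 4, so any cost comparison based on this
table sees misaligning a %ymm or %zmm store as a gain rather than a loss.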
