https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104441

            Bug ID: 104441
           Summary: [12 Regression] vzeroupper is placed at the wrong
                    place
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com, lili.cui at intel dot com
  Target Milestone: ---
            Target: x86-64

Created attachment 52375
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52375&action=edit
A testcase

When the attached testcase is compiled with -march=skylake -Wno-attributes, GCC 12 generates:

.L3:
        vmovd   (%rdi), %xmm0
        vmovd   (%rdi,%r13), %xmm1
        vpinsrd $1, (%rdi,%r12), %xmm1, %xmm1
        vpinsrd $1, (%rdi,%rsi), %xmm0, %xmm0
        vmovd   (%rax,%rbx), %xmm2
        vinserti128     $0x1, %xmm1, %ymm0, %ymm0
        vmovd   (%rax), %xmm1
        vpinsrd $1, (%rax,%rcx), %xmm1, %xmm1
        vpinsrd $1, (%rax,%r11), %xmm2, %xmm2
        addl    $4, %edx
        vinserti128     $0x1, %xmm2, %ymm1, %ymm1
        vpsadbw %ymm1, %ymm0, %ymm0
        vpaddd  %ymm0, %ymm3, %ymm0
        vmovdqa %ymm0, %ymm3
        addq    %r10, %rdi
        addq    %r9, %rax
        cmpl    %r8d, %edx
        jb      .L3
        vzeroupper   <<<<<<<<<<< Clears the upper 128 bits of all YMM registers.
        popq    %rbx
        popq    %r12
        vextracti128    $0x1, %ymm3, %xmm3  << The upper 128 bits of YMM3 are used.
        vpaddd  %xmm3, %xmm0, %xmm0
        popq    %r13
        vmovd   %xmm0, %eax
        popq    %rbp
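
To make the effect of the ordering concrete, here is a small C model of the two instructions involved. It is only a sketch and not the attached testcase or GCC code: the type ymm_model and all function names are made up for illustration, the register is modeled as eight 32-bit lanes, and the remainder of the horizontal reduction is collapsed into a plain sum of the low four lanes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model (not GCC code): a 256-bit YMM register as eight 32-bit lanes.
   Lanes 0-3 are the low 128 bits (the XMM view); lanes 4-7 are the upper
   128 bits. */
typedef struct { uint32_t lane[8]; } ymm_model;

/* Models the effect of vzeroupper on one register: bits 255:128 become zero. */
static ymm_model model_vzeroupper(ymm_model r)
{
    memset(&r.lane[4], 0, 4 * sizeof r.lane[0]);
    return r;
}

/* Models "vextracti128 $0x1, src, dst": the upper 128 bits of src land in
   the low 128 bits of dst; the rest of dst is zeroed. */
static ymm_model model_vextracti128_hi(ymm_model src)
{
    ymm_model dst = {{0}};
    memcpy(&dst.lane[0], &src.lane[4], 4 * sizeof dst.lane[0]);
    return dst;
}

/* Adds the low four lanes of a and b and returns their total, standing in
   for the rest of the horizontal reduction. */
static uint32_t sum_low_lanes(ymm_model a, ymm_model b)
{
    uint32_t total = 0;
    for (int i = 0; i < 4; i++)
        total += a.lane[i] + b.lane[i];
    return total;
}

/* Expected order: read the upper half first, then clear it. */
uint32_t reduce_extract_then_zero(ymm_model acc)
{
    ymm_model hi = model_vextracti128_hi(acc);
    acc = model_vzeroupper(acc);
    return sum_low_lanes(acc, hi);
}

/* Order seen in the listing above: clear first, then read the upper half. */
uint32_t reduce_zero_then_extract(ymm_model acc)
{
    acc = model_vzeroupper(acc);
    ymm_model hi = model_vextracti128_hi(acc);
    return sum_low_lanes(acc, hi);
}

/* With distinct values in both halves, the two orders disagree. */
void check_model(void)
{
    ymm_model acc = {{1, 2, 3, 4, 10, 20, 30, 40}};
    assert(reduce_extract_then_zero(acc) == 110);
    assert(reduce_zero_then_extract(acc) == 10);
}
```

In this model the partial sums held in the upper four lanes are lost when the clear comes first, which matches the wrong result expected from the generated code.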

This is triggered by

commit 9775e465c1fbfc32656de77c618c61acf5bd905d
Author: H.J. Lu <hjl.to...@gmail.com>
Date:   Tue Jul 27 07:46:04 2021 -0700

    x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register
