https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102021

            Bug ID: 102021
           Summary: Redundant mov instruction for broadcast.
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-*-* i?86-*-*

#include<immintrin.h>

__m256i
foo ()
{
  return _mm256_set1_epi16 (12);
}


foo():
        movabsq $3377751260921868, %rax
        vpbroadcastq    %rax, %ymm31
        vmovdqa64       %ymm31, %ymm0
        ret
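
For reference, the movabsq immediate is just the 16-bit value 12 replicated
into all four 16-bit lanes of a 64-bit word (0x000C000C000C000C), which is why
a single vpbroadcastq can fill the whole ymm register. A minimal sketch of
that arithmetic follows; the helper name splat16_to_u64 is made up for
illustration and is not part of GCC or of the test case above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replicate a 16-bit value into all four 16-bit
   lanes of a 64-bit word.  Multiplying by 0x0001000100010001 places a
   copy of v at bit offsets 0, 16, 32 and 48; the copies cannot overlap
   or carry into each other because v fits in 16 bits.  */
static uint64_t
splat16_to_u64 (uint16_t v)
{
  return (uint64_t) v * UINT64_C (0x0001000100010001);
}

/* Compile-time check that the immediate in the generated code is 12
   splatted across four 16-bit lanes.  */
static_assert (UINT64_C (12) * UINT64_C (0x0001000100010001)
               == UINT64_C (3377751260921868),
               "movabsq immediate is 12 replicated in each 16-bit lane");
```

So splat16_to_u64 (12) yields the same constant that is loaded into %rax
before the broadcast.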

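I guess the scratch SSE register somehow prevents LRA from merging the move
instructions, so the broadcast goes to %ymm31 and is then copied to %ymm0
instead of targeting %ymm0 directly.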

Maybe we should add a define_peephole2 for such cases if we still want to use
ix86_gen_scratch_sse_rtx.
