https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102021
Bug ID: 102021
Summary: Redundant mov instruction for broadcast.
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-*-* i?86-*-*

#include <immintrin.h>
__m256i
foo ()
{
  return _mm256_set1_epi16 (12);
}

foo():
        movabsq $3377751260921868, %rax
        vpbroadcastq %rax, %ymm31
        vmovdqa64 %ymm31, %ymm0
        ret

The vmovdqa64 is redundant: vpbroadcastq could write %ymm0 directly.

I guess the scratch SSE register somehow prevents LRA from merging the move
instructions. Maybe we should add a define_peephole2 for these if we still want
to use ix86_gen_scratch_sse_rtx.
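For readers wondering where the movabsq immediate comes from: it appears to be the 16-bit value 12 replicated into all four 16-bit lanes of a 64-bit word, which vpbroadcastq then copies into every 64-bit element of the vector. A minimal sketch that checks the arithmetic, with no AVX needed; replicate_u16 is a hypothetical helper written for illustration and is not part of GCC:

```c
#include <assert.h>
#include <stdint.h>

/* Replicate a 16-bit value into all four 16-bit lanes of a 64-bit word.
   Hypothetical helper for illustration only, not a GCC function.  */
static inline uint64_t
replicate_u16 (uint16_t v)
{
  /* Multiplying by 0x0001000100010001 copies v into bits 0-15, 16-31,
     32-47 and 48-63; the product cannot overflow 64 bits.  */
  return (uint64_t) v * UINT64_C (0x0001000100010001);
}

/* Compile-time check that the decimal immediate in the listing is
   0x000C000C000C000C, i.e. 12 in each 16-bit lane.  */
static_assert (UINT64_C (3377751260921868) == UINT64_C (0x000C000C000C000C),
               "movabsq immediate is 12 replicated in four 16-bit lanes");
```

So loading the constant and broadcasting it are both needed here; only the final register-to-register copy is in question.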