[Bug tree-optimization/94962] New: Suboptimal AVX2 code for _mm256_zextsi128_si256(_mm_set1_epi8(-1))

n...@self-evident.org Tue, 05 May 2020 14:41:43 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962


            Bug ID: 94962
           Summary: Suboptimal AVX2 code for
                    _mm256_zextsi128_si256(_mm_set1_epi8(-1))
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: n...@self-evident.org
  Target Milestone: ---

Background: https://stackoverflow.com/q/61601902/

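With AVX2 enabled, GCC emits an unnecessary "vmovdqa xmm0,xmm0" for the
following code. The move is redundant because a VEX-encoded write to xmm0
already zeroes the upper 128 bits of ymm0: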

    #include <immintrin.h>

    __m256i mask()
    {
        return _mm256_zextsi128_si256(_mm_set1_epi8(-1));
    }

Live example on godbolt: https://gcc.godbolt.org/z/PbsQDR

I have found no way to avoid this except by resorting to inline asm.
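
For illustration only, an inline-asm workaround of the kind mentioned above
might look like the sketch below. It is not taken from the report. The names
mask_intrinsic, mask_asm, both_masks_ok and masks_ok are hypothetical, the
choice of vpcmpeqd is an assumption, and the code relies on GCC-style extended
asm, the "x" operand modifier (print the xmm name of a ymm operand) and the
target attribute. The helper functions only confirm that both variants produce
the same value. They say nothing about which instructions the compiler emits.

```c
#include <immintrin.h>

/* Intrinsic form from the report: low 128 bits all ones, high 128 bits zero. */
__attribute__((target("avx2")))
static __m256i mask_intrinsic(void)
{
    return _mm256_zextsi128_si256(_mm_set1_epi8(-1));
}

/* Hypothetical workaround: one VEX-encoded vpcmpeqd on the xmm view of the
   result register. Comparing a register with itself yields all ones, and the
   VEX encoding zeroes bits 128 and up of the full register. */
__attribute__((target("avx2")))
static __m256i mask_asm(void)
{
    __m256i m;
    __asm__("vpcmpeqd %x0, %x0, %x0" : "=x"(m));
    return m;
}

/* Compare both variants byte by byte against the expected constant. */
__attribute__((target("avx2")))
static int both_masks_ok(void)
{
    const __m256i expected = _mm256_set_epi64x(0, 0, -1, -1);
    int a = _mm256_movemask_epi8(_mm256_cmpeq_epi8(mask_intrinsic(), expected));
    int b = _mm256_movemask_epi8(_mm256_cmpeq_epi8(mask_asm(), expected));
    return a == -1 && b == -1;   /* -1 means all 32 bytes matched */
}

/* Returns 1 if both variants match, 0 if either differs, and -1 if the CPU
   lacks AVX2, in which case nothing is executed. */
int masks_ok(void)
{
    if (!__builtin_cpu_supports("avx2"))
        return -1;
    return both_masks_ok();
}
```

Inline asm of this kind is opaque to the optimizer, so the compiler cannot
constant-fold or combine the mask with surrounding code. That is one reason a
fix in the compiler would be preferable to such a workaround.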
