https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66369
--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Marcus Kool from comment #8)
> Can you confirm that the code has
> return __builtin_ctzl(v);

__inline__ long find_pos32( unsigned char ch, mycharset32 set )
{
   __m256i regchx256;
   __m256i regset256;
   long v;

   regchx256 = _mm256_set1_epi8( ch );
   regset256 = _mm256_loadu_si256( (__m256i const *) set );
   v = (unsigned int) _mm256_movemask_epi8( _mm256_cmpeq_epi8(regchx256,regset256) );
   if (v != 0L)
      return (long) __builtin_ctzl( v );
   return -1;
}

> Thanks for the patch, but the required cast to unsigned int is
> counter-intuitive and it is likely that nobody will use this cast in their
> code and hence miss the optimisation. Isn't there a more elegant solution?

No.