On Thu, Aug 02, 2018 at 10:50:58PM +0200, Allan Sandfeld Jensen wrote:
> Here is the version with __builtin_shuffle. It might be more expectable -O0, 
> but it is also uglier.

I don't find anything ugly about it, except for the formatting glitches
(missing space before (, overlong line, and useless __extension__).
Improving the code generated for __builtin_shuffle is desirable too.

> --- a/gcc/config/i386/xmmintrin.h
> +++ b/gcc/config/i386/xmmintrin.h
> @@ -1011,7 +1011,8 @@ _mm_storer_ps (float *__P, __m128 __A)
>  extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_move_ss (__m128 __A, __m128 __B)
>  {
> -  return (__m128) __builtin_ia32_movss ((__v4sf)__A, (__v4sf)__B);
> +  return __extension__ (__m128) __builtin_shuffle((__v4sf)__A, (__v4sf)__B,
> +                                                  (__attribute__((__vector_size__ (16))) int){4, 1, 2, 3});

And obviously use __v4si here instead of
__attribute__((__vector_size__ (16))) int.
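
To illustrate what the {4, 1, 2, 3} selector does, here is a minimal
standalone sketch using plain GCC vector extensions rather than the
xmmintrin.h types. The names v4sf_t, v4si_t and move_low are made up for
this example and are not what the header uses; it only assumes a GCC
that supports __builtin_shuffle.

```c
/* Sketch only: generic GCC vector extensions, no <xmmintrin.h> needed.
   The typedef and function names are illustrative, not GCC's own.  */
typedef float v4sf_t __attribute__ ((__vector_size__ (16)));
typedef int v4si_t __attribute__ ((__vector_size__ (16)));

/* In the two-operand form of __builtin_shuffle, selector values 0-3
   pick elements from A and 4-7 pick elements from B.  The selector
   {4, 1, 2, 3} therefore yields { B[0], A[1], A[2], A[3] }, which is
   the same result movss produces.  */
static inline v4sf_t
move_low (v4sf_t a, v4sf_t b)
{
  return __builtin_shuffle (a, b, (v4si_t) {4, 1, 2, 3});
}
```

Using a named integer vector type for the selector keeps the line short,
which is the same reason for preferring __v4si in the patch.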

        Jakub
