On Thu, Aug 2, 2018 at 11:12 AM Allan Sandfeld Jensen <li...@carewolf.com> wrote:
>
> On Wednesday, 1 August 2018 18:51:41 CEST Marc Glisse wrote:
> > On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> > > extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,
> > > __artificial__))
> > > _mm_move_sd (__m128d __A, __m128d __B)
> > > {
> > > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> > > }
> >
> > If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> > wonder if we should be explicit and use __builtin_shuffle instead of
> > relying on some forwprop pass to transform it. Maybe not, just asking. And
> > the answer need not even be the same for _mm_move_sd and _mm_move_ss.
>
> I wrote it this way because this pattern could later also be used for the
> other _ss intrinsics, such as _mm_add_ss, where a __builtin_shuffle could not.
> To match the other intrinsics, the logic that tries to match vector
> construction just needs to be extended to try merge patterns even if one of
> the subexpressions is not simple.

The question is what users expect and get when they use -O0 with intrinsics?

Richard.

> 'Allan
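
For reference, the two forms discussed above can be written side by side
with the generic GCC vector extensions. This is only a sketch: the names
v2df, v2di, move_sd_construct and move_sd_shuffle are made up for
illustration and are not the identifiers used in emmintrin.h, and the
__builtin_shuffle variant is guarded because that builtin is assumed to be
available only in GCC proper.

```c
/* Generic 128-bit vectors; stand-ins for __v2df and a matching mask type. */
typedef double v2df __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));

/* Element-wise construction, mirroring the shape of the patched
   _mm_move_sd: low element taken from b, high element taken from a. */
static inline v2df move_sd_construct(v2df a, v2df b)
{
    return (v2df){ b[0], a[1] };
}

#if defined(__GNUC__) && !defined(__clang__)
/* Explicit two-input shuffle. Indices 0..1 select from a and 2..3 select
   from b, so the mask {2, 1} yields { b[0], a[1] }. */
static inline v2df move_sd_shuffle(v2df a, v2df b)
{
    const v2di mask = { 2, 1 };
    return __builtin_shuffle(a, b, mask);
}
#endif
```

Both functions return the same value for the same inputs. What each one
expands to at -O0 is exactly the open question, so nothing is claimed here
about the generated code.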