Re: [PATCH][x86] Match movss and movsd "blend" instructions

Richard Biener Thu, 02 Aug 2018 02:19:40 -0700

On Thu, Aug 2, 2018 at 11:12 AM Allan Sandfeld Jensen
<li...@carewolf.com> wrote:
>
> On Wednesday, 1 August 2018 18:51:41 CEST Marc Glisse wrote:
> > On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> > >  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > >  _mm_move_sd (__m128d __A, __m128d __B)
> > >  {
> > > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> > >  }
> >
> > If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> > wonder if we should be explicit and use __builtin_shuffle instead of
> > relying on some forwprop pass to transform it. Maybe not, just asking. And
> > the answer need not even be the same for _mm_move_sd and _mm_move_ss.
>
> I wrote it this way because this pattern could later also be used for the
> other _ss intrinsics, such as _mm_add_ss, where a __builtin_shuffle could not.
> To match the other intrinsics, the logic that tries to match vector
> construction just needs to be extended to try merge patterns even if one of
> the subexpressions is not simple.


The question is what users expect, and what they actually get, when they use
intrinsics at -O0.

Richard.

> 'Allan
>
>
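
For reference, a minimal sketch of the two spellings discussed in this thread,
written against GCC's generic vector extensions rather than the real
emmintrin.h definitions. The typedefs (v2df, v2di, v4sf) and the function
names (move_sd_ctor, move_sd_shuffle, add_ss_ctor) are illustrative
assumptions, not names from the patch. The first two functions should compute
the same value; the third shows why the constructor form generalizes to
_mm_add_ss-style operations, where element 0 is computed and so cannot be
expressed as a pure shuffle. What code each form produces at -O0 is not
asserted here.

```c
/* Sketch only: illustrative stand-ins for the header's vector types.  */
typedef double    v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));
typedef float     v4sf __attribute__ ((vector_size (16)));

/* Constructor form, as in the patch: element 0 from b, element 1 from a.  */
static inline v2df
move_sd_ctor (v2df a, v2df b)
{
  return (v2df){ b[0], a[1] };
}

/* Explicit permute form.  Selector indices 0..1 pick from the first
   operand and 2..3 from the second, so { 2, 1 } means { b[0], a[1] }.  */
static inline v2df
move_sd_shuffle (v2df a, v2df b)
{
#ifdef __clang__
  /* Clang spells the builtin differently and takes constant indices.  */
  return __builtin_shufflevector (a, b, 2, 1);
#else
  const v2di sel = { 2, 1 };
  return __builtin_shuffle (a, b, sel);
#endif
}

/* Same constructor shape for a scalar add: element 0 is a sum, the
   remaining elements are passed through from a.  */
static inline v4sf
add_ss_ctor (v4sf a, v4sf b)
{
  return (v4sf){ a[0] + b[0], a[1], a[2], a[3] };
}
```

With a = {1.0, 2.0} and b = {3.0, 4.0}, both move_sd variants are expected to
yield {3.0, 2.0}.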
