Re: [PATCH 14/43] i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE

Uros Bizjak Mon, 11 Feb 2019 11:52:43 -0800

On Mon, Feb 11, 2019 at 8:08 PM H.J. Lu <[email protected]> wrote:
>
> On Sun, Feb 10, 2019 at 2:48 AM Uros Bizjak <[email protected]> wrote:
> >
> > On 2/10/19, H.J. Lu <[email protected]> wrote:
> > > Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE.
> > >
> > >       PR target/89021
> > >       * config/i386/mmx.md (sse_cvtps2pi): Add SSE emulation.
> > >       (sse_cvttps2pi): Likewise.
> >
> > It looks to me that this description is wrong. We don't have V4SF
> > modes here, but V2SF, so we have to fake 64bit load in case of MMX.
> > The cvtps2dq will access memory in true 128bit width, so this is
> > wrong.
> >
> > We have to fix the description to not fake wide mode.
> >
>
> What do you propose to implement
>
> __m64 _mm_cvtps_pi32 (__m128 __A);


Hm...

In your original patch, we *do* have V4SF memory access, but the
original insn accesses it in __m64 mode. This should be OK, but then
accessing this memory in __m128 mode should also be OK. So, on a more
detailed look, the original patch looks OK to me. Luckily, a false
alarm...
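
To illustrate why the narrow and the wide access give the same result here, a minimal scalar sketch in C follows. It is a portable model only, not the GCC pattern and not the real intrinsic: the name model_cvttps2pi is hypothetical, it models the truncating variant, and it assumes the inputs fit in int32_t (the hardware returns a special value for out-of-range inputs; a C cast does not). The model says nothing about how many bytes the hardware actually loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of cvttps2pi on a V4SF operand.
   The operand is declared as four floats (128 bits), but only
   lanes 0 and 1 contribute to the result.  Lanes 2 and 3 are
   ignored, whether or not the load that fetched them was
   64 or 128 bits wide.  */
static void model_cvttps2pi(const float src[4], int32_t dst[2])
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < 2; i++) {
        /* A C float-to-integer cast truncates toward zero.
           Assumption: src[i] is within the range of int32_t.  */
        dst[i] = (int32_t)src[i];
    }
}
```

Changing src[2] or src[3] never changes dst, which is the sense in which a __m64 access and a __m128 access to the same V4SF operand are interchangeable for this operation.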

>
> We also have
>
> (define_insn "sse2_cvtps2pd<mask_name>"
>   [(set (match_operand:V2DF 0 "register_operand" "=v")
>         (float_extend:V2DF
>           (vec_select:V2SF
>             (match_operand:V4SF 1 "vector_operand" "vm")
>             (parallel [(const_int 0) (const_int 1)]))))]
>   "TARGET_SSE2 && <mask_avx512vl_condition>"
>   "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
>
> These aren't new problems introduced by my MMX work.

This one is not problematic, since the instruction accesses memory in
__m64 mode, which is narrower than V4SFmode.
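
What the vec_select/float_extend pair in that pattern computes can be sketched the same way. Again, this is a scalar C model under stated assumptions, with a hypothetical name (model_cvtps2pd), and not the insn itself:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the sse2_cvtps2pd pattern:
   (vec_select:V2SF ... (parallel [0 1])) picks lanes 0 and 1 of
   the V4SF operand, and float_extend widens each to double.
   Only the low 64 bits of the operand are ever needed.  */
static void model_cvtps2pd(const float src[4], double dst[2])
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < 2; i++) {
        /* float to double conversion is exact in C.  */
        dst[i] = (double)src[i];
    }
}
```

Because the model reads only src[0] and src[1], a 64-bit access inside a 128-bit operand cannot touch memory the operand does not cover.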

Uros.

Re: [PATCH 14/43] i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE
