On Mon, Feb 11, 2019 at 8:08 PM H.J. Lu <[email protected]> wrote:
>
> On Sun, Feb 10, 2019 at 2:48 AM Uros Bizjak <[email protected]> wrote:
> >
> > On 2/10/19, H.J. Lu <[email protected]> wrote:
> > > Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE.
> > >
> > > PR target/89021
> > > * config/i386/mmx.md (sse_cvtps2pi): Add SSE emulation.
> > > (sse_cvttps2pi): Likewise.
> >
> > It looks to me that this description is wrong. We don't have V4SF
> > modes here, but V2SF, so we have to fake 64bit load in case of MMX.
> > The cvtps2dq will access memory in true 128bit width, so this is
> > wrong.
> >
> > We have to fix the description to not fake wide mode.
> >
>
> What do you propose to implement
>
> __m64 _mm_cvtps_pi32 (__m128 __A);

Hm...

In your original patch, we *do* have a V4SF memory operand, but the
original insn accesses it in __m64 mode. This should be OK, and since
the operand is a full __m128 object in memory, accessing it in __m128
mode should also be OK. So, on a more detailed look, the original
patch looks OK to me. Luckily, a false alarm...

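To illustrate the point about access width, here is a minimal scalar
sketch in plain C. It is not GCC code and not the patch itself; the
function names (cvtt_narrow, cvtt_wide, widths_agree) are hypothetical,
and it models only the truncating variant (cvttps2pi) for in-range
inputs. It assumes the operand really is a full four-float object in
memory, which is what makes the 16-byte read safe.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model, for illustration only.
   Both functions produce the two low lanes truncated to int32. */

/* MMX-style access: read only the low 64 bits (two floats). */
static void cvtt_narrow(const float *mem, int32_t out[2])
{
    float lo[2];
    memcpy(lo, mem, sizeof lo);      /* 8-byte load */
    out[0] = (int32_t)lo[0];
    out[1] = (int32_t)lo[1];
}

/* SSE-style emulation: read all 128 bits (four floats), keep the low
   two results.  The hardware would convert all four lanes; the model
   converts only the two that are kept, so that out-of-range values in
   the discarded upper lanes cannot matter here. */
static void cvtt_wide(const float *mem, int32_t out[2])
{
    float all[4];
    memcpy(all, mem, sizeof all);    /* 16-byte load */
    out[0] = (int32_t)all[0];
    out[1] = (int32_t)all[1];
}

/* Returns nonzero when the narrow and wide accesses agree. */
static int widths_agree(const float v[4])
{
    int32_t a[2], b[2];
    cvtt_narrow(v, a);
    cvtt_wide(v, b);
    return a[0] == b[0] && a[1] == b[1];
}
```

The wide read only touches bytes that belong to the same V4SF object,
so the result in the low two lanes is the same either way.
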
>
> We also have
>
> (define_insn "sse2_cvtps2pd<mask_name>"
> [(set (match_operand:V2DF 0 "register_operand" "=v")
> (float_extend:V2DF
> (vec_select:V2SF
> (match_operand:V4SF 1 "vector_operand" "vm")
> (parallel [(const_int 0) (const_int 1)]))))]
> "TARGET_SSE2 && <mask_avx512vl_condition>"
> "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
>
> These aren't new problems introduced by my MMX work.

This one is not problematic, since the instruction accesses memory in
__m64 mode, which is narrower than V4SFmode.

Uros.