On Fri, Feb 28, 2020 at 6:15 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Fri, Feb 28, 2020 at 4:16 PM Jeff Law <l...@redhat.com> wrote:
> >
> > On Thu, 2020-02-27 at 06:50 -0800, H.J. Lu wrote:
> > >
> > > How about this?  If it looks OK, I will post the whole patch set.
> > It's better.  I'm guessing the two cases that were previously handled with
> > vextract/vbroadcast aren't supposed to happen?  They're caught here IIUC:
> >
> > > +  /* NB: To move xmm16-xmm31/ymm16-ymm31 registers without AVX512VL,
> > > +     we can only use zmm register move without memory operand.  */
> > > +   if (evex_reg_p
> > > +       && !TARGET_AVX512VL
> > > +       && GET_MODE_SIZE (mode) < 64)
> > > +     {
> > > +       if (memory_operand (operands[0], mode)
> > > +        || memory_operand (operands[1], mode))
> > > +     gcc_unreachable ();
> > >
> >
> > If they truly can't happen, that's fine.  My worry is I don't see changes to
> > the operand predicates or constraints which would avoid this case.   Is it
> > prevented by the mode iterator on the operands?  Again, just want to make 
> > sure
> > I understand why the vextract/vbroadcast stuff isn't in the new code.
>
> There are no GCC testcases to show that they are actually ever used.   That is
> why I removed them and added gcc_unreachable ().

This is covered by the testcases I added:

[hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
#include <x86intrin.h>

extern __m128 d;

void
foo1 (__m128 x)
{
  register __m128 xmm16 __asm ("xmm16") = x;
  asm volatile ("" : "+v" (xmm16));
  d = xmm16;
}
[hjl@gnu-cfl-2 gcc]$ gcc -O2 -march=skylake-avx512  /tmp/x.c -S
[hjl@gnu-cfl-2 gcc]$ gcc -O2 -march=skylake-avx512 -mno-avx512vl  /tmp/x.c -S
/tmp/x.c: In function ‘foo1’:
/tmp/x.c:8:19: error: register specified for ‘xmm16’ isn’t suitable
for data type
    8 |   register __m128 xmm16 __asm ("xmm16") = x;
      |                   ^~~~~
[hjl@gnu-cfl-2 gcc]$

GCC doesn't allow xmm16-xmm31/ymm16-ymm31 without AVX512VL since
ix86_hard_regno_mode_ok has

     /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
      if (TARGET_AVX512VL
          && (mode == OImode
              || mode == TImode
              || VALID_AVX256_REG_MODE (mode)
              || VALID_AVX512VL_128_REG_MODE (mode)))
        return true;

      /* xmm16-xmm31 are only available for AVX-512.  */
      if (EXT_REX_SSE_REGNO_P (regno))
        return false;

The vextract/vbroadcast stuff is dead code.

> > I'm doing a little assuming that the <ssescalarsize> bits in the old code 
> > are
> > mapped correctly to the 32/64 suffixes on the opcodes in the new version.
> >
> > I'm also assuming that mapping of "size" in the argument to ix86_get_ssemov 
> > to
> > the operand modifiers g, t, and x are right.  I'm guessing the operand
> > modifiers weren't needed in the original because we had the actual operand 
> > and
> > could look at it to get the right modifier.  In the evex, but not avx512vl 
> > case
> > those are forced to a g modifier which seems to match the original.
> >
> > Are we going to need further refinements to 
> > ix86_output_ssemov/ix86_get_ssemov?
> > If so, then I'd suggest the next patch be those patterns which don't require
> > further refinements to ix86_output_ssemov.
>
> 4 patches don't require changes in ix86_output_ssemov/ix86_get_ssemov:
>
> https://gitlab.com/x86-gcc/gcc/-/commit/426f2464abb80b97b8533f9efa15bbe72e6aa888
> https://gitlab.com/x86-gcc/gcc/-/commit/ec5b40d77f7a4424935275f1a7ccedbce83b6f54
> https://gitlab.com/x86-gcc/gcc/-/commit/92fdd98234984f86b66fb5403dd828661cd7999f
> https://gitlab.com/x86-gcc/gcc/-/commit/f8fa5e571caf6740b36d042d631b4ace11683cd7
>
> I can combine them into a single patch.
>
> Other 5 patches contain a small change to  ix86_output_ssemov:
>
> https://gitlab.com/x86-gcc/gcc/-/commit/b1746392e1d350d689a80fb71b2c72f909c20f30
> https://gitlab.com/x86-gcc/gcc/-/commit/14c3cbdbdcc36fa1edea4572b89a039726a4e2bc
> https://gitlab.com/x86-gcc/gcc/-/commit/69c8c928b26242116cc261a9d2f6b1265218f1d3
> https://gitlab.com/x86-gcc/gcc/-/commit/04335f582f0b281d5f357185d154087997fd7cfd
> https://gitlab.com/x86-gcc/gcc/-/commit/64f6a5d6d3405331d9c02aaae0faccf449d6647a
>
> Should I made the change and submit them for review?

I am preparing the new patch set.

> > If no further refinements to ix86_output_ssemov/ix86_get_ssemov are 
> > required,
> > then I think you can just send the rest of the pattern changes in a single
> > unit.
> >
> > jeff
> >

-- 
H.J.

Reply via email to