ping

On Mon, Sep 13, 2021 at 11:19 PM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> >
> >
> > On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> > >    Currently for (vec_concat:M (vec_select op0 idx1)(vec_select op0 
> > > idx2)),
> > > optimizer wouldn't simplify if op0 has different mode with M, but that's 
> > > too
> > > restrict which will prevent below optimization, the condition can be 
> > > relaxed
> > > to op0 must have same inner mode with M.
> > >
> > > (set (reg:V2DF 87 [ xx ])
> > >      (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> > >              (parallel [
> > >                      (const_int 2 [0x2])
> > >                  ]))
> > >          (vec_select:DF (reg:V4DF 92)
> > >              (parallel [
> > >                      (const_int 3 [0x3])
> > >                  ]))))
> > >
> > >    Bootsrapped and regtested on x86_64-linux-gnu{-m32,}.
> > >    Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >       * simplify-rtx.c
> > >       (simplify_context::simplify_binary_operation_1): Relax
> > >       condition of simplifying (vec_concat:M (vec_select op0
> > >       index0)(vec_select op1 index1)) to allow different modes
> > >       between op0 and M, but have same inner mode.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >       * gcc.target/i386/vect-rebuild.c:
> > >       * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> > Funny, I was looking at something rather similar recently, but never
> > pushed on it because we were going to need too many entries in the
> > parallel selector.
> >
> > I'm not convinced that we need the inner mode to match anything.  As
> > long as the vec_concat's mode is twice the size of the vec_select modes
> > and the vec_select mode is <= the mode of its operands ISTM this is
> > fine.   We  might want the modes of the vec_select to match, but I don't
> > think that's strictly necessary either, they just need to be the same
> > size.  ie, we could have somethig like
> If they're different sizes, i.e, something like below should also be legal?
> (vec_concat:V8SF (vec_select:V2SF (reg:V16SF)) (vec_select:V6SF (reg:V16SF)))
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
> >
> > I'm not sure if that level of generality is useful though.  If we want
> > the modes of the vec_selects to match I think we could still support
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
> >
> > Thoughts?
> >
> > jeff
> >
> > Jeff
> >
> >
> > > ---
> > >   gcc/simplify-rtx.c                            |  3 ++-
> > >   .../gcc.target/i386/avx512f-vect-rebuild.c    | 21 +++++++++++++++++++
> > >   gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> > >   3 files changed, 24 insertions(+), 2 deletions(-)
> > >   create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > >
> > > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > > index ebad5cb5a79..16286befd79 100644
> > > --- a/gcc/simplify-rtx.c
> > > +++ b/gcc/simplify-rtx.c
> > > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 
> > > (rtx_code code,
> > >       if (GET_CODE (trueop0) == VEC_SELECT
> > >           && GET_CODE (trueop1) == VEC_SELECT
> > >           && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > > -         && GET_MODE (XEXP (trueop0, 0)) == mode)
> > > +         && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > > +            == GET_MODE_INNER(mode))
> > >         {
> > >           rtx par0 = XEXP (trueop0, 1);
> > >           rtx par1 = XEXP (trueop1, 1);
> > > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
> > > b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > > new file mode 100644
> > > index 00000000000..aef6855aa46
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > > +
> > > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > > +
> > > +v2df h (v4df x)
> > > +{
> > > +  v2df xx = { x[2], x[3] };
> > > +  return xx;
> > > +}
> > > +
> > > +v4df f2 (v4df x)
> > > +{
> > > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > > +  return xx;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not "unpck" } } */
> > > +/* { dg-final { scan-assembler-not "valign" } } */
> > > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 
> > > 1 } } */
> > > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
> > > b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > index 570967f6b5c..8e85b98bf1d 100644
> > > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > @@ -30,4 +30,4 @@ v2df h (v4df x)
> > >
> > >   /* { dg-final { scan-assembler-not "unpck" } } */
> > >   /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
> > > -/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */
> > > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 
> > > 1 } } */
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

Reply via email to