On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> >    Currently for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)),
> > optimizer wouldn't simplify if op0 has different mode with M, but that's too
> > restrict which will prevent below optimization, the condition can be relaxed
> > to op0 must have same inner mode with M.
> >
> > (set (reg:V2DF 87 [ xx ])
> >      (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> >              (parallel [
> >                      (const_int 2 [0x2])
> >                  ]))
> >          (vec_select:DF (reg:V4DF 92)
> >              (parallel [
> >                      (const_int 3 [0x3])
> >                  ]))))
> >
> >    Bootsrapped and regtested on x86_64-linux-gnu{-m32,}.
> >    Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >       * simplify-rtx.c
> >       (simplify_context::simplify_binary_operation_1): Relax
> >       condition of simplifying (vec_concat:M (vec_select op0
> >       index0)(vec_select op1 index1)) to allow different modes
> >       between op0 and M, but have same inner mode.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/i386/vect-rebuild.c:
> >       * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> Funny, I was looking at something rather similar recently, but never
> pushed on it because we were going to need too many entries in the
> parallel selector.
>
> I'm not convinced that we need the inner mode to match anything.  As
> long as the vec_concat's mode is twice the size of the vec_select modes
> and the vec_select mode is <= the mode of its operands ISTM this is
> fine.   We  might want the modes of the vec_select to match, but I don't
> think that's strictly necessary either, they just need to be the same
> size.  ie, we could have somethig like
If they're different sizes, i.e, something like below should also be legal?
(vec_concat:V8SF (vec_select:V2SF (reg:V16SF)) (vec_select:V6SF (reg:V16SF)))
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
>
> I'm not sure if that level of generality is useful though.  If we want
> the modes of the vec_selects to match I think we could still support
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
>
> Thoughts?
>
> jeff
>
> Jeff
>
>
> > ---
> >   gcc/simplify-rtx.c                            |  3 ++-
> >   .../gcc.target/i386/avx512f-vect-rebuild.c    | 21 +++++++++++++++++++
> >   gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> >   3 files changed, 24 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> >
> > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > index ebad5cb5a79..16286befd79 100644
> > --- a/gcc/simplify-rtx.c
> > +++ b/gcc/simplify-rtx.c
> > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 
> > (rtx_code code,
> >       if (GET_CODE (trueop0) == VEC_SELECT
> >           && GET_CODE (trueop1) == VEC_SELECT
> >           && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > -         && GET_MODE (XEXP (trueop0, 0)) == mode)
> > +         && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > +            == GET_MODE_INNER(mode))
> >         {
> >           rtx par0 = XEXP (trueop0, 1);
> >           rtx par1 = XEXP (trueop1, 1);
> > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > new file mode 100644
> > index 00000000000..aef6855aa46
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > +
> > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > +
> > +v2df h (v4df x)
> > +{
> > +  v2df xx = { x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +v4df f2 (v4df x)
> > +{
> > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "unpck" } } */
> > +/* { dg-final { scan-assembler-not "valign" } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > index 570967f6b5c..8e85b98bf1d 100644
> > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > @@ -30,4 +30,4 @@ v2df h (v4df x)
> >
> >   /* { dg-final { scan-assembler-not "unpck" } } */
> >   /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
> > -/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
>


-- 
BR,
Hongtao

Reply via email to