On Wed, 28 May 2025, Icen Zeyada wrote:

> Hi Richard,
> I've implemented some of your suggested changes, but I'm not entirely sure 
> there's an elegant way to handle the second one:
> > "So here you'd want to verify we can do LT_EXPR for the types involved, and 
> > the cases which simplify to constant_boolean_node do not need any such 
> > check. Possibly the same issue applies to the cases below; I did not 
> > verify."
> Most of those expressions are selected from `code1` or `code2` in the switch 
> statements, while the rest—like the example you mentioned—are their folded or 
> simplified forms (e.g., `NE` and `LE` becoming `LT`). How can I determine 
> those expressions at the start of the simplification?
> Or are you suggesting that `expand_vec_cmp_expr_p` should be distributed 
> within the functions—that is, inside the conditionals that decide which 
> expression to return? So we would end up with something like:
> ```
> (if (code1 == NE_EXPR
>      && code2 == LE_EXPR
>      && cmp == 0
>      && (allbits
>          || (VECTOR_BOOLEAN_TYPE_P (type)
>              && expand_vec_cmp_expr_p (TREE_TYPE (@1), type, LT_EXPR))))
>  (lt @c0 (convert @1)))
> ```
> ...applied across all expressions?

Yes, this is what I would suggest.
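
For readers following the thread, here is a small illustration (not part
of the patch) of why the check has to name the comparison that is actually
emitted. For vector operands, `(x != y) & (x <= y)` is lane-wise the same
as `x < y`, so the simplified form is an `LT_EXPR` even though neither
input comparison is. The sketch assumes GCC or Clang vector extensions and
reuses the `v4i` typedef from the testcase; the helper names
`all_lanes_equal` and `ne_and_le_matches_lt` and the sample values are
hypothetical.

```c
#include <stddef.h>

/* GCC/Clang vector extension; same typedef as the testcase in the patch.  */
typedef int v4i __attribute__((vector_size(4 * sizeof(int))));

/* Hypothetical helper: return 1 if every lane of A equals the matching
   lane of B, 0 otherwise.  */
static int
all_lanes_equal (v4i a, v4i b)
{
  for (size_t i = 0; i < 4; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}

/* Hypothetical check: return 1 if (x != y) & (x <= y) matches (x < y)
   lane by lane for one set of sample inputs.  Vector comparisons yield
   -1 (all bits set) for true lanes and 0 for false lanes.  */
int
ne_and_le_matches_lt (void)
{
  v4i x = { 1, 5, -3, 7 };
  v4i y = { 2, 5, -4, 7 };

  v4i combined = (x != y) & (x <= y);	/* the NE/LE pair before folding */
  v4i direct = x < y;			/* the LT form the fold produces */

  return all_lanes_equal (combined, direct);
}
```

Calling `ne_and_le_matches_lt ()` from a small `main` should return 1 on
a compiler that supports these extensions. This only checks the algebraic
identity for one input; whether the target can expand the resulting vector
`LT_EXPR` is a separate question, which is what the
`expand_vec_cmp_expr_p` guard above is for.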

Richard.

> Kind Regards,
> Icen
> 
> ________________________________
> From: Richard Biener <rguent...@suse.de>
> Sent: 27 May 2025 13:47
> To: Icen Zeyada <icen.zeya...@arm.com>
> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; jeffreya...@gmail.com 
> <jeffreya...@gmail.com>; i...@airs.com <i...@airs.com>; Richard Earnshaw 
> <richard.earns...@arm.com>; pins...@gmail.com <pins...@gmail.com>; Victor Do 
> Nascimento <victor.donascime...@arm.com>; Tamar Christina 
> <tamar.christ...@arm.com>
> Subject: Re: [PATCH v3 2/2] gimple-fold: extend vector simplification to 
> match scalar bitwise optimizations [PR119196]
> 
> On Wed, 21 May 2025, Icen Zeyada wrote:
> 
> >     Generalize existing scalar gimple_fold rules to apply the same
> >     bitwise comparison simplifications to vector types.  Previously, an
> >     expression like
> >
> >         (x < y) && (x > y)
> >
> >     would fold to `false` if x and y are scalars, but equivalent vector
> >     comparisons were left untouched.  This patch enables folding of
> >     patterns of the form
> >
> >         (cmp x y) bit_and (cmp x y)
> >         (cmp x y) bit_ior (cmp x y)
> >         (cmp x y) bit_xor (cmp x y)
> >
> >     for vector operands as well, ensuring consistent optimization across
> >     all data types.
> >
> >     PR tree-optimization/119196
> >
> >     gcc/ChangeLog:
> >
> >       * match.pd: Allow scalar optimizations with bitwise AND/OR/XOR to 
> > apply to vectors.
> >
> >     gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/aarch64/vector-compare-5.c: Add new test for vector 
> > compare simplification.
> >
> > Signed-off-by: Icen Zeyada <icen.zeya...@arm.com>
> > ---
> >  gcc/match.pd                                  | 16 ++++-
> >  .../gcc.target/aarch64/vector-compare-5.c     | 67 +++++++++++++++++++
> >  2 files changed, 81 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/vector-compare-5.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 611f05ef9f9c..7a7df6aeb6c5 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3635,6 +3635,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >     (if ((TREE_CODE (@1) == INTEGER_CST
> >         && TREE_CODE (@2) == INTEGER_CST)
> >        || ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
> > +   || (VECTOR_TYPE_P (TREE_TYPE (@1))
> 
> Note this does not verify we are doing a vector compare; our IL
> allows vector ==/!= vector to scalar bool compares.  The appropriate
> test should be VECTOR_BOOLEAN_TYPE_P (type) to check for
> a vector compare (to gate an expand_vec_cmp_expr_p check)
> and for the bitwise_equal_p guard your change looks OK.
> 
> > +   && expand_vec_cmp_expr_p (TREE_TYPE (@1), type, code2))
> 
> The expand_vec_cmp_expr_p is misplaced - we generate not 'code2'
> but a comparison code depending on it, like for
> 
>       (if (code1 == NE_EXPR
>            && code2 == LE_EXPR
>            && cmp == 0
>            && allbits)
>        (lt @c0 (convert @1)))
> 
> so here you'd want to verify we can do LT_EXPR for the types involved
> and the cases which simplify to constant_boolean_node do not need
> any such check.  Possibly the same issue applies to the cases below,
> I did not verify.
> 
> Thanks,
> Richard.
> 
> >             || POINTER_TYPE_P (TREE_TYPE (@1)))
> >            && bitwise_equal_p (@1, @2)))
> >      (with
> > @@ -3712,6 +3714,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >    (if ((TREE_CODE (@1) == INTEGER_CST
> >        && TREE_CODE (@2) == INTEGER_CST)
> >         || ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
> > +       || (VECTOR_TYPE_P (TREE_TYPE (@1))
> > +       && expand_vec_cmp_expr_p (TREE_TYPE (@1), type, code2))
> >            || POINTER_TYPE_P (TREE_TYPE (@1)))
> >           && operand_equal_p (@1, @2)))
> >     (with
> > @@ -3762,6 +3766,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >     (if ((TREE_CODE (@1) == INTEGER_CST
> >         && TREE_CODE (@2) == INTEGER_CST)
> >        || ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
> > +   || (VECTOR_TYPE_P (TREE_TYPE (@1))
> > +   && expand_vec_cmp_expr_p (TREE_TYPE (@1), type, code2))
> >            || POINTER_TYPE_P (TREE_TYPE (@1)))
> >            && bitwise_equal_p (@1, @2)))
> >      (with
> > @@ -3885,7 +3891,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >        rcmp (ne le gt ne lt ge)
> >    (simplify
> >     (op:c (cmp1:c @0 @1) (cmp2 @0 @1))
> > -   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE 
> > (@0)))
> > +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +      || POINTER_TYPE_P (TREE_TYPE (@0))
> > +      || (VECTOR_TYPE_P (TREE_TYPE (@1))
> > +      && expand_vec_cmp_expr_p (TREE_TYPE (@0), type, rcmp)))
> >      (rcmp @0 @1)))))
> >
> >  /* Optimize (a CMP b) == (a CMP b)  */
> > @@ -3894,7 +3903,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >       rcmp (eq gt le eq ge lt)
> >   (simplify
> >    (eq:c (cmp1:c @0 @1) (cmp2 @0 @1))
> > -  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))
> > +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +             || POINTER_TYPE_P (TREE_TYPE (@0))
> > +      || (VECTOR_TYPE_P (TREE_TYPE (@0))
> > +      && expand_vec_cmp_expr_p (TREE_TYPE (@0), type,  rcmp)))
> >      (rcmp @0 @1))))
> >
> >  /* (type)([0,1]@a != 0) -> (type)a
> > diff --git a/gcc/testsuite/gcc.target/aarch64/vector-compare-5.c 
> > b/gcc/testsuite/gcc.target/aarch64/vector-compare-5.c
> > new file mode 100644
> > index 000000000000..a1a601dc1958
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/vector-compare-5.c
> > @@ -0,0 +1,67 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> > +/* { dg-additional-options "-fdump-tree-original-all" } */
> > +
> > +typedef int v4i __attribute__((vector_size(4*sizeof(int))));
> > +
> > +/* Ensure we can simplify `VEC_COND_EXPR(a OP1 b) OP2 VEC_COND_EXPR(a OP3 
> > b)`
> > + * into `VEC_COND_EXPR(a OP4 b)`
> > + */
> > +
> > +void use (v4i const *z);
> > +
> > +void
> > +g (v4i *x, v4i const *y, v4i *z, v4i *t)
> > +{
> > +  *z = *x > *y | *x == *y; // expect >=
> > +  *t = *x > *y | *x <= *y; // expect true
> > +}
> > +
> > +void
> > +h (v4i *x, v4i const *y, v4i *z, v4i *t)
> > +{
> > +  *z = *x <= *y & *x >= *y; // expect x == y
> > +  *t = *x <= *y & *x != *y; // expect x<y
> > +}
> > +
> > +void
> > +i (v4i *x, v4i const *y, v4i *z, v4i *t)
> > +{
> > +  *z = *x == *y | *x != *y; // expect true
> > +  *t = *x == *y & *x != *y; // expect false
> > +}
> > +
> > +void
> > +k (v4i *x, v4i const *y, v4i *z, v4i *t)
> > +{
> > +  *z = *x < *y | *x == *y;  // x <= y
> > +  *t = *x < *y & *x > *y;   // expect false
> > +}
> > +
> > +void
> > +m (v4i *x, v4i const *y, v4i *z, v4i *t)
> > +{
> > +  *z = *x <= *y ^ *x >= *y; /* expect x != y */
> > +  *t = *x <= *y ^ *x != *y; /* expect x >= y */
> > +}
> > +
> > +void
> > +n (v4i *x, v4i const *y, v4i *z, v4i *t)
> > +{
> > +  *z = *x == *y ^ *x != *y; /* expect true */
> > +  *t = *x == *y ^ *x == *y; /* expect false */
> > +}
> > +
> > +
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*zD\\.\\d+\\s*=\\s*VEC_COND_EXPR\\s*<\\s*\\*xD\\.\\d+\\s*>=\\s*VIEW_CONVERT_EXPR<v4iD\\.\\d+>\\(\\*yD\\.\\d+\\)\\s*,\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*,\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*>\\s*;"
> >  "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*tD\\.\\d+\\s*=\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*;" "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*zD\\.\\d+\\s*=\\s*VEC_COND_EXPR\\s*<\\s*\\*xD\\.\\d+\\s*==\\s*VIEW_CONVERT_EXPR<v4iD\\.\\d+>\\(\\*yD\\.\\d+\\)\\s*,\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*,\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*>\\s*;"
> >  "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*tD\\.\\d+\\s*=\\s*VEC_COND_EXPR\\s*<\\s*\\*xD\\.\\d+\\s*<\\s*VIEW_CONVERT_EXPR<v4iD\\.\\d+>\\(\\*yD\\.\\d+\\)\\s*,\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*,\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*>\\s*;"
> >  "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*zD\\.\\d+\\s*=\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*;" "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*tD\\.\\d+\\s*=\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*;" "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*zD\\.\\d+\\s*=\\s*VEC_COND_EXPR\\s*<\\s*\\*xD\\.\\d+\\s*<=\\s*VIEW_CONVERT_EXPR<v4iD\\.\\d+>\\(\\*yD\\.\\d+\\)\\s*,\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*,\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*>\\s*;"
> >  "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*tD\\.\\d+\\s*=\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*;" "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*zD\\.\\d+\\s*=\\s*VEC_COND_EXPR\\s*<\\s*\\*xD\\.\\d+\\s*!=\\s*VIEW_CONVERT_EXPR<v4iD\\.\\d+>\\(\\*yD\\.\\d+\\)\\s*,\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*,\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*>\\s*;"
> >  "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*tD\\.\\d+\\s*=\\s*VEC_COND_EXPR\\s*<\\s*\\*xD\\.\\d+\\s*>=\\s*VIEW_CONVERT_EXPR<v4iD\\.\\d+>\\(\\*yD\\.\\d+\\)\\s*,\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*,\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*>\\s*;"
> >  "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*zD\\.\\d+\\s*=\\s*\\{\\s*-1(,\\s*-1){3}\\s*\\}\\s*;" "original" } } */
> > +/* { dg-final { scan-tree-dump 
> > ".*\\*tD\\.\\d+\\s*=\\s*\\{\\s*0(,\\s*0){3}\\s*\\}\\s*;" "original" } } */
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
