On Thu, Mar 2, 2023 at 11:19 AM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> vectorizable_condition checks whether a COND_EXPR condition is used
> elsewhere with a loop mask.  If so, it applies the loop mask to the
> COND_EXPR too, to reduce the number of live masks and to increase the
> chance of combining the AND with the comparison.
>
> There is also code to do this for inverted conditions.  E.g. if
> we have a < b ? c : d and something else is conditional on !(a < b)
> (such as a load in d), we use !(a < b) ? d : c and apply the loop
> mask to !(a < b).
>
> This inversion relied on the function's bitop1/bitop2 mechanism.
> However, that mechanism is skipped if the condition is split out of
> the COND_EXPR as a separate statement.  This meant that we could end
> up using the inverse of the intended condition.
>
> There is a separate way of negating the condition when a mask
> is being applied (which is also used for EXTRACT_LAST reductions).
> This patch uses that instead.
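
For anyone following along, here is a rough scalar model of one vector lane for the inverted case described above. It is only a sketch: the function names and the bool "loop_mask" parameter are made up for illustration and are not GCC internals. It assumes the lane is active when comparing against the original select.

```c
#include <stdbool.h>

/* Illustrative scalar model of a single lane; none of these names
   exist in GCC.  */

/* Original form: a < b ? c : d.  */
long select_orig (long a, long b, long c, long d)
{
  return a < b ? c : d;
}

/* Intended swapped form: invert the comparison result, AND it with the
   loop mask, and swap the arms, i.e. (mask & !(a < b)) ? d : c.  */
long select_swapped (bool loop_mask, long a, long b, long c, long d)
{
  bool cmp = a < b;
  bool inverted = !cmp;
  bool masked = loop_mask && inverted;
  return masked ? d : c;
}

/* Model of the failure mode: the arms are swapped but the inversion
   of the comparison result is lost.  */
long select_swapped_no_invert (bool loop_mask, long a, long b,
                               long c, long d)
{
  bool masked = loop_mask && (a < b);
  return masked ? d : c;
}
```

In an active lane, select_swapped agrees with select_orig for both outcomes of the comparison, while select_swapped_no_invert picks the opposite arm.
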
>
> As well as the testcase, this fixes aarch64/sve/vcond_{4,17}_run.c.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

> Richard
>
>
> gcc/
>         PR tree-optimization/108430
>         * tree-vect-stmts.cc (vectorizable_condition): Fix handling
>         of inverted condition.
>
> gcc/testsuite/
>         PR tree-optimization/108430
>         * gcc.target/aarch64/sve/pr108430.c: New test.
> ---
>  .../gcc.target/aarch64/sve/pr108430.c         | 21 +++++++++++++++++++
>  gcc/tree-vect-stmts.cc                        |  3 +--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr108430.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr108430.c b/gcc/testsuite/gcc.target/aarch64/sve/pr108430.c
> new file mode 100644
> index 00000000000..e7ce0f6d793
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr108430.c
> @@ -0,0 +1,21 @@
> +/* { dg-do run { target aarch64_sve512_hw } } */
> +/* { dg-options "-O3 -msve-vector-bits=512" } */
> +
> +long d = 1;
> +static int i = 37;
> +static unsigned long a[22];
> +static unsigned short c[22];
> +static unsigned g[80];
> +static unsigned short *h = c;
> +static unsigned long *j = a;
> +int main() {
> +  for (long m = 0; m < 8; ++m)
> +    d = 1;
> +  for (unsigned char p = 0; p < 17; p += 2)
> +  {
> +    long t = h[p] ? i : j[p];
> +    g[p] = t;
> +  }
> +  if (g[0])
> +    __builtin_abort ();
> +}
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 9e5ffbe252e..77ad8b78506 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -10756,11 +10756,10 @@ vectorizable_condition (vec_info *vinfo,
>                   cond.code = orig_code;
>                   if (loop_vinfo->scalar_cond_masked_set.contains (cond))
>                     {
> -                     bitop1 = orig_code;
> -                     bitop2 = BIT_NOT_EXPR;
>                       masks = &LOOP_VINFO_MASKS (loop_vinfo);
>                       cond_code = cond.code;
>                       swap_cond_operands = true;
> +                     must_invert_cmp_result = true;
>                     }
>                 }
>             }
> --
> 2.25.1
>
