On Tue, 27 Apr 2021, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs at -O0, because lower_vec_perm sees the
>   _1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
>   _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
>   _3 = { 6, 0, 0, 0, 0, 0, 0, 0 };
>   _4 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _2, _3>;
> and as the ISA is SSE2, there is no support for the particular permutation
> nor for variable mask permutation.  But the code that matches vec_shl accepts
> it, because the first operand of the permutation is a zero vector and the
> mask picks all of its elements from that vector.
> So in the end it isn't a vec_shl, although the permutation could in theory
> be folded into its first argument.  As we keep it as is, it fails
> during expansion, because the vec_shl expansion code correctly requires that
> it actually is a shift:
>       unsigned firstidx = 0;
>       for (unsigned int i = 0; i < nelt; i++)
>         {
>           if (known_eq (sel[i], nelt))
>             {
>               if (i == 0 || firstidx)
>                 return NULL_RTX;
>               firstidx = i;
>             }
>           else if (firstidx
>                    ? maybe_ne (sel[i], nelt + i - firstidx)
>                    : maybe_ge (sel[i], nelt))
>             return NULL_RTX;
>         }
> 
>       if (firstidx == 0)
>         return NULL_RTX;
>       first = firstidx;
> The if (firstidx == 0) return NULL_RTX; check is what lacks a counterpart
> on the lower_vec_perm side.
> As with optimize != 0 we fold it in other spots, I think it is not needed
> to optimize this corner case in lower_vec_perm (which would mean we'd need
> to recurse on the newly created _4 = { 0, 0, 0, 0, 0, 0, 0, 0 };
> to check whether it is supported or not).
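
The check discussed above can be illustrated with a minimal standalone sketch. This is not GCC code: vec_shl_first_index is a hypothetical helper, and it assumes fixed-length vectors, so the known_eq/maybe_ne/maybe_ge poly_int comparisons are replaced with plain integer comparisons.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only, not a GCC function.
   SEL holds NELT constant permutation indices, where values below NELT
   select from the first operand (assumed to be a zero vector) and values
   from NELT upwards select from the second operand.
   Returns the position of the first element taken from the second operand
   if SEL describes a real shift, i.e. some zeros followed by elements
   0, 1, 2, ... of the second operand.  Returns 0 if SEL is not a shift,
   which includes the case where every index selects from the first
   operand.  */
static unsigned
vec_shl_first_index (const unsigned *sel, unsigned nelt)
{
  unsigned firstidx = 0;
  for (unsigned i = 0; i < nelt; i++)
    {
      if (sel[i] == nelt)
        {
          /* Element 0 of the second operand may appear only once,
             and not in position 0.  */
          if (i == 0 || firstidx)
            return 0;
          firstidx = i;
        }
      else if (firstidx
               ? sel[i] != nelt + i - firstidx
               : sel[i] >= nelt)
        return 0;
    }
  /* firstidx == 0 here means no element came from the second operand,
     so this is not a shift.  */
  return firstidx;
}
```

With the mask from the testcase, { 6, 0, 0, 0, 0, 0, 0, 0 } and nelt 8, the loop runs to completion without ever setting firstidx, so the helper returns 0 and the mask is rejected. A real shift mask such as { 0, 0, 0, 8, 9, 10, 11, 12 } returns 3.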
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-04-27  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/100239
>       * tree-vect-generic.c (lower_vec_perm): Don't accept constant
>       permutations with all indices from the first zero element as vec_shl.
> 
>       * gcc.dg/pr100239.c: New test.
> 
> --- gcc/tree-vect-generic.c.jj        2021-01-27 19:30:20.763625450 +0100
> +++ gcc/tree-vect-generic.c   2021-04-26 14:41:37.909432994 +0200
> @@ -1515,7 +1515,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
>                                             elements + i - first)
>                    : maybe_ge (poly_uint64 (indices[i]), elements))
>             break;
> -       if (i == elements)
> +       if (first && i == elements)
>           {
>             gimple_assign_set_rhs3 (stmt, mask);
>             update_stmt (stmt);
> --- gcc/testsuite/gcc.dg/pr100239.c.jj        2021-04-26 14:45:28.517819255 +0200
> +++ gcc/testsuite/gcc.dg/pr100239.c   2021-04-26 14:45:09.985029312 +0200
> @@ -0,0 +1,12 @@
> +/* PR tree-optimization/100239 */
> +/* { dg-do compile } */
> +/* { dg-options "-O0" } */
> +
> +typedef short __attribute__((__vector_size__ (8 * sizeof (short)))) V;
> +V v, w;
> +
> +void
> +foo (void)
> +{
> +  w = __builtin_shuffle (v != v, 0 < (V) {}, (V) {192} >> 5);
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
