On Tue, 27 Apr 2021, Jakub Jelinek wrote:

> Hi!
>
> The following testcase ICEs at -O0, because lower_vec_perm sees the
>   _1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
>   _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
>   _3 = { 6, 0, 0, 0, 0, 0, 0, 0 };
>   _4 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _2, _3>;
> and as the ISA is SSE2, there is no support for the particular permutation
> nor for variable mask permutation.  But, the code to match vec_shl matches
> it, because the permutation has the first operand a zero vector and the
> mask picks all elements randomly from that vector.
> So, in the end that isn't a vec_shl, but the permutation could be in theory
> optimized into the first argument.  As we keep it as is, it will fail
> during expansion though, because that for vec_shl correctly requires that
> it actually is a shift:
>   unsigned firstidx = 0;
>   for (unsigned int i = 0; i < nelt; i++)
>     {
>       if (known_eq (sel[i], nelt))
>         {
>           if (i == 0 || firstidx)
>             return NULL_RTX;
>           firstidx = i;
>         }
>       else if (firstidx
>                ? maybe_ne (sel[i], nelt + i - firstidx)
>                : maybe_ge (sel[i], nelt))
>         return NULL_RTX;
>     }
>
>   if (firstidx == 0)
>     return NULL_RTX;
>   first = firstidx;
> The if (firstidx == 0) return NULL; is what is missing a counterpart
> on the lower_vec_perm side.
> As with optimize != 0 we fold it in other spots, I think it is not needed
> to optimize this cornercase in lower_vec_perm (which would mean we'd need
> to recurse on the newly created _4 = { 0, 0, 0, 0, 0, 0, 0, 0 };
> whether it is supported or not).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
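
For reference, the expander-side check quoted above can be tried out on its
own. The following is only a minimal sketch, assuming plain unsigned selector
elements in place of GCC's poly_int comparisons (known_eq, maybe_ne,
maybe_ge). The function name vec_shl_first_index is hypothetical and does not
exist in GCC. It returns the index of the first element taken from the second
operand, or 0 when the selector is not a real shift.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: a simplified model of the
   vec_shl selector check.  SEL holds NELT selector indices; indices
   0 .. NELT-1 pick from the first (zero) operand and NELT .. 2*NELT-1
   pick from the second operand.  Returns the index of the first element
   taken from the second operand, or 0 if SEL does not describe a shift.  */
static unsigned
vec_shl_first_index (const unsigned *sel, unsigned nelt)
{
  assert (sel != NULL && nelt > 0);

  unsigned firstidx = 0;
  for (unsigned i = 0; i < nelt; i++)
    {
      if (sel[i] == nelt)
        {
          /* Element 0 of the second operand may appear only once,
             and never at position 0.  */
          if (i == 0 || firstidx)
            return 0;
          firstidx = i;
        }
      else if (firstidx
               ? sel[i] != nelt + i - firstidx /* must continue the run */
               : sel[i] >= nelt)               /* must stay in operand 1 */
        return 0;
    }

  /* Every index came from the zero operand: not a shift.  This is the
     case the patch makes lower_vec_perm reject as well.  */
  return firstidx;
}
```

With nelt == 8, the selector { 6, 0, 0, 0, 0, 0, 0, 0 } from the testcase
never reaches the second operand, so the sketch returns 0, whereas
{ 0, 1, 2, 8, 9, 10, 11, 12 } is a shift whose first second-operand element
sits at index 3.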

OK.

Thanks,
Richard.

> 2021-04-27  Jakub Jelinek  <ja...@redhat.com>
>
>       PR tree-optimization/100239
>       * tree-vect-generic.c (lower_vec_perm): Don't accept constant
>       permutations with all indices from the first zero element as vec_shl.
>
>       * gcc.dg/pr100239.c: New test.
>
> --- gcc/tree-vect-generic.c.jj        2021-01-27 19:30:20.763625450 +0100
> +++ gcc/tree-vect-generic.c   2021-04-26 14:41:37.909432994 +0200
> @@ -1515,7 +1515,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
>                                 elements + i - first)
>                     : maybe_ge (poly_uint64 (indices[i]), elements))
>                   break;
> -       if (i == elements)
> +       if (first && i == elements)
>           {
>             gimple_assign_set_rhs3 (stmt, mask);
>             update_stmt (stmt);
> --- gcc/testsuite/gcc.dg/pr100239.c.jj        2021-04-26 14:45:28.517819255 +0200
> +++ gcc/testsuite/gcc.dg/pr100239.c   2021-04-26 14:45:09.985029312 +0200
> @@ -0,0 +1,12 @@
> +/* PR tree-optimization/100239 */
> +/* { dg-do compile } */
> +/* { dg-options "-O0" } */
> +
> +typedef short __attribute__((__vector_size__ (8 * sizeof (short)))) V;
> +V v, w;
> +
> +void
> +foo (void)
> +{
> +  w = __builtin_shuffle (v != v, 0 < (V) {}, (V) {192} >> 5);
> +}
>
>       Jakub
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)