https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123175

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #10)
> > > Could do so if you want?
> > 
> > The all_from_input_p work if nelts is correct, so this fix seems wrong.  For
> > the particular pattern I think just initializing nelts from op0 is correct.
> > 
> 
> Hmm, yeah I needed all_in_range_p for something else (updates to
> simplify_vector_constructor) and ended up using it here too.  So yeah agreed
> it's overcomplicated for this pattern.
> 
> > But as said, I wonder if it was really intended to relax VEC_PERM_EXPR this
> > much.  I wonder if we even ever get those on non-VLA targets?
> 
> We do, all my optimizations are for Adv. SIMD.

On current trunk?

> > Going forward I'd like to see a vec_perm_indices CTOR from gassign *
> > and tree (for match.pd if the tree one handles SSA name by looking at
> > the definition would be convenient) to avoid such issues.
> > 
> > Do you have a non-GIMPLE testcase that shows the issue you are fixing above?
> 
> Well one of the things my patch optimizes is that expansions of 64-bit
> permutes are zero extended to 128-bit types today because of the old
> restrictions of VEC_PERM_EXPR.
> 
> So GCC generates unneeded zero extensions in all these cases
> https://godbolt.org/z/W8MnYP9cr
> 
> In GIMPLE we get
> 
>   <bb 2> [local count: 1073741824]:
>   _3 = {a_2(D), { 0, 0, 0, 0, 0, 0, 0, 0 }};
>   _5 = {b_4(D), { 0, 0, 0, 0, 0, 0, 0, 0 }};
>   _6 = VEC_PERM_EXPR <_3, _5, { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 }>;
>   return _6;
> 
> which is really unneeded.
> 
> One of the patches in the patch series teaches __builtin_shufflevector that
> if the target supports 64 -> 128 permutes to not zero extend it.  Though
> Richard made the point before that perhaps __builtin_shufflevector should
> never zero extend and veclower should legitimize it by zero extending then. 
> In essence we'd have the simplest form in GIMPLE then.

Yes, that's how I noticed this issue (teaching __builtin_shufflevector not to
zero-extend).

Vector lowering also doesn't handle those correctly, so it seems like a
can of worms opened ...

Both sel.series_p (0, 1, 0, 1) and sel.series_p (0, 1, nelts, 1)

seem to match when nelts != nelts_in, that is, even for fixed-length vectors
it seems a series is treated as repeating?!
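
One plausible reading of this, sketched below as a simplified model and not
as GCC's actual implementation: if both the selector elements and the
expected series are reduced modulo the total number of input elements before
they are compared, then an input base of nelts wraps around to 0 whenever
nelts equals that total.  The function model_series_p and all of its
parameters are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>

/* Simplified model (an assumption, not GCC code): return true if output
   element OUT_BASE + I * OUT_STEP selects input element
   IN_BASE + I * IN_STEP for every such output element, with both sides
   reduced modulo INPUT_NELTS, the total element count of all inputs.  */
static bool
model_series_p (const unsigned *sel, unsigned nelts_out,
                unsigned input_nelts,
                unsigned out_base, unsigned out_step,
                unsigned in_base, unsigned in_step)
{
  /* A zero step or an empty input would make the loop below ill-defined.  */
  if (out_step == 0 || input_nelts == 0)
    return false;

  for (unsigned i = 0; out_base + i * out_step < nelts_out; i++)
    {
      unsigned got = sel[out_base + i * out_step] % input_nelts;
      unsigned want = (in_base + i * in_step) % input_nelts;
      if (got != want)
        return false;
    }
  return true;
}
```

In this model, an identity selector with 16 output elements and two 8-element
inputs (input_nelts == 16) matches both (0, 1, 0, 1) and (0, 1, 16, 1).  With
two 16-element inputs (input_nelts == 32) only the first matches.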

That said, I'm going to propose a patch fixing vector lowering and the single
match.pd pattern (but possibly fully) with some __GIMPLE test coverage.
