https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123175
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #10)
> > > Could do so if you want?
> >
> > The all_from_input_p work if nelts is correct, so this fix seems wrong. For
> > the particular pattern I think just initializing nelts from op0 is correct.
> >
>
> Hmm, yeah I needed all_in_range_p for something else (updates to
> simplify_vector_constructor) and ended up using it here too. So yeah agreed
> it's overcomplicated for this pattern.
>
> > But as said, I wonder if it was really intended to relax VEC_PERM_EXPR this
> > much. I wonder if we even ever get those on non-VLA targets?
>
> We do, all my optimizations are for Adv. SIMD.

On current trunk?

> > Going forward I'd like to see a vec_perm_indices CTOR from gassign *
> > and tree (for match.pd if the tree one handles SSA name by looking at
> > the definition would be convenient) to avoid such issues.
> >
> > Do you have a non-GIMPLE testcase that shows the issue you are fixing above?
>
> Well one of the things my patch optimizes is that expansions of 64-bit
> permutes are zero extended to 128-bit types today because of the old
> restrictions of VEC_PERM_EXPR.
>
> So GCC generates unneeded zero extensions in all these cases
> https://godbolt.org/z/W8MnYP9cr
>
> In GIMPLE we get
>
>   <bb 2> [local count: 1073741824]:
>   _3 = {a_2(D), { 0, 0, 0, 0, 0, 0, 0, 0 }};
>   _5 = {b_4(D), { 0, 0, 0, 0, 0, 0, 0, 0 }};
>   _6 = VEC_PERM_EXPR <_3, _5, { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6,
> 22, 7, 23 }>;
>   return _6;
>
> which is really unneeded.
>
> One of the patches in the patch series teaches __builtin_shufflevector that
> if the target supports 64 -> 128 permutes to not zero extend it. Though
> Richard made the point before that perhaps __builtin_shufflevector should
> never zero extend and veclower should legitimize it by zero extending then.
> In essence we'd have the simplest form in GIMPLE then.

Yes, that's how I noticed this issue (teaching __builtin_shufflevector not to
zero-extend). Vector lowering also doesn't handle those correctly, so it seems
like a can of worms opened ...

Both sel.series_p (0, 1, 0, 1) and sel.series_p (0, 1, nelts, 1) seem to match
when nelts != nelts_in, that is, even for fixed-length vectors it seems a
series is treated as repeating?!

That said, I'm going to propose a patch fixing vector lowering and the single
match.pd pattern (but possibly fully) with some __GIMPLE test coverage.
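
For reference, a source-level shuffle of the kind discussed in the quoted
comment might look like the sketch below. It is not taken from the godbolt
link; the type names v8qi/v16qi and the functions zip and self_check are
made up for illustration. It interleaves two 64-bit byte vectors into one
128-bit result. With 8-element inputs the second operand starts at index 8;
in the quoted GIMPLE the inputs have already been zero-extended to 16
elements, which is why the second operand starts at index 16 there. The
scalar fallback is only there so the sketch builds on compilers that lack
__builtin_shufflevector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type names: 64-bit and 128-bit vectors of bytes.  */
typedef uint8_t v8qi __attribute__ ((vector_size (8)));
typedef uint8_t v16qi __attribute__ ((vector_size (16)));

#if defined (__has_builtin)
# if __has_builtin (__builtin_shufflevector)
#  define HAVE_SHUFFLEVECTOR 1
# endif
#endif

/* Interleave two 64-bit inputs into one 128-bit result.  With 8-element
   inputs, indices 0-7 select from A and indices 8-15 select from B.  */
v16qi
zip (v8qi a, v8qi b)
{
#ifdef HAVE_SHUFFLEVECTOR
  return __builtin_shufflevector (a, b, 0, 8, 1, 9, 2, 10, 3, 11,
                                  4, 12, 5, 13, 6, 14, 7, 15);
#else
  /* Scalar fallback for compilers without the builtin.  */
  v16qi r = { 0 };
  for (int i = 0; i < 8; i++)
    {
      r[2 * i] = a[i];
      r[2 * i + 1] = b[i];
    }
  return r;
#endif
}

/* Check the shuffle against its element-wise definition.  */
void
self_check (void)
{
  v8qi a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  v8qi b = { 10, 11, 12, 13, 14, 15, 16, 17 };
  v16qi r = zip (a, b);
  for (int i = 0; i < 8; i++)
    {
      assert (r[2 * i] == a[i]);
      assert (r[2 * i + 1] == b[i]);
    }
}
```

Whether the resulting VEC_PERM_EXPR keeps the 64-bit operands or zero-extends
them first depends on the compiler version and target, so the GIMPLE dump
should be checked rather than assumed.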
