On Wed, 30 Aug 2017, Jon Beniston wrote: > Hi Richard, > > >> Ah, that's what I first tried, and mistakenly sent the patch against > that. > >> > >> That did help the initial problem, but then I needed to add a lot of > >> support for the V1SI type in the backend (which just duplicated SI > >> patterns) and there are a few places where GCC seems to have sanity > >> checks against vectors that are only one element wide. A dot product > >> producing a scalar result seems natural. > > > >Where do you need V1SI types? The vectorizer should perform a SI extract > >from V1SI here. You need to elaborate a bit here. > > After adding single element vector type support in the middle-end, at the > tree level, V1SI vector types would be the result type for the dot product > if the input operands were V2HI. > > During RTL expansion, if V1SImode is accepted in the > TARGET_VECTOR_MODE_SUPPORTED_P hook, then V1SImode will be used after RTL > expansion and V1SImode patterns are needed, although they are actually > duplicates of SImode patterns. I have now removed V1SImode from > TARGET_VECTOR_MODE_SUPPORTED_P, and they are turned into SI mode so there is > no need for the V1SI patterns now. > > >> The vectorizer doesn't really support vector->scalar ops so V1SI > >> feels more natural here. That is, your patch looks a bit ugly -- > >> there's nothing wrong with V1SI vector types. > > I agree "there's nothing wrong with V1SI vector types" and the original > patch was trying to let vector reduction operation accept V1SI vector types. > > However, if the only change was "<= 1" into "< 1" in > get_vectype_for_scalar_type_and_size, then it will cause a GCC assertion > failure in optab_for_tree_node for some shift operations. It seems all such > failures come from: > > vect_pattern_recog_1:4185 > optab = optab_for_tree_code (code, type_in, optab_default); > > The SUB_TYPE passed to optab_for_tree_code is "optab_default", however it is > possible that vect_pattern_recog_1 will generate gimple containing shift > operations, for example vect_recog_divmod_pattern will generate RSHIFT_EXPR, > while shift operation requires sub_type to be not optab_default?
I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false for V1SI you'll get a SImode vector type and the if (VECTOR_BOOLEAN_TYPE_P (type_in) || VECTOR_MODE_P (TYPE_MODE (type_in))) check fails. Try changing that to || VECTOR_TYPE_P (type_in). The else path should be hopefully dead ... With your TARGET_VECTOR_MODE_SUPPORTED_P setting you are essentially making the vectorizer mix what was desired as "generic" vectorization and regular vectorization... if it works, fine ;) > gcc/optabs-tree.h: > > /* An extra flag to control optab_for_tree_code's behavior. This is > needed to > distinguish between machines with a vector shift that takes a scalar > for the > shift amount vs. machines that take a vector for the shift amount. */ > > enum optab_subtype > { > optab_default, > optab_scalar, > optab_vector > }; > > It looks to me like the other call sites of optab_for_tree_code which are > passing optab_default are mostly places where GCC is sure it is not a shift > operation. > Given TYPE_IN is returned from get_vectype_for_scalar_type, is it safe to > simply pass optab_vector in vect_pattern_recog_1? I guess so. Can you also try the above? > I have verified the following middle-end fix also works for my port, it > passed also x86-64 bootstrap and there is no regression in gcc/g++ > regression. > > How is this new patch? > > gcc/ > 2017-08-30 Jon Beniston <j...@beniston.com> > > * tree-vect-patterns.c (vect_pattern_recog_1): Pass optab_vector > when > calling optab_for_tree_code. > * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Allow > single > element vector types. > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c > index cfdb72c..4b39cb6 100644 > --- a/gcc/tree-vect-patterns.c > +++ b/gcc/tree-vect-patterns.c > @@ -4182,7 +4182,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func, > code = CALL_EXPR; > } > > - optab = optab_for_tree_code (code, type_in, optab_default); > + optab = optab_for_tree_code (code, type_in, optab_vector); > vec_mode = TYPE_MODE (type_in); > if (!optab > || (icode = optab_handler (optab, vec_mode)) == CODE_FOR_nothing > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index 013fb1f..fc62efb 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree > scalar_type, unsigned size) > else > simd_mode = mode_for_vector (inner_mode, size / nbytes); > nunits = GET_MODE_SIZE (simd_mode) / nbytes; > - if (nunits <= 1) > + /* NOTE: nunits == 1 is allowed to support single element vector types. > */ > + if (nunits < 1) > return NULL_TREE; > > vectype = build_vector_type (scalar_type, nunits);