Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > Hi, > > I have been playing around with making Kewen's partial vector changes > workable with s390: > > We have a vll instruction that can be passed the highest byte to load. > The rather unfortunate consequence of this is that a length of zero > cannot be specified. The partial vector framework, however, relies a > lot on the fact that a len_load can be made a NOP using a length of zero. > > After confirming an additional zero-check before each vll is definitely > too slow across SPEC and some discussion with Kewen we figured the > easiest way forward is to exclude loops with multiple VFs (despite > giving up vectorization possibilities). These are prone to len_loads > with zero while the regular induction variable check prevents them in > single-VF loops. > > So, as a quick hack, I went with > > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c > index 75f24e7c4f6..f79222daeb6 100644 > --- a/gcc/tree-vect-loop.c > +++ b/gcc/tree-vect-loop.c > @@ -1170,6 +1170,9 @@ vect_verify_loop_lens (loop_vec_info loop_vinfo) > if (LOOP_VINFO_LENS (loop_vinfo).is_empty ()) > return false; > > + if (LOOP_VINFO_LENS (loop_vinfo).length () > 1) > + return false; > +
Yeah, I think this should be sufficient. > which could be made a hook, eventually. FWIW this is sufficient to make > bootstrap, regtest and compiling the SPEC suites succeed. I'm unsure > whether we are guaranteed not to emit len_load with zero now. On top, > I subtract 1 from the passed length in the expander, which, supposedly, > is also not ideal. Exposing the subtraction in gimple would certainly allow for more optimisation. We already have code to probe the predicates of the underlying define_expands/insns to see whether they support certain constant IFN arguments; see e.g. internal_gather_scatter_fn_supported_p. We could do something similar here: add an extra operand to the optab, and an extra argument to the IFN, that gives a bias amount. The PowerPC version would require 0, the System Z version would require -1. The vectoriser would probe to see which value it should use. Doing it that way ensures that the gimple is still self-describing. It avoids gimple semantics depending on target hooks. > There are some regressions that I haven't fully analyzed yet but whether > and when to actually enable this feature could be a backend decision > with the necessary middle-end checks already in place. > > Any ideas on how to properly check for the zero condition and exclude > the cases that cause it? Kewen suggested enriching the len_load optabs > with a separate parameter. Yeah, I think that'd be a good approach. A bias of -1 would indicate that the target can't cope with zero lengths. Thanks, Richard