On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener <rguent...@suse.de> writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener <rguent...@suse.de> writes:
> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >
> >> >> Richard Biener <rguent...@suse.de> writes:
> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >> >
> >> >> >> Hi,
> >> >> >> 
> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes 
> >> >> >> for
> >> >> >> epilogues'. The earlier patch assumed there was always at least one 
> >> >> >> other mode
> >> >> >> than VOIDmode, but that does not need to be the case.
> >> >> >> If we are dealing with a target that does not define more modes for
> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would 
> >> >> >> be to try
> >> >> >> to create an epilogue for the same autodetected_vector_mode, which 
> >> >> >> unless the
> >> >> >> target supported partial vectors would always fail. So as a fix I 
> >> >> >> suggest
> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is 
> >> >> >> not a
> >> >> >> vector_mode (since that already should indicate partial vectors 
> >> >> >> aren't
> >> >> >> possible) or if no partial vectors are supported and its pessimistic 
> >> >> >> NUNITS is
> >> >> >> larger than the main loop's VF.
> >> >> >> 
> >> >> >> Currently bootstrapping and regression testing, otherwise OK for 
> >> >> >> trunk? Can
> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >> >
> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
> >> >> > to use for the epilogue?  That appears to be a much simpler solution 
> >> >> > to
> >> >> > me, including for targets where there are more than one element in the
> >> >> > vector.
> >> >> 
> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> >> will be, so current short-circuit:
> >> >> 
> >> >>       /* If the target does not support partial vectors we can shorten 
> >> >> the
> >> >>          number of modes to analyze for the epilogue as we know we 
> >> >> can't pick a
> >> >>          mode that has at least as many NUNITS as the main loop's 
> >> >> vectorization
> >> >>          factor, since that would imply the epilogue's vectorization 
> >> >> factor
> >> >>          would be at least as high as the main loop's and we would be
> >> >>          vectorizing for more scalar iterations than there would be 
> >> >> left.  */
> >> >>       if (!supports_partial_vectors
> >> >>           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> >> >> first_vinfo_vf))
> >> >>         {
> >> >>           mode_i++;
> >> >>           if (mode_i == vector_modes.length ())
> >> >>             break;
> >> >>           continue;
> >> >>         }
> >> >> 
> >> >> wouldn't be effective.
> >> >
> >> > Well, before this change we simply did
> >> >
> >> > -  /* Handle the case that the original loop can use partial
> >> > -     vectorization, but want to only adopt it for the epilogue.
> >> > -     The retry should be in the same mode as original.  */
> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >> > ...
> >> > -  else
> >> > -    {
> >> > -      mode_i = first_loop_next_i;
> >> > -      if (mode_i == vector_modes.length ())
> >> > -       return first_loop_vinfo;
> >> > -    }
> >> >
> >> > and thus didn't bother with epilogue vectorization.  I think we should
> >> > then just restore this behavior, not doing epilogue vectorization
> >> > if vector_modes.length () == 1?
> >> 
> >> Yeah, but that case didn't need epilogue vectorisation before.  This
> >> series is adding support for unrolling, and targets with a single vector
> >> size will benefit from epilogues in that case.
> >
> > But in that case (which we could detect), we could then just use
> > autodetected_vector_mode?  Like if we do before epilogue vect
> >
> >  vector_modes[0] = autodetected_vector_mode;
> >  mode_i = 0;
> >
> > thus replace VOIDmode with what we detected and then start at 0?
> > That is, the proposed patch looks very much like a hack to me.
> 
> You mean check whether the loop is unrolled?  If so, that's what feels
> like a hack to me :-)  The question is whether there are enough elements
> for epilogue vectorisation to make sense.  The VF is what tells us that.
> Unrolling is just one of the things that influences that VF and I don't
> think we should check for the individual influences.  It's just the end
> result that matters.
> 
> > I suppose the VECTOR_MODE_P check should be added to
> >
> >       if (!supports_partial_vectors
> >           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> > first_vinfo_vf))
> >         {
> >           mode_i++;
> >
> > instead.
> 
> You mean:
> 
>       if (!supports_partial_vectors
>         && VECTOR_MODE_P (vector_modes[mode_i])
>         && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>       {
>         mode_i++;
> 
> ?  If so, the skip won't be effective the first time round.

Why?  See above where I set vector_modes[0] to autodetected_vector_mode.

Richard.

Reply via email to