Richard Biener <rguent...@suse.de> writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener <rguent...@suse.de> writes:
>> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >
>> >> Richard Biener <rguent...@suse.de> writes:
>> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >> >
>> >> >> Richard Biener <rguent...@suse.de> writes:
>> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >> >> >
>> >> >> >> Hi,
>> >> >> >> 
>> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all 
>> >> >> >> modes for
>> >> >> >> epilogues'. The earlier patch assumed there was always at least one 
>> >> >> >> other mode
>> >> >> >> than VOIDmode, but that does not need to be the case.
>> >> >> >> If we are dealing with a target that does not define more modes for
>> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would 
>> >> >> >> be to try
>> >> >> >> to create an epilogue for the same autodetected_vector_mode, which 
>> >> >> >> unless the
>> >> >> >> target supported partial vectors would always fail. So as a fix I 
>> >> >> >> suggest
>> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for 
>> >> >> >> QI,
>> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is 
>> >> >> >> not a
>> >> >> >> vector_mode (since that already should indicate partial vectors 
>> >> >> >> aren't
>> >> >> >> possible) or if no partial vectors are supported and its 
>> >> >> >> pessimistic NUNITS is
>> >> >> >> larger than the main loop's VF.
>> >> >> >> 
>> >> >> >> Currently bootstrapping and regression testing, otherwise OK for 
>> >> >> >> trunk? Can
>> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >> >> >
>> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> >> >> > to use for the epilogue?  That appears to be a much simpler solution 
>> >> >> > to
>> >> >> > me, including for targets where there are more than one element in 
>> >> >> > the
>> >> >> > vector.
>> >> >> 
>> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
>> >> >> will be, so current short-circuit:
>> >> >> 
>> >> >>       /* If the target does not support partial vectors we can shorten 
>> >> >> the
>> >> >>         number of modes to analyze for the epilogue as we know we 
>> >> >> can't pick a
>> >> >>         mode that has at least as many NUNITS as the main loop's 
>> >> >> vectorization
>> >> >>         factor, since that would imply the epilogue's vectorization 
>> >> >> factor
>> >> >>         would be at least as high as the main loop's and we would be
>> >> >>         vectorizing for more scalar iterations than there would be 
>> >> >> left.  */
>> >> >>       if (!supports_partial_vectors
>> >> >>          && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
>> >> >> first_vinfo_vf))
>> >> >>        {
>> >> >>          mode_i++;
>> >> >>          if (mode_i == vector_modes.length ())
>> >> >>            break;
>> >> >>          continue;
>> >> >>        }
>> >> >> 
>> >> >> wouldn't be effective.
>> >> >
>> >> > Well, before this change we simply did
>> >> >
>> >> > -  /* Handle the case that the original loop can use partial
>> >> > -     vectorization, but want to only adopt it for the epilogue.
>> >> > -     The retry should be in the same mode as original.  */
>> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>> >> > ...
>> >> > -  else
>> >> > -    {
>> >> > -      mode_i = first_loop_next_i;
>> >> > -      if (mode_i == vector_modes.length ())
>> >> > -       return first_loop_vinfo;
>> >> > -    }
>> >> >
>> >> > and thus didn't bother with epilogue vectorization.  I think we should
>> >> > then just restore this behavior, not doing epilogue vectorization
>> >> > if vector_modes.length () == 1?
>> >> 
>> >> Yeah, but that case didn't need epilogue vectorisation before.  This
>> >> series is adding support for unrolling, and targets with a single vector
>> >> size will benefit from epilogues in that case.
>> >
>> > But in that case (which we could detect), we could then just use
>> > autodetected_vector_mode?  Like if we do before epilogue vect
>> >
>> >  vector_modes[0] = autodetected_vector_mode;
>> >  mode_i = 0;
>> >
>> > thus replace VOIDmode with what we detected and then start at 0?
>> > That is, the proposed patch looks very much like a hack to me.
>> 
>> You mean check whether the loop is unrolled?  If so, that's what feels
>> like a hack to me :-)  The question is whether there are enough elements
>> for epilogue vectorisation to make sense.  The VF is what tells us that.
>> Unrolling is just one of the things that influences that VF and I don't
>> think we should check for the individual influences.  It's just the end
>> result that matters.
>> 
>> > I suppose the VECTOR_MODE_P check should be added to
>> >
>> >       if (!supports_partial_vectors
>> >           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
>> > first_vinfo_vf))
>> >         {
>> >           mode_i++;
>> >
>> > instead.
>> 
>> You mean:
>> 
>>       if (!supports_partial_vectors
>>        && VECTOR_MODE_P (vector_modes[mode_i])
>>        && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>>      {
>>        mode_i++;
>> 
>> ?  If so, the skip won't be effective the first time round.
>
> Why?  See above where I set vector_modes[0] to autodetected_vector_mode.

Ah, yeah, I guess that works, sorry.  It still feels odd to iterate
through N+1 modes when we don't need autodetection (and with the above,
don't use autodetection), but I can live with it. :-)

Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.

Richard

Reply via email to