On Wed, 12 Jan 2022, Richard Sandiford wrote: > Richard Biener <rguent...@suse.de> writes: > > On Wed, 12 Jan 2022, Richard Sandiford wrote: > > > >> Richard Biener <rguent...@suse.de> writes: > >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> > > >> >> Richard Biener <rguent...@suse.de> writes: > >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > >> >> > > >> >> >> Hi, > >> >> >> > >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes > >> >> >> for > >> >> >> epilogues'. The earlier patch assumed there was always at least one > >> >> >> other mode > >> >> >> than VOIDmode, but that does not need to be the case. > >> >> >> If we are dealing with a target that does not define more modes for > >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would > >> >> >> be to try > >> >> >> to create an epilogue for the same autodetected_vector_mode, which > >> >> >> unless the > >> >> >> target supported partial vectors would always fail. So as a fix I > >> >> >> suggest > >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI, > >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is > >> >> >> not a > >> >> >> vector_mode (since that already should indicate partial vectors > >> >> >> aren't > >> >> >> possible) or if no partial vectors are supported and its pessimistic > >> >> >> NUNITS is > >> >> >> larger than the main loop's VF. > >> >> >> > >> >> >> Currently bootstrapping and regression testing, otherwise OK for > >> >> >> trunk? Can > >> >> >> someone verify this fixes the issue for PR103971 on powerpc? > >> >> > > >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode > >> >> > to use for the epilogue? That appears to be a much simpler solution > >> >> > to > >> >> > me, including for targets where there are more than one element in the > >> >> > vector. > >> >> > >> >> VOIDmode doesn't tell us anything about what the autodetected mode > >> >> will be, so current short-circuit: > >> >> > >> >> /* If the target does not support partial vectors we can shorten > >> >> the > >> >> number of modes to analyze for the epilogue as we know we > >> >> can't pick a > >> >> mode that has at least as many NUNITS as the main loop's > >> >> vectorization > >> >> factor, since that would imply the epilogue's vectorization > >> >> factor > >> >> would be at least as high as the main loop's and we would be > >> >> vectorizing for more scalar iterations than there would be > >> >> left. */ > >> >> if (!supports_partial_vectors > >> >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), > >> >> first_vinfo_vf)) > >> >> { > >> >> mode_i++; > >> >> if (mode_i == vector_modes.length ()) > >> >> break; > >> >> continue; > >> >> } > >> >> > >> >> wouldn't be effective. > >> > > >> > Well, before this change we simply did > >> > > >> > - /* Handle the case that the original loop can use partial > >> > - vectorization, but want to only adopt it for the epilogue. > >> > - The retry should be in the same mode as original. */ > >> > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > >> > ... > >> > - else > >> > - { > >> > - mode_i = first_loop_next_i; > >> > - if (mode_i == vector_modes.length ()) > >> > - return first_loop_vinfo; > >> > - } > >> > > >> > and thus didn't bother with epilogue vectorization. I think we should > >> > then just restore this behavior, not doing epilogue vectorization > >> > if vector_modes.length () == 1? > >> > >> Yeah, but that case didn't need epilogue vectorisation before. This > >> series is adding support for unrolling, and targets with a single vector > >> size will benefit from epilogues in that case. > > > > But in that case (which we could detect), we could then just use > > autodetected_vector_mode? Like if we do before epilogue vect > > > > vector_modes[0] = autodetected_vector_mode; > > mode_i = 0; > > > > thus replace VOIDmode with what we detected and then start at 0? > > That is, the proposed patch looks very much like a hack to me. > > You mean check whether the loop is unrolled? If so, that's what feels > like a hack to me :-) The question is whether there are enough elements > for epilogue vectorisation to make sense. The VF is what tells us that. > Unrolling is just one of the things that influences that VF and I don't > think we should check for the individual influences. It's just the end > result that matters. > > > I suppose the VECTOR_MODE_P check should be added to > > > > if (!supports_partial_vectors > > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), > > first_vinfo_vf)) > > { > > mode_i++; > > > > instead. > > You mean: > > if (!supports_partial_vectors > && VECTOR_MODE_P (vector_modes[mode_i]) > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) > { > mode_i++; > > ? If so, the skip won't be effective the first time round.
Why? See above where I set vector_modes[0] to autodetected_vector_mode. Richard.