Richard Biener <rguent...@suse.de> writes: > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> Richard Biener <rguent...@suse.de> writes: >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: >> > >> >> Richard Biener <rguent...@suse.de> writes: >> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: >> >> > >> >> >> Richard Biener <rguent...@suse.de> writes: >> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: >> >> >> > >> >> >> >> Hi, >> >> >> >> >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all >> >> >> >> modes for >> >> >> >> epilogues'. The earlier patch assumed there was always at least one >> >> >> >> other mode >> >> >> >> than VOIDmode, but that does not need to be the case. >> >> >> >> If we are dealing with a target that does not define more modes for >> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would >> >> >> >> be to try >> >> >> >> to create an epilogue for the same autodetected_vector_mode, which >> >> >> >> unless the >> >> >> >> target supported partial vectors would always fail. So as a fix I >> >> >> >> suggest >> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for >> >> >> >> QI, >> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is >> >> >> >> not a >> >> >> >> vector_mode (since that already should indicate partial vectors >> >> >> >> aren't >> >> >> >> possible) or if no partial vectors are supported and its >> >> >> >> pessimistic NUNITS is >> >> >> >> larger than the main loop's VF. >> >> >> >> >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for >> >> >> >> trunk? Can >> >> >> >> someone verify this fixes the issue for PR103971 on powerpc? >> >> >> > >> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode >> >> >> > to use for the epilogue? That appears to be a much simpler solution >> >> >> > to >> >> >> > me, including for targets where there are more than one element in >> >> >> > the >> >> >> > vector. >> >> >> >> >> >> VOIDmode doesn't tell us anything about what the autodetected mode >> >> >> will be, so current short-circuit: >> >> >> >> >> >> /* If the target does not support partial vectors we can shorten >> >> >> the >> >> >> number of modes to analyze for the epilogue as we know we >> >> >> can't pick a >> >> >> mode that has at least as many NUNITS as the main loop's >> >> >> vectorization >> >> >> factor, since that would imply the epilogue's vectorization >> >> >> factor >> >> >> would be at least as high as the main loop's and we would be >> >> >> vectorizing for more scalar iterations than there would be >> >> >> left. */ >> >> >> if (!supports_partial_vectors >> >> >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), >> >> >> first_vinfo_vf)) >> >> >> { >> >> >> mode_i++; >> >> >> if (mode_i == vector_modes.length ()) >> >> >> break; >> >> >> continue; >> >> >> } >> >> >> >> >> >> wouldn't be effective. >> >> > >> >> > Well, before this change we simply did >> >> > >> >> > - /* Handle the case that the original loop can use partial >> >> > - vectorization, but want to only adopt it for the epilogue. >> >> > - The retry should be in the same mode as original. */ >> >> > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) >> >> > ... >> >> > - else >> >> > - { >> >> > - mode_i = first_loop_next_i; >> >> > - if (mode_i == vector_modes.length ()) >> >> > - return first_loop_vinfo; >> >> > - } >> >> > >> >> > and thus didn't bother with epilogue vectorization. I think we should >> >> > then just restore this behavior, not doing epilogue vectorization >> >> > if vector_modes.length () == 1? >> >> >> >> Yeah, but that case didn't need epilogue vectorisation before. This >> >> series is adding support for unrolling, and targets with a single vector >> >> size will benefit from epilogues in that case. >> > >> > But in that case (which we could detect), we could then just use >> > autodetected_vector_mode? Like if we do before epilogue vect >> > >> > vector_modes[0] = autodetected_vector_mode; >> > mode_i = 0; >> > >> > thus replace VOIDmode with what we detected and then start at 0? >> > That is, the proposed patch looks very much like a hack to me. >> >> You mean check whether the loop is unrolled? If so, that's what feels >> like a hack to me :-) The question is whether there are enough elements >> for epilogue vectorisation to make sense. The VF is what tells us that. >> Unrolling is just one of the things that influences that VF and I don't >> think we should check for the individual influences. It's just the end >> result that matters. >> >> > I suppose the VECTOR_MODE_P check should be added to >> > >> > if (!supports_partial_vectors >> > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), >> > first_vinfo_vf)) >> > { >> > mode_i++; >> > >> > instead. >> >> You mean: >> >> if (!supports_partial_vectors >> && VECTOR_MODE_P (vector_modes[mode_i]) >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) >> { >> mode_i++; >> >> ? If so, the skip won't be effective the first time round. > > Why? See above where I set vector_modes[0] to autodetected_vector_mode.
Ah, yeah, I guess that works, sorry. It still feels odd to iterate through N+1 modes when we don't need autodetection (and with the above, don't use autodetection), but I can live with it. :-) Another alternative would be to push autodetected_vector_mode when the length is 1 and keep 1 as the starting point. Richard