On Wed, 31 Mar 2021, Stam Markianos-Wright wrote: > On 29/03/2021 10:20, Richard Biener wrote: > > On Fri, 26 Mar 2021, Richard Sandiford wrote: > > > >> Richard Biener <rguent...@suse.de> writes: > >>> On Wed, 24 Mar 2021, Stam Markianos-Wright wrote: > >>> > >>>> Hi all, > >>>> > >>>> This patch resolves bug: > >>>> > >>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 > >>>> > >>>> This is achieved by forcing a re-calculation of *stmt_vectype_out if an > >>>> incompatible combination of TYPE_VECTOR_SUBPARTS is detected, but with an > >>>> extra introduced max_nunits ceiling. > >>>> > >>>> I am not 100% sure if this is the best way to go about fixing this, > >>>> because > >>>> this is my first look at the vectorizer and I lack knowledge of the wider > >>>> context, so do let me know if you see a better way to do this! > >>>> > >>>> I have added the previously ICE-ing reproducer as a new test. > >>>> > >>>> This is compiled as "g++ -Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" > >>>> for > >>>> GCC11 and "g++ -Ofast -march=armv8.2-a+sve" for GCC10. > >>>> > >>>> (the non-fdisable-tree-fre4 version has gone latent on GCC11) > >>>> > >>>> Bootstrapped and reg-tested on aarch64-linux-gnu. > >>>> Also reg-tested on aarch64-none-elf. > >>> > >>> I don't think this is going to work well given uses will expect > >>> a vector type that's consistent here. > >>> > >>> I think giving up is for the moment the best choice, thus replacing > >>> the assert with vectorization failure. > >>> > >>> In the end we shouldn't require those nunits vectypes to be > >>> separately computed - we compute the vector type of the defs > >>> anyway and in case they're invariant the vectorizable_* function > >>> either can deal with the type mix or not anyway. > >> > >> I agree this area needs simplification, but I think the direction of > >> travel should be to make the assert valid. I agree this is probably > >> the pragmatic fix for GCC 11 and earlier though. > > > > The issue is that we compute a vector type for a use that may differ > > from what we'd compute for it in the context of its definition (or > > in the context of another use). Any such "local" decision is likely > > flawed and I'd rather simplify further doing the only decision on > > the definition side - if there's a disconnect between the number > > of lanes (and thus altering the VF won't help) then we have to give > > up anyway. > > > > Richard. > > > > Thank you both for the further info! Would it be fair to close the initial PR > regarding the ICE (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974) and > then open a second one at a lower priority level to address these further > improvements?
I have closed the original report. If there's a testcase where vectorization is possible and useful but is now rejected because of the fix then yes, please open a new missed-optimization bugreport. Richard.