On Wed, May 23, 2018 at 12:12 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Wed, May 23, 2018 at 1:10 PM Bin.Cheng <amker.ch...@gmail.com> wrote:
>
>> On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
>> <richard.sandif...@linaro.org> wrote:
>> > "Bin.Cheng" <amker.ch...@gmail.com> writes:
>> >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
>> >> <richard.guent...@gmail.com> wrote:
>> >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford
>> >>> <richard.sandif...@linaro.org> wrote:
>> >>>
>> >>>> Richard Biener <richard.guent...@gmail.com> writes:
>> >>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote:
>> >>>> >
>> >>>> >> Hi,
>> >>>> >> As reported in PR85804, bump step is wrongly computed for vector(1)
>> >>>> >> load of single-element group access.  This patch fixes the issue by
>> >>>> >> correcting bump step computation for the specific VMAT_CONTIGUOUS case.
>> >>>> >
>> >>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>> >>>> >
>> >>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>> >>>> > We'd fall into the grouped_load case otherwise which should handle
>> >>>> > the situation correctly?
>> >>>> >
>> >>>> > Richard?
>> >>>
>> >>>> Yeah, I agree.  I mentioned to Bin privately that that was probably
>> >>>> a misstep and that we should instead continue to treat them as
>> >>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>> >>>> from the array of loaded vectors, instead of doing an actual permute.
>> >>>
>> >>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
>> >>> should be only for no-gap vectors.  How do we classify single-element
>> >>> interleaving?  That would be another classification choice.
>> >> Yes, I suggested this offline too, but Richard may have more to say
>> >> about this.  One thing worth noting is classifying it as VMAT_ELEMENTWISE
>> >> would disable vectorization in this case because of the cost model issue
>> >> commented at the end of get_load_store_type.
>> >
>> > Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
>> > use a scalar load and then insert it into a vector, whereas all we want
>> > (and all we currently generate) is a single vector load.
>> >
>> > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
>> > for both costing (vector load rather than scalar load and vector
>> > construct) and code-generation.
>> Looks to me it will be a special case for VMAT_ELEMENTWISE or
>> VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one?
>
> The question is why we do
>
>    if (memory_access_type == VMAT_GATHER_SCATTER
>        || (!slp && memory_access_type == VMAT_CONTIGUOUS))
>      grouped_load = false;
>
> I think for the particular testcase removing this would fix the issue
> as well?

No, simply relaxing this check would result in generating redundant loads
(for the gap).  It also takes us through vect_transform_grouped_load and
vect_permute_load_chain, so we would still need special-case handling
there, just as when classifying the access as VMAT_CONTIGUOUS_PERMUTE.
Thanks,
bin

> Richard.
>
>> Thanks,
>> bin
>> >
>> > Thanks,
>> > Richard
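
For readers less familiar with the vectorizer terminology in the thread, the
loop below is a minimal sketch of a single-element group access with a gap,
the kind of access the thread is debating how to classify (the PR85804 case
additionally involves a single-lane vector(1) type).  This is an illustrative
example with made-up names, not the PR85804 testcase.

/* Illustrative sketch only: a single-element group access with a gap.
   Each iteration reads a[i].x and skips a[i].y, so of the two-slot
   data-reference group {x, y} only one element is used.  Loading whole
   vectors of ints here also pulls in the unused .y slots, which is why
   the gap matters for the bump-step and access-classification question
   discussed above.  */

struct pair { int x; int y; };

void
copy_x (const struct pair *a, int *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = a[i].x;   /* stride-2 load in units of int */
}

Compiling something like this with -O3 -fdump-tree-vect-details shows how the
load ends up classified; the exact choice depends on the target's vector
widths and cost model.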