On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > "Bin.Cheng" <amker.ch...@gmail.com> writes: >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford < >>> richard.sandif...@linaro.org> wrote: >>> >>>> Richard Biener <richard.guent...@gmail.com> writes: >>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote: >>>> > >>>> >> Hi, >>>> >> As reported in PR85804, bump step is wrongly computed for vector(1) >>> load >>>> > of >>>> >> single-element group access. This patch fixes the issue by correcting >>>> > bump >>>> >> step computation for the specific VMAT_CONTIGUOUS case. >>>> > >>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK? >>>> > >>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus. >>>> > We'd fall into the grouped_load case otherwise which should handle >>>> > the situation correctly? >>>> > >>>> > Richard? >>> >>>> Yeah, I agree. I mentioned to Bin privately that that was probably >>>> a misstep and that we should instead continue to treat them as >>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector >>>> from the array of loaded vectors, instead of doing an actual permute. >>> >>> I'd classify them as VMAT_ELEMENTWISE instead. CONTIGUOUS >>> should be only for no-gap vectors. How do we classify single-element >>> interleaving? That would be another classification choice. >> Yes, I suggested this offline too, but Richard may have more to say about >> this. >> One thing worth noting is classifying it as VMAT_ELEMENTWISE would >> disable vectorization in this case because of cost model issue as >> commented at the end of get_load_store_type. > > Yeah, that's the problem. Using VMAT_ELEMENTWISE also means that we > use a scalar load and then insert it into a vector, whereas all we want > (and all we currently generate) is a single vector load. > > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case > for both costing (vector load rather than scalar load and vector > construct) and code-generation. Looks to me it will be a special case for VMAT_ELEMENTWISE or VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one?
Thanks, bin > > Thanks, > Richard