On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
> "Bin.Cheng" <amker.ch...@gmail.com> writes:
>> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
>>> richard.sandif...@linaro.org> wrote:
>>>
>>>> Richard Biener <richard.guent...@gmail.com> writes:
>>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote:
>>>> >
>>>> >> Hi,
>>>> >> As reported in PR85804, bump step is wrongly computed for vector(1) load of
>>>> >> single-element group access.  This patch fixes the issue by correcting bump
>>>> >> step computation for the specific VMAT_CONTIGUOUS case.
>>>> >
>>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>>>> >
>>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>>>> > We'd fall into the grouped_load case otherwise which should handle
>>>> > the situation correctly?
>>>> >
>>>> > Richard?
>>>
>>>> Yeah, I agree.  I mentioned to Bin privately that that was probably
>>>> a misstep and that we should instead continue to treat them as
>>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>>>> from the array of loaded vectors, instead of doing an actual permute.
>>>
>>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
>>> should be only for no-gap vectors.  How do we classify single-element
>>> interleaving?  That would be another classification choice.
>> Yes, I suggested this offline too, but Richard may have more to say about
>> this.
>> One thing worth noting is that classifying it as VMAT_ELEMENTWISE would
>> disable vectorization in this case because of the cost model issue
>> described in the comment at the end of get_load_store_type.
>
> Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
> use a scalar load and then insert it into a vector, whereas all we want
> (and all we currently generate) is a single vector load.
>
> So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
> for both costing (vector load rather than scalar load and vector
> construct) and code-generation.
It looks to me like it will be a special case under either VMAT_ELEMENTWISE
or VMAT_CONTIGUOUS* anyway, so VMAT_CONTIGUOUS is probably the easiest one?
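
To make the bump-step issue concrete, here is a minimal sketch in plain C.
It is neither the PR85804 testcase nor GCC code: GROUP_SIZE, NUNITS,
sum_groups, contiguous_bump and group_bump are hypothetical names, and the
loop only models one vector(1) load per iteration followed by a pointer
bump, assuming a group of two elements of which only the first is read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: groups of GROUP_SIZE longs, of which only the
   first is read (a single-element group with a gap).  NUNITS models a
   vector(1) type.  */
enum { GROUP_SIZE = 2, NUNITS = 1 };

/* Model of the vectorized loop: each iteration performs one vector(1)
   load and then advances the pointer by BUMP elements.  */
long
sum_groups (const long *p, size_t ngroups, size_t bump)
{
  long sum = 0;
  assert (bump > 0);
  for (size_t i = 0; i < ngroups; i++)
    {
      sum += *p;   /* The single vector(1) load.  */
      p += bump;   /* The bump step under discussion.  */
    }
  return sum;
}

/* Bump of a gap-free contiguous access: one vector's worth of elements.  */
size_t
contiguous_bump (void)
{
  return NUNITS;
}

/* Bump that also skips the gap: one whole group per iteration.  */
size_t
group_bump (void)
{
  return GROUP_SIZE;
}
```

With data laid out as {1, 100, 2, 200, 3, 300, 4, 400} and four groups,
group_bump visits only the first element of each group, while
contiguous_bump steps into the gap elements.  Whichever VMAT_* class is
chosen, the bump has to cover the whole group.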

Thanks,
bin
>
> Thanks,
> Richard
