On Wed, May 23, 2018 at 12:12 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Wed, May 23, 2018 at 1:10 PM Bin.Cheng <amker.ch...@gmail.com> wrote:
>
>> On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
>> <richard.sandif...@linaro.org> wrote:
>> > "Bin.Cheng" <amker.ch...@gmail.com> writes:
>> >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
>> >> <richard.guent...@gmail.com> wrote:
>> >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
>> >>> richard.sandif...@linaro.org> wrote:
>> >>>
>> >>>> Richard Biener <richard.guent...@gmail.com> writes:
>> >>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote:
>> >>>> >
>> >>>> >> Hi,
>> >>>> >> As reported in PR85804, the bump step is wrongly computed for a
>> >>>> >> vector(1) load of a single-element group access.  This patch fixes
>> >>>> >> the issue by correcting the bump step computation for the specific
>> >>>> >> VMAT_CONTIGUOUS case.
>> >>>> >
>> >>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>> >>>> >
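For context, the access in question is roughly a load with a constant stride
larger than its element size, which the vectorizer treats as a single-element
interleaving group with a gap.  An illustrative sketch only, not the actual
PR85804 testcase:

    int
    sum_strided (int *a, int n)
    {
      int sum = 0;
      for (int i = 0; i < n; i++)
        sum += a[4 * i];   /* one element accessed per group of 4, gap of 3 */
      return sum;
    }

If this gets vectorized with a vector(1) type, the bump for each load needs
to cover the whole group of 4 elements, not just the single vector, which is
the computation the patch corrects.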
>> >>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>> >>>> > We'd fall into the grouped_load case otherwise, which should handle
>> >>>> > the situation correctly?
>> >>>> >
>> >>>> > Richard?
>> >>>
>> >>>> Yeah, I agree.  I mentioned to Bin privately that that was probably
>> >>>> a misstep and that we should instead continue to treat them as
>> >>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>> >>>> from the array of loaded vectors, instead of doing an actual permute.
>> >>>
>> >>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
>> >>> should be only for no-gap vectors.  How do we classify single-element
>> >>> interleaving?  That would be another classification choice.
>> >> Yes, I suggested this offline too, but Richard may have more to say
>> >> about this.
>> >> One thing worth noting is that classifying it as VMAT_ELEMENTWISE would
>> >> disable vectorization in this case because of a cost-model issue, as
>> >> noted in the comment at the end of get_load_store_type.
>> >
>> > Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
>> > use a scalar load and then insert it into a vector, whereas all we want
>> > (and all we currently generate) is a single vector load.
>> >
>> > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
>> > for both costing (vector load rather than scalar load and vector
>> > construct) and code-generation.
>> It looks to me like it will be a special case for VMAT_ELEMENTWISE or
>> VMAT_CONTIGUOUS* anyway; probably VMAT_CONTIGUOUS is the easiest one?
>
> The question is why we do
>
>    if (memory_access_type == VMAT_GATHER_SCATTER
>        || (!slp && memory_access_type == VMAT_CONTIGUOUS))
>      grouped_load = false;
>
> I think for the particular testcase removing this would fix the issue as
> well?
No, simply relaxing this check would result in generating redundant
loads (for the gap).  It also leads to vect_transform_grouped_load as
well as vect_permute_load_chain, so we would need the same special-case
handling as when classifying it as VMAT_CONTIGUOUS_PERMUTE.
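To make that concrete, what we generate today for a loop like the a[4 * i]
sketch above, and what we want to keep generating, is a single vector(1)
load per iteration whose address bump covers the whole group, with no extra
loads for the gap and no permutation.  A conceptual C equivalent, using a
GNU C vector type purely for illustration (this is not what the vectorizer
literally emits):

    typedef int v1si __attribute__ ((vector_size (sizeof (int))));

    int
    sum_strided_vec (int *a, int n)
    {
      v1si *p = (v1si *) a;
      int sum = 0;
      for (int i = 0; i < n; i++)
        {
          v1si v = *p;  /* one vector(1) load; nothing is loaded for the gap,
                           so no vect_permute_load_chain is needed */
          p += 4;       /* bump by the whole group, gap included; this is the
                           step computation the patch fixes for
                           VMAT_CONTIGUOUS */
          sum += v[0];
        }
      return sum;
    }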

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin
>> >
>> > Thanks,
>> > Richard
