Re: Effect of alignment and peeling on vectorised loops

Michael Hope Wed, 30 Nov 2011 16:40:53 -0800

On Thu, Dec 1, 2011 at 12:20 AM, Ira Rosen <ira.ro...@linaro.org> wrote:
> On 30 November 2011 02:33, Michael Hope <michael.h...@linaro.org> wrote:
>
>> I then converted the vld1 and vst1 to specify an alignment of 64
>> bits. See:
>>  http://people.linaro.org/~michaelh/incoming/set-alignment.png
>>
>> This improved the throughput in all cases, and by 14% for runs of
>> more than 50 words.  This graph also shows the overhead of the runtime
>> peeling check.  The blue line is the vectoriser version, which is
>> slower to pick up due to the greater per-call overhead.
>
> So, the auto-vectorized code doesn't have the alignment hints (peeling
> or not peeling), right? Is this what a hint is supposed to look like:
> vld1.i64 {d16-d17}, [r1 :"#_128"] , or am I looking for the wrong thing?


I had a look in the backend, and the vld1/vst1 %A operand adds the
alignment if it is known.  It correctly adds [r1:64] if I feed in an
array of int64s.  The check is based on the MEM_ALIGN and MEM_SIZE of
the operand:
        align = MEM_ALIGN (x) >> 3;
        memsize = INTVAL (MEM_SIZE (x));
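
As a rough sketch of what a check like that amounts to: MEM_ALIGN is in
bits, so the shift converts it to bytes, and a hint is only worth emitting
when the access size and the known alignment both allow it.  The helper
names and the exact thresholds below are assumptions for illustration, not
copied from the backend:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: choose the alignment qualifier (in bits) for a
   vld1/vst1 address, or 0 when no hint should be emitted.
   mem_align_bits plays the role of MEM_ALIGN (alignment in bits);
   memsize is the access size in bytes.  Thresholds are assumed.  */
static int
neon_align_hint_bits (unsigned mem_align_bits, unsigned memsize)
{
  unsigned align = mem_align_bits >> 3;   /* bits -> bytes */

  if (memsize == 32 && align % 32 == 0)
    return 256;
  if ((memsize == 16 || memsize == 32) && align % 16 == 0)
    return 128;
  if (memsize >= 8 && align % 8 == 0)
    return 64;
  return 0;                               /* alignment unknown or too small */
}

/* Hypothetical helper: format the address operand, with the hint
   appended only when one was selected.  */
static int
format_neon_address (char *buf, size_t len, const char *reg,
                     unsigned mem_align_bits, unsigned memsize)
{
  int bits = neon_align_hint_bits (mem_align_bits, memsize);

  assert (buf != NULL && reg != NULL);
  if (bits != 0)
    return snprintf (buf, len, "[%s:%d]", reg, bits);
  return snprintf (buf, len, "[%s]", reg);
}
```

Under those assumptions an 8-byte access known to be 64-bit aligned gets
[r1:64], while the same access with only 32-bit alignment gets a plain
[r1].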

Not sure why the backend generates a vldmia instead of a vld1 though.

-- Michael

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
