Re: [PATCH 0/6] Improve -fprefetch-loop-arrays in general and for AArch64 in particular

Andrew Pinski Mon, 30 Jan 2017 08:39:52 -0800

On Mon, Jan 30, 2017 at 3:24 AM, Maxim Kuvyrkov
<maxim.kuvyr...@linaro.org> wrote:
> This patch series improves the -fprefetch-loop-arrays pass through small
> fixes and tweaks, and then enables it for several AArch64 cores.
>
> My tunings were done on and for Qualcomm hardware, with results ranging
> from +0.5% to +1.9% for SPEC2006 INT and from +0.25% to +1.0% for
> SPEC2006 FP at -O3, depending on hardware revision.
>
> This patch series enables restricted -fprefetch-loop-arrays at -O2, which
> also improves SPEC2006 numbers.
>
> Biggest progressions are on 419.mcf and 437.leslie3d, with no serious 
> regressions on other benchmarks.
>
> I'm now investigating making -fprefetch-loop-arrays more aggressive for 
> Qualcomm hardware, which improves performance on most benchmarks, but also 
> causes big regressions on 454.calculix and 462.libquantum.  If I can fix 
> these two regressions, prefetching will give another boost to AArch64.


I already have a patch which makes prefetching more aggressive, and it
improves libquantum on CN88xx; I have not submitted it yet, as I have
only just restarted upstreaming my patch sets.

Thanks,
Andrew

>
> Andrew just posted similar prefetching tunings for Cavium's cores, and the 
> two patches have trivial conflicts.  I'll post mine as-is, since it addresses
> one of the comments on Andrew's review (adding a stand-alone struct for
> tuning parameters).
>
> Andrew, feel free to just copy-paste it to your patch, since it is just a 
> mechanical change.
>
> All patches were bootstrapped and regtested on x86_64-linux-gnu and 
> aarch64-linux-gnu.
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
>
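
For readers unfamiliar with the pass discussed in this thread: -fprefetch-loop-arrays asks GCC to insert prefetch instructions ahead of array accesses in loops. The sketch below does the same thing by hand with GCC's __builtin_prefetch, only to illustrate the idea. The function name sum_with_prefetch and the PREFETCH_AHEAD distance are hypothetical and are not taken from either patch set; real prefetch distances depend on per-core tuning parameters.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements. Real values are
   per-core tunings and are not taken from the patches in this thread. */
#define PREFETCH_AHEAD 16

/* Sum an array while hinting the hardware to fetch elements that
   will be needed a few iterations from now. */
long sum_with_prefetch(const long *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Second argument 0 = prefetch for read; third argument 3 =
           high temporal locality. Stay within the array bounds. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);
        sum += a[i];
    }
    return sum;
}
```

The prefetch is only a hint and does not change the result; whether it helps or hurts depends on the access pattern and the hardware, which is consistent with the mixed benchmark results reported above.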
