On Thu, Sep 14, 2017 at 6:30 PM, Kugan Vivekanandarajah
<kugan.vivekanandara...@linaro.org> wrote:
> This patch prevents the tree unroller from completely unrolling inner loops if that
> results in excessive strided-loads in the outer loop.

Same comments as for the RTL version.  One more comment here:

+      if (!INDIRECT_REF_P (op)
+          && TREE_CODE (op) != MEM_REF
+          && TREE_CODE (op) != TARGET_MEM_REF)
+        continue;

This does not handle ARRAY_REF, which might be/should be handled.

+  if ((loop_father = loop_outer (loop)))

Since you don't use loop_father outside of the if statement, use the
following (allowed) format:

  if (struct loop *loop_father = loop_outer (loop))

Thinking about this more, hw_prefetchers_avail might not be equivalent
to num_slots (PARAM_SIMULTANEOUS_PREFETCHES), but the name does not fit
what it means, if I understand your hardware correctly.  Maybe
hw_load_non_cacheline_prefetcher_avail, since, if I understand the
micro-arch correctly, the prefetchers are not based on the cacheline
being loaded.

Thanks,
Andrew

>
> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2017-09-12  Kugan Vivekanandarajah  <kug...@linaro.org>
>
>         * config/aarch64/aarch64.c (count_mem_load_streams): New.
>         (aarch64_ok_to_unroll): New.
>         * doc/tm.texi (ok_to_unroll): Define new target hook.
>         * doc/tm.texi.in (ok_to_unroll): Likewise.
>         * target.def (ok_to_unroll): Likewise.
>         * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Use
>         ok_to_unroll while unrolling.
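
To illustrate the declaration-in-condition form mentioned in the review, here is a minimal standalone sketch.  It is not GCC code: loop_sketch, outer_of and outer_load_count are hypothetical stand-ins for struct loop, loop_outer and whatever the patch does with the outer loop.  It only shows that the variable is scoped to the if statement and that the branch is taken when the pointer is non-null.

```cpp
// Hypothetical stand-in for GCC's struct loop; only the parent link
// and an illustrative counter are modelled here.
struct loop_sketch
{
  loop_sketch *outer;  // enclosing loop, or nullptr for the outermost loop
  int num_loads;       // illustrative payload, not a real GCC field
};

// Hypothetical stand-in for loop_outer ().
static loop_sketch *
outer_of (loop_sketch *l)
{
  return l->outer;
}

// Return the load count of the enclosing loop, or -1 if there is none.
static int
outer_load_count (loop_sketch *l)
{
  // The declaration lives in the condition, so `father` is visible only
  // inside this if statement and is tested for non-null in one step.
  if (loop_sketch *father = outer_of (l))
    return father->num_loads;
  return -1;
}
```

With an inner loop whose outer loop has num_loads set to 7, outer_load_count on the inner loop yields 7, and on the outermost loop it yields -1.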