Hi,

On 18/02/16 01:51, Virendra Kumar Pathak wrote:
> Hi Toolchain Group,
> 
> I am trying to study the effect of loop buffer size on loop unrolling &
> the way gcc (aarch64) handles this.

It depends on the micro-architecture. Usually, a loop buffer holds the
whole loop and feeds the instruction fetch unit from there. The main
benefit has traditionally been reduced dynamic energy, i.e., you don't
access the L1 instruction cache on every loop iteration.

Loop unrolling, on the other hand, removes some of the loop-control
instructions (compare and branch) and allows the compiler to optimize
across loop iterations.
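
To make the unrolling point concrete, here is a minimal hand-written
sketch in C. The function names are made up for illustration, and the
code GCC actually generates will differ; it only shows that the unrolled
loop executes one compare-and-branch per four elements instead of one
per element, at the cost of a larger loop body.

```c
#include <stddef.h>

/* Rolled loop: one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4 by hand: one compare-and-branch per four elements,
   followed by a remainder loop for the last n % 4 elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

The larger body is exactly what can stop the loop from fitting in a
small loop buffer, which is why the two features interact.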


> 
> To my understanding, a loop buffer is like an i-cache that holds
> pre-decoded instructions, which can be re-used if a branch instruction
> loops back to an instruction that is still present in the buffer.
> For example, in Intel’s Nehalem the loop buffer size is 28 u-ops.
> In the LLVM compiler, it seems LoopMicroOpBufferSize serves the same
> purpose.
> However, I could not find any parameter/variable inside config/aarch64
> representing loop buffer size. I am using Linaro gcc 5.2.1
> 
> [Question]
> 1. Is there any example inside aarch64 (or in general) which uses the
> loop buffer size in loop unrolling decision? If yes, could you please
> mention the relevant files or code section?

Look at this patch for x86:
https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02567.html

This is implemented using TARGET_LOOP_UNROLL_ADJUST as you have found out.
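
A minimal sketch of the idea, written as standalone C rather than as a
GCC hook: the function name and the loop_buffer_insns parameter are
hypothetical, and treating the instruction count as the buffer occupancy
is an assumption (a real buffer may count micro-ops). It caps the
unroll factor so that the unrolled body still fits in the buffer.

```c
/* Hypothetical model, not GCC code.
   nunroll:           unroll factor proposed by the generic heuristics
   ninsns:            number of instructions in the loop body
   loop_buffer_insns: assumed loop buffer capacity, in instructions */
unsigned cap_unroll_to_loop_buffer(unsigned nunroll, unsigned ninsns,
                                   unsigned loop_buffer_insns)
{
    /* Without size information, leave the proposed factor alone. */
    if (ninsns == 0 || loop_buffer_insns == 0)
        return nunroll;

    /* Largest number of body copies that fits in the buffer. */
    unsigned max_fit = loop_buffer_insns / ninsns;

    /* Assumption: if even one copy does not fit, the buffer cannot
       help, so do not restrict unrolling on its account. */
    if (max_fit == 0)
        return nunroll;

    return nunroll < max_fit ? nunroll : max_fit;
}
```

For example, with a 7-instruction body and an assumed 28-entry buffer, a
proposed factor of 8 would be reduced to 4. A real implementation would
apply similar logic inside the target's loop_unroll_adjust function.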

Thanks,
Kugan


> 2. Otherwise, any guidance/input on adding this support to the aarch64
> backend, assuming the architecture has a loop buffer?
> 
> [My Experiments/Code Browsing]
> I have collected following information from code browsing. Please
> correct if I missed or misunderstood something.
> 
> TARGET_LOOP_UNROLL_ADJUST - This target hook returns the number of times
> a loop can be unrolled.
> It can be used to handle architecture constraints such as the number of
> memory references inside a loop, e.g. ix86_loop_unroll_adjust() &
> s390_loop_unroll_adjust().
> On the same note, can this be used to handle loop buffer size too?
> 
> Without the above hook, parameters in loop-unroll.c such as
> PARAM_MAX_UNROLLED_INSNS (default 200) and PARAM_MAX_AVERAGE_UNROLLED_INSNS
> (default 80) decide the unrolling factor, e.g. nunroll = PARAM_VALUE
> (PARAM_MAX_UNROLLED_INSNS) / loop->ninsns;
> 
> In config/aarch64/aarch64.c, I found the align_loops variable in the
> aarch64_override_options_after_change() function.
> I guess this is the alignment applied to the loop header in the
> executable. This should not play any role in loop unrolling, right?
> 
> So any guidance on how we can instruct aarch64 backend to utilize loop
> buffer size in deciding the loop unrolling factor?
> 
> Thanks in advance for your time.
> 
> -- 
> with regards,
> Virendra Kumar Pathak
> 
> 
> _______________________________________________
> linaro-toolchain mailing list
> linaro-toolchain@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-toolchain
> 