On 28/08/16 23:16, Fredrik Hederstierna wrote:
> Hi,
> 
> From time to time I get the impression that the intra-procedure-call
> scratch register r12 (ip) is not used as often as it might be on ARM.
> 
> Example, compiled with GCC-6.2 for arm966e-s ARM with arm-none-eabi-gcc 
> target:
> 
> struct data {
>   int flags;
> };
> 
> extern void* func(struct data* dp);
> 
> struct data* test(struct data* dp)
> {
>   int saved_flags = dp->flags;
>   struct data *dp2 = func(dp);
>   dp->flags = saved_flags;
>   return dp2;
> }
> 
> 
> Small simple function that compiles to (using GCC-6.2 with either -Os or -O2)
> 
> 00000000 <test>:
>    0:   e92d4070        push    {r4, r5, r6, lr}
>    4:   e1a04000        mov     r4, r0
>    8:   e5905000        ldr     r5, [r0]
>    c:   ebfffffe        bl      0 <func>
>   10:   e5845000        str     r5, [r4]
>   14:   e8bd8070        pop     {r4, r5, r6, pc}
> 
> 
> This is a short example in which a function calls another function and
> saves one value from a structure, which needs to be restored afterwards.
> 
> I guess it's in the ABI to keep the stack 64-bit aligned, but the code
> still isn't optimal. Instead of pushing values to the stack, the r12
> scratch register could be used in some cases.
> 
> Couldn't this be compiled to the following, using r12 'ip':
> 
> 00000000 <test>:
>    0:   xxxxxxxx        push    {r4, lr}
>    4:   xxxxxxxx        mov     r4, r0
>    8:   xxxxxxxx        ldr     ip, [r0]
>    c:   xxxxxxxx        bl      0 <func>
>   10:   xxxxxxxx        str     ip, [r4]
>   14:   xxxxxxxx        pop     {r4, pc}
> 
No.  IP can be clobbered either by func itself or any inter-procedural
veneer that might be generated by the linker.  You'd need to prove that
neither could happen before IP could be used to hold a value over a
function call.
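
As a rough illustration of the point above, here is a minimal sketch in C
of which core registers a caller may rely on across a call under the
AAPCS.  The function name preserved_across_call is hypothetical, and
treating r9 as callee-saved is an assumption (the AAPCS leaves its role
to the platform).  This is not how GCC models register costs.

```c
#include <stdbool.h>

/* Sketch only: does core register r<regno> keep its value across a
 * procedure call under the AAPCS?  regno is expected to be 0..15.
 * Assumption: r9 is treated as callee-saved; its role is
 * platform-defined. */
static bool preserved_across_call(int regno)
{
    if (regno >= 4 && regno <= 11)
        return true;    /* r4-r11 (v1-v8): callee-saved */
    if (regno == 13)
        return true;    /* sp: restored by the callee on return */
    /* r0-r3: argument/result/scratch registers.
     * r12 (ip): may be clobbered by the callee or by a linker veneer.
     * r14 (lr): overwritten by the bl instruction itself.
     * r15 (pc): not a value-holding register in this sense. */
    return false;
}
```

With this classification, r5 (used in the compiled output) is a safe home
for saved_flags over the call to func, while r12 is not.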

R.

> The stack is still 64-bit aligned. Although there are no fewer
> instructions, the code should be faster since there are 2 fewer loads
> and 2 fewer stores to (possibly external) memories.
> 
> I know the high registers r8-r12 are not always preferable with Thumb-1
> or Thumb-2, but for ARM the penalty is smaller, I think, and maybe ip
> could be used more often?
> 
> How is the cost calculated for ip on ARM? It should in some sense be
> rather 'cheap', since you don't have to push it to the stack for
> inter-procedure calls.
> 
> Thanks, and Best Regards,
> Fredrik
> 
