[Bug middle-end/40887] GCC generates suboptimal code for indirect function calls on ARM

ramana at gcc dot gnu dot org Tue, 28 Jul 2009 01:29:59 -0700


------- Comment #2 from ramana at gcc dot gnu dot org  2009-07-28 08:29 -------
(In reply to comment #0)
> Consider the following code:
> 
> int (*indirect_func)();
> 
> int indirect_call()
> {
>     return indirect_func();
> }
> 
> gcc 4.4.0 generates the following with -O2 -mcpu=cortex-a8 -S:
> 
> indirect_call:
>     @ args = 0, pretend = 0, frame = 0
>     @ frame_needed = 0, uses_anonymous_args = 0
>     movw    r3, #:lower16:indirect_func
>     stmfd   sp!, {r4, lr}
>     movt    r3, #:upper16:indirect_func
>     mov     lr, pc
>     ldr     pc, [r3, #0]
>     ldmfd   sp!, {r4, pc}
> 
> The problem is that the instruction "ldr pc, [r3, #0]" is not considered a
> function call by the Cortex-A8's branch predictor, as noted in DDI0344J,
> section 5.2.1, Return stack predictions. Thus, the return from the called
> function is mispredicted, resulting in a penalty of 13 cycles compared to a
> direct call.
> 
> Rather than doing
> mov lr, pc
> ldr pc, [r3]
> it should instead use the blx instruction, like so:
> ldr lr, [r3]
> blx lr
> which is considered a function call by the branch predictor, and has an
> overhead of only one cycle compared to a direct call.


The point made is correct, but there is something you've missed in your patch:
loading lr with the address of the function you want to call destroys the
return address, so your code is never going to return!

Instead you want:

ldr r3,[r3]
blx r3

Or better still, bx r3, but that is PR19599 :)
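
For anyone reproducing this, a self-contained variant of the test case from
comment #0 may be handy. This is only a sketch: target_func is a hypothetical
callee that is not part of the original report, and the (void) prototypes are
an added assumption. indirect_call itself is otherwise the same as in
comment #0.

```c
/* Function pointer from the test case in comment #0.
 * Assumption: declared with a (void) prototype instead of (). */
int (*indirect_func)(void);

/* Hypothetical callee, added only so the pointer has a valid target. */
int target_func(void)
{
    return 42;
}

/* The function whose generated call sequence is under discussion. */
int indirect_call(void)
{
    return indirect_func();
}
```

Compiling this with the flags from comment #0 (-O2 -mcpu=cortex-a8 -S) should
show the call sequence emitted for indirect_call; the pointer has to be set
(for example, indirect_func = target_func) before the function is actually run.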

> 
> gcc -v:
> Using built-in specs.
> Target: arm-none-linux-gnueabi
> Configured with: ../gcc-4.4.0/configure --target=arm-none-linux-gnueabi
> --prefix=/usr/local/arm --enable-threads
> --with-sysroot=/usr/local/arm/arm-none-linux-gnueabi/libc
> Thread model: posix
> gcc version 4.4.0 (GCC)
> 


-- 

ramana at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-07-28 08:29:41
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40887
