During Linux kernel development we ran into a few situations that showed
that indirect calls (through a function pointer) are significantly slower on
IA64 than on other platforms. Various ugly workarounds have been added to the
kernel to deal with that.

Some investigation shows the code gcc generates for indirect calls on ia64
isn't very good.

The IA64 optimization manuals recommend loading branch registers as early
as possible before an indirect jump, so that the CPU can start fetching
the code stream at the target. Otherwise there is a longer stall.

I ran some statistics over a 2.6.19 Linux kernel built with a recent 4.3
snapshot by grepping for indirect calls, and in nearly all cases I looked at
the branch register was loaded in the bundle directly preceding the bundle
that contains the jump. Earlier versions (4.1 and 4.0) weren't any better
either.

From looking at the code, in many cases it would have been
possible to load the branch register earlier, since there was no
conditional state involved.
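
To illustrate the kind of code meant here, the following is a minimal sketch
and not taken from the kernel; the names `struct ops`, `add_one` and
`dispatch` are made up for this example. The call target depends only on the
`o` argument, which is available on function entry, so nothing in the
arithmetic before the call prevents the compiler from loading the branch
register first and doing the independent work afterwards.

```c
/* Hypothetical ops table in the style of kernel "ops" structures. */
struct ops {
    long (*handler)(long arg);
};

/* Hypothetical handler used only to exercise the indirect call. */
long add_one(long x)
{
    return x + 1;
}

/*
 * The target o->handler is known on entry and is not modified before
 * the call, and no branch decides which target is used. On ia64 the
 * move of the target address into a branch register could therefore
 * be scheduled ahead of the arithmetic below instead of in the bundle
 * right before the call.
 */
long dispatch(const struct ops *o, long a, long b)
{
    long t = a * 3 + b;   /* work that does not depend on the target */
    t ^= t >> 7;
    return o->handler(t); /* indirect call through a function pointer */
}
```

Compiling such a function with -O2 -S for ia64 and checking the distance
between the move to the branch register and the br.call is one way to see
where the scheduler places the load.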

This is an enhancement request to make the scheduler more aggressive
about moving branch register loads earlier, ahead of jumps, on ia64.


-- 
           Summary: Branch registers loaded too late on ia64
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ak at muc dot de
GCC target triplet: ia64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688
