This is the gcc-4.0 reincarnation of PR 15242.  Since about gcc-3.2 or
so, gcc tends to compile "goto *" into direct jumps to a shared
indirect jump.  Gcc-4.0 tries to undo this in a later stage, but
apparently it is not completely successful.

This is a fragment from the engine1.i file that I will
attach (with some newlines removed):

H_IALOAD: __asm__(""); I_IALOAD:
{
java_arrayheader * aArray;
s4 iIndex;
Cell vResult;
;
((aArray) = (java_arrayheader * )(sp[1]));
((iIndex)=(s4)(Cell)(spTOS));

sp += 1;
{
# 188 "./java.vmg"
{
  { if ((aArray) == ((void *)0)) { goto *throw_nullpointerexception; } };
  { if (( ((java_arrayheader*)(aArray))->size ) <= (u4) (iIndex)) {
arrayindexoutofbounds_index = (iIndex); goto
*throw_arrayindexoutofboundsexception; } };
  ;
  vResult = ((((java_intarray*)(aArray))->data)[iIndex]);
}
# 332 "java-vm.i"
}

;
((spTOS) = (Cell)(vResult));
J_IALOAD: __asm__("");
do {ca=*(ip++);} while(0);
K_IALOAD: __asm__("");
goto before_goto;
}

After compiling this with "gcc-4.0.0 -fno-reorder-blocks -O2 -g3 -S
engine1.i", the assembly output for this fragment is:

.L995:
        jmp     *%rdx
...
.L9:
.LBB46:
        .loc 115 314 0
        addq    $8, %r15
        .loc 115 315 0
        movl    -168(%rbp), %eax
        .loc 114 189 0
        movq    -136(%rbp), %rdx
        .loc 115 314 0
        movq    (%r15), %r9
        .loc 114 189 0
        testq   %r9, %r9
        je      .L995
        .loc 114 190 0
        cmpl    %eax, 16(%r9)
        ja      .L764
        movq    -152(%rbp), %rdx
        movl    %eax, -116(%rbp)
.LBE46:
        .loc 2 231 0
        jmp     *%rdx
.L764:
.LBB47:
        .loc 114 192 0
        cltq
        movslq  24(%r9,%rax,4),%r9
        movq    %r9, -168(%rbp)
.L195:
        .loc 115 342 0
        .loc 115 343 0
        movq    (%r14), %r13
        addq    $8, %r14
.L381:
        .loc 115 344 0
        jmp     .L560


So while gcc managed to reconstruct the second indirect jump, it did
not succeed for the first "goto *", which is pessimised into a
conditional branch to a shared indirect jump.
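
For readers without engine1.i at hand, the pattern can be reduced to
something like the following sketch.  It is not taken from Cacao:
load_elem, the label names and the return codes are made up for
illustration, and the volatile qualifiers are only there so that a toy
this small does not get its jump targets constant-folded.  The desired
code for each guarded "goto *" is a test followed by its own indirect
jmp, not a conditional branch to a shared one.  Whether gcc-4.0 shows
the same pessimisation on this reduced version is untested.

```c
#include <stddef.h>

/* Hypothetical miniature of the IALOAD checks.  Returns the element,
   -1 for a null array, -2 for an out-of-bounds index. */
int load_elem(const int *arr, unsigned size, unsigned idx)
{
    /* Handler addresses live in variables, as in the interpreter.
       volatile is an assumption of this sketch, used to stop the
       compiler from turning the indirect jumps into direct ones. */
    void *volatile throw_null = &&do_null;
    void *volatile throw_oob  = &&do_oob;
    void *volatile next       = &&done;
    int result = 0;

    if (arr == NULL)
        goto *throw_null;      /* first guarded "goto *" */
    if (size <= idx)
        goto *throw_oob;       /* second guarded "goto *" */
    result = arr[idx];
    goto *next;                /* normal dispatch */

do_null:
    return -1;
do_oob:
    return -2;
done:
    return result;
}
```

Compiling this with "gcc -O2 -S" and counting the "jmp *" instructions
shows whether each "goto *" kept its own indirect jump.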

Code coming from "gcc version 4.0.2 (Debian 4.0.2-2)" or gcc-4.0.0
without -fno-reorder-blocks is similar.

The impact of this pessimisation is that we cannot use "selective
inlining" for JVM instructions that can throw exceptions, like
"getfield"; a rough guess at the resulting slowdown for the Cacao JVM
interpreter is a factor of 1.2.

Hmm, I guess that the intermediate direct unconditional jump is
optimised away, and that leads to the inability to reconstruct the
indirect jump.  Maybe I can work around this problem by putting an
__asm__("") or a label between the if and the "goto *" to prevent the
optimisation, but:

1) Will the workaround still work with the next gcc?

2) The workarounds start to accumulate.
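
The workaround considered above might look roughly like the following
sketch.  The names (load_elem_workaround, the labels, the return codes)
are hypothetical and the code is not from the Cacao engine; the
volatile qualifiers are again only an artifact of the toy.  Whether the
empty asm really keeps gcc from merging the indirect jumps is exactly
the open question, so this is untested as a fix.

```c
#include <stddef.h>

/* Same hypothetical miniature as before, with an empty asm placed
   between each "if" and its "goto *". */
int load_elem_workaround(const int *arr, unsigned size, unsigned idx)
{
    void *volatile throw_null = &&do_null;
    void *volatile throw_oob  = &&do_oob;
    void *volatile next       = &&done;
    int result = 0;

    if (arr == NULL) {
        __asm__("");           /* intended to keep the block distinct */
        goto *throw_null;
    }
    if (size <= idx) {
        __asm__("");
        goto *throw_oob;
    }
    result = arr[idx];
    goto *next;

do_null:
    return -1;
do_oob:
    return -2;
done:
    return result;
}
```

The empty asm generates no instructions; it only gives the optimiser
something it may not delete between the branch and the indirect jump.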


-- 
           Summary: pessimization of goto * ("computed goto")
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: anton at mips dot complang dot tuwien dot ac dot at
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25285
