[Bug rtl-optimization/15242] [3.3/3.4 regression] pessimization of "goto *"

anton at mips dot complang dot tuwien dot ac dot at Sat, 12 Mar 2005 13:38:52 -0800

------- Additional Comments From anton at mips dot complang dot tuwien dot ac 
dot at  2005-03-12 21:38 -------
Subject: Re:  [3.3/3.4 regression] pessimization of "goto *"

steven at gcc dot gnu dot org wrote:
> 
> 
> ------- Additional Comments From steven at gcc dot gnu dot org  2005-03-10 
> 12:48 -------
> > Maybe there should be another combining pass after the duplication
> > of the indirect jumps.  Should I create another PR for this?
> 
> There should not be another "combining" pass (you really mean constant
> propagation).

I meant "Instruction combination (`combine.c')".  Not sure if this is
replaced by something else in the recent gccs.  Why do you think I
mean constant propagation?

>  This new unfactoring stuff runs after register allocation,
> so such a pass would not really help, except maybe to make the code look
> prettier to you.

Ouch.  No way to fix that?  That's the cost we wanted to avoid.

> But, is this:
> 
>         mov    0xfffffffc(%ebx),%eax
>         jmp    *%eax
> 
> slower than this:
> 
>         jmp    *0xfffffffc(%ebx)
> 
> or have you not tried that (e.g. by hacking the assembly by hand)?

Ok, I hacked the assembly by hand, and this is what I got:

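All numbers are user times in seconds for gforth-fast-0.6.2; "combined?"
says whether the load and the indirect jump are combined into a single
"jmp *mem" instruction: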

Pentium-4 2.26 GHz (i386 code):
           no-dynamic  no-super   dynamic    
combined?  yes  no     yes  no    yes  no    
siev       0.47 0.49   0.36 0.36  0.33 0.33  
bubble     0.81 0.81   0.52 0.53  0.47 0.47  
matrix     1.03 1.01   0.30 0.30  0.36 0.35  
fib        0.70 0.68   0.75 0.60  0.53 0.58  

Opteron 2GHz (i386 code):
           no-dynamic  no-super   dynamic    
combined?  yes  no     yes  no    yes  no    
siev       0.46 0.47   0.37 0.36  0.33 0.32
bubble     0.73 0.74   0.50 0.51  0.50 0.51
matrix     0.93 0.95   0.35 0.34  0.31 0.32
fib        0.63 0.64   0.49 0.50  0.44 0.45

"No-super" performs the same number of indirect branches (and of
everything else) as "no-dynamic", but has better branch prediction.
"Dynamic" is like "no-super", but eliminates many of the indirect
branches.

So, overall the instruction combination alone does not make much of a
difference on these CPUs.
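
For readers who have not seen the construct this PR is about, here is a
minimal sketch of threaded-code dispatch with GNU C labels-as-values
("goto *").  It is not taken from Gforth; the function name run, the
three primitives and the tiny program are hypothetical and chosen only
for illustration.  Each primitive ends with its own "goto **ip++", which
is the kind of source that ends up as either the mov/jmp pair or the
single "jmp *mem" discussed above, depending on what the compiler does.

```c
/* Hypothetical sketch, not Gforth source: a tiny threaded-code
   interpreter using GNU C labels-as-values ("goto *").
   Needs GCC or a compatible compiler. */
#include <stdint.h>

/* Runs a fixed threaded-code program (2 3 +) and returns the
   top of the stack. */
long run(void)
{
    /* Threaded code: each cell holds either the address of a
       primitive's label or an inline literal operand. */
    void *program[] = {
        &&lit, (void *)(intptr_t)2,
        &&lit, (void *)(intptr_t)3,
        &&add,
        &&halt
    };
    void **ip = program;   /* instruction pointer into the threaded code */
    long stack[8];
    long *sp = stack;      /* points at the next free stack slot */

    goto **ip++;           /* dispatch the first primitive */

lit:                       /* push the literal that follows in the code */
    *sp++ = (long)(intptr_t)*ip++;
    goto **ip++;           /* every primitive has its own indirect jump */
add:                       /* replace the top two items by their sum */
    sp[-2] += sp[-1];
    sp--;
    goto **ip++;
halt:                      /* stop and return the top of the stack */
    return sp[-1];
}
```

Calling run() returns 5.  Whether the compiler keeps one indirect jump
per primitive or factors them into a shared one depends on the GCC
version and options, which is the subject of this PR.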

- anton

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15242

[Bug rtl-optimization/15242] [3.3/3.4 regression] pessimization of "goto *"