In one of my recent messages about a patch to the LTO branch, I
mentioned that we could compile and successfully run all of the C
SPECint benchmarks except 176.gcc.  Chris Lattner asked if I had done
any benchmarking now that real programs could be run; I said that I
hadn't but would try to do some soon.  Here are the results.

I don't have numbers on what compile times look like, but I don't think
they're good.  176.gcc takes several minutes to compile (basically -flto
*.o, not counting the time to compile individual .o files); the other
benchmarks are all a minute or more apiece.

Executive summary: LTO is currently *not* a win.

In the table below, runtimes are in seconds.  I ran the tests on an
8-core 1.6GHz machine with 8 GB RAM.  I believe the machine was
relatively idle; I ran the tests on a weekend evening.  The last merge
from mainline to the LTO branch was mainline r130155, so that's about
what the -O2 numbers correspond to--I don't think we've changed too much
core code on the branch.  The % change figures are just in-my-head
estimates, using -O2 as a baseline.

                -O2     -flto   % change
164.gzip        174     176     + 1
175.vpr         139     143     + 3
181.mcf         162     166     + 3
186.crafty      65.2    66.6    + 2
197.parser      240     261     + 9
253.perlbmk     119     133     + 13
254.gap         84.4    87      + 4
256.bzip2       131     145     + 11
300.twolf       202     193     - 4 (!)
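
For anyone who wants exact figures rather than hand estimates, here is
a minimal sketch that recomputes the % change column from the two
runtime columns, using -O2 as the baseline.  The names RUNTIMES,
percent_change and report are purely illustrative; the runtimes are
copied from the table.  Exact values may differ from the hand estimates
by a point or so in places.

```python
# Runtimes in seconds, copied from the table: (benchmark, -O2, -flto).
RUNTIMES = [
    ("164.gzip", 174.0, 176.0),
    ("175.vpr", 139.0, 143.0),
    ("181.mcf", 162.0, 166.0),
    ("186.crafty", 65.2, 66.6),
    ("197.parser", 240.0, 261.0),
    ("253.perlbmk", 119.0, 133.0),
    ("254.gap", 84.4, 87.0),
    ("256.bzip2", 131.0, 145.0),
    ("300.twolf", 202.0, 193.0),
]


def percent_change(baseline, measured):
    """Change of `measured` relative to `baseline`, in percent.

    Positive means the measured run was slower than the baseline.
    """
    return (measured - baseline) / baseline * 100.0


def report(rows):
    """Format one line per benchmark: name, -O2, -flto, % change."""
    lines = []
    for name, o2, lto in rows:
        change = percent_change(o2, lto)
        lines.append(f"{name:<12} {o2:>6g} {lto:>6g} {change:+6.1f}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(RUNTIMES)))
```

A positive number means -flto was slower than -O2; 300.twolf is the
only row that comes out negative.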

176.gcc doesn't run correctly with LTO yet; 255.vortex didn't run
correctly with "mainline", but it did with -flto, which is curious.  We
don't do C++ yet, so 252.eon is not included.

In general, things get worse with LTO, sometimes much worse.  I can
think of at least three possible reasons off the top of my head:

- Alias information.  We don't have any type-based alias information in
  -flto, which hurts.

- We don't merge types between compilation units, which could account
  for poor optimization behavior.

- I believe we lose some information in the LTO write/read process; edge
  probabilities, estimated instruction counts for functions, etc. get lost.
  This hurts inlining decisions, block layout, alignment of jump
  targets, etc.  So there's information we need to write out or
  recompute.

-Nathan
