4.2.0 -O3 compile takes two times longer earlier versions

steven at gcc dot gnu dot org Tue, 10 Jan 2006 14:39:25 -0800


------- Comment #24 from steven at gcc dot gnu dot org  2006-01-10 22:39 -------
Realistically, the prospects are that this problem won't be fixed until compile
time gets on the GCC developers' radar for real.  The next release always
promises  to be faster, but usually turns out to be disappointingly slow in the
end.


Note that in your case, there really is only one real bug.  The other slow
passes are inherently slow for very large functions.  The top 3 slow passes of
your latest time report was:

 tree SSA incremental  :   3.19 (12%) usr
 scheduling 2          :   2.03 ( 8%) usr
 scheduling            :   1.52 ( 6%) usr

The tree SSA incremental bit is non-linear behavior when computing dominance
frontiers (inevitable) plus non-linear behavior due to massive bitmap abuse. 
This incremental update tries to work on portions of the CFG, but in practice
it almost always works on the whole function, so you end up doing far more work
than strictly necessary.  What's required to fix this is a region based
compilation model such that the SSA updater can work on dirty regions (i.e.
regions where an update is needed) without worrying about the other regions.  I
don't think this will be fixed any time soon :-(

As for the scheduler, well, list scheduling is just a quadratic algorithm:
O(n_insns_to_schedule^2) where n_insns_to_schedule is the number of insns in
the region or trace that you're scheduling.  This is one of the reasons why no
compiler schedules whole functions at once: You have to split them up to keep
compile times reasonable.  The fact that scheduling got slower for IA-64 has
two reasons.  First, basic blocks are typically larger for IA-64 in GCC 4.x
than in GCC 3.x, and typically GCC 4.x can find longer traces than GCC 3.x, so
your n_insns_to_schedule is larger (I measured this on ia64-linux).  Second,
the scheduler model for IA-64 is just incredibly complicated, and a significant
amount of time is simply lost there because the IA-64 automaton is _huge_.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23835

[Bug tree-optimization/23835] [4.1/4.2 Regression] case where gcc 4.1/4.2.0 -O3 compile takes two times longer earlier versions

Reply via email to