------- Comment #1 from steven at gcc dot gnu dot org 2007-10-20 10:37 ------- The first issue with code hoisting is that it can only hoist lexically equivalent expressions. This means that dependent expressions that compute the same value can unfortunately not be hoisted in one pass. Example:
BB1 // r1, r2, and r3 are live if (...) goto BB2 else goto BB3 {succ BB2, BB3 } { pred BB1 } BB2 r4 <- r1 + r2 r5 <- r3 + r4 goto BB4 { succ BB4 } { pred BB1 } BB3 r6 <- r1 + r2 r7 <- r3 + r6 goto BB4 { succ BB4 } {pred BB2, BB3, ... } BB4 // etc. In the example above, GCC's gcse.c code hoisting implementation should be able to hoist the expression "r1 + r2" from BB2 and BB3 to the end of BB1. An assignment to, say, r8, is inserted into BB1, and r8 is substituted on the rhs of the assignments to r3 in BB2 and r6 in BB3. The expressions "r3 + r4" and "r3 + r6" compute the same value, but they are not lexically equivalent, so gcse.c can't hoist this expression. Thus, after one code hoisting pass, the code would look like this: BB1 // r1, r2, and r3 are live r8 <- r1 + r2 if (...) goto BB2 else goto BB3 {succ BB2, BB3 } { pred BB1 } BB2 r4 <- r8 r5 <- r3 + r4 goto BB4 { succ BB4 } { pred BB1 } BB3 r6 <- r8 r7 <- r3 + r6 goto BB4 { succ BB4 } {pred BB2, BB3, ... } BB4 // etc. Copy propagation will now propagate r8 into the assignments to r5 and r7, with the following result: BB1 // r1, r2, and r3 are live r8 <- r1 + r2 if (...) goto BB2 else goto BB3 {succ BB2, BB3 } { pred BB1 } BB2 r4 <- r8 r5 <- r3 + r8 goto BB4 { succ BB4 } { pred BB1 } BB3 r6 <- r8 r7 <- r3 + r8 goto BB4 { succ BB4 } {pred BB2, BB3, ... } BB4 // etc. A second code hoisting pass would now be able to hoist the expression "r3 + r8" from BB2 and BB3 into BB2. After dead code elimination and cfg cleanups, the optimal code would be: BB1 // r1, r2, and r3 are live r8 <- r1 + r2 r9 <- r3 + r8 goto BB4 {succ BB4 } {pred BB1, ... } BB4 // etc. GCC currently runs only one code hoisting pass (and then only when optimizing for code size), so GCC never would hoist the expression "r3 + r8". To fully optimize this example, more code hoisting passes are necessary, or an improved code hoisting implementation is necessary. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828