------- Comment #16 from ubizjak at gmail dot com 2010-01-06 15:17 -------
The problem turns out to be a quite complex interaction between the cse1, cprop3 and cse2 passes.

Let's start with this RTL dump from:

tree-ssa-loop-im.i.148r.subreg1

;; Function determine_max_movement (determine_max_movement)

...
L34:
   35 NOTE_INSN_BASIC_BLOCK
   36 [r73:DI]=r69:DI
   37 r155:SI=[r152:DI]
   38 r154:QI#0=zero_extract(r155:SI#0,0x8,0x0)
   39 r78:DI=zero_extend(r154:QI)
      REG_EQUAL: zero_extend([r152:DI])
   40 pc={(r78:DI==0x0)?L579:pc}
      REG_BR_PROB: 0x1388
...
L571:
  572 NOTE_INSN_BASIC_BLOCK
  573 r351:SI=[r152:DI]
  574 r350:QI#0=zero_extract(r351:SI#0,0x8,0x0)
  575 r82:DI=zero_extend(r350:QI)
      REG_EQUAL: zero_extend([r152:DI])
L579:
  580 NOTE_INSN_BASIC_BLOCK
  581 r353:SI=[r152:DI]
  582 r352:QI#0=zero_extract(r353:SI#0,0x8,0x0)
  583 r82:DI=zero_extend(r352:QI)
      REG_EQUAL: zero_extend([r152:DI])

Please note the REG_EQUAL notes in (insn 39) and (insn 583).

Next, cse1 does its job and figures out that control flow jumps to L579 only when [r152:DI] is zero. This zero is also propagated to r82 in (insn 583):

tree-ssa-loop-im.i.150r.cse1:

;; Function determine_max_movement (determine_max_movement)

...
L34:
   35 NOTE_INSN_BASIC_BLOCK
   36 [r73:DI]=r69:DI
   37 r155:SI=[r152:DI]
   38 r154:QI#0=zero_extract(r155:SI#0,0x8,0x0)
   39 r78:DI=zero_extend(r154:QI)
      REG_EQUAL: zero_extend([r152:DI])
   40 pc={(r78:DI==0x0)?L579:pc}
      REG_BR_PROB: 0x1388
...
L571:
  572 NOTE_INSN_BASIC_BLOCK
  573 r351:SI=r155:SI
  574 r350:QI#0=zero_extract(r155:SI#0,0x8,0x0)
  575 r82:DI=r78:DI
      REG_EQUAL: zero_extend([r152:DI])
L579:
  580 NOTE_INSN_BASIC_BLOCK
  581 r353:SI=r155:SI
  582 r352:QI#0=zero_extract(r155:SI#0,0x8,0x0)
  583 r82:DI=0x0
      REG_EQUAL: zero_extend([r152:DI])

After the passes in between, we find the following in the cprop3 dump:

tree-ssa-loop-im.i.168r.cprop3:

;; Function determine_max_movement (determine_max_movement)

...
L34:
   35 NOTE_INSN_BASIC_BLOCK
   36 [r73:DI]=r69:DI
      REG_DEAD: r69:DI
   37 r355:SI=[r152:DI]
   38 r154:QI#0=zero_extract(r355:SI#0,0x8,0x0)
   39 r78:DI=zero_extend(r154:QI)
      REG_DEAD: r154:QI
      REG_EQUAL: zero_extend([r152:DI])
  583 r82:DI=0x0
      REG_EQUAL: zero_extend([r152:DI])
   40 pc={(r78:DI==0x0)?L230:pc}
      REG_BR_PROB: 0x1388
...
L230:
  231 NOTE_INSN_BASIC_BLOCK
  232 r104:DI=sign_extend([r73:DI+0x18])
  233 r207:DI=r82:DI==0x1
  633 r356:DI=leu(r82:DI,0x5)
  234 pc={(r207:DI==0x0)?L241:pc}
      REG_DEAD: r207:DI
      REG_BR_PROB: 0x1ae8
  235 NOTE_INSN_BASIC_BLOCK
  236 r209:DI=`compiler_params'
  237 r208:DI=[r209:DI]
      REG_DEAD: r209:DI
      REG_EQUAL: [`compiler_params']
  238 r107:DI=sign_extend([r208:DI+0x748])
      REG_DEAD: r208:DI
...

Since both r78 and r82 are marked as equal to the same location, cse2 wisely determines that both are equal to zero and removes all blocks from the conditional jump onward. Things go down the drain from here.

tree-ssa-loop-im.i.169r.cse2:

...
L34:
   35 NOTE_INSN_BASIC_BLOCK
   36 [r73:DI]=r69:DI
      REG_DEAD: r69:DI
   37 r355:SI=[r152:DI]
   38 r154:QI#0=zero_extract(r355:SI#0,0x8,0x0)
  583 r82:DI=0x0
      REG_EQUAL: zero_extend([r152:DI])
  232 r104:DI=sign_extend([r73:DI+0x18])
  233 r207:DI=r82:DI==0x1
  633 r356:DI=leu(r82:DI,0x5)
  234 pc={(r207:DI==0x0)?L241:pc}
      REG_DEAD: r207:DI
      REG_BR_PROB: 0x1ae8
  235 NOTE_INSN_BASIC_BLOCK
  236 r209:DI=`compiler_params'
  237 r208:DI=[r209:DI]
      REG_DEAD: r209:DI
      REG_EQUAL: [`compiler_params']
  238 r107:DI=sign_extend([r208:DI+0x748])
      REG_DEAD: r208:DI
...

So it looks to me that when gcc figures out a constant in one arm of an IF expression, it should either:

a) remove the REG_EQUAL note when the constant is propagated, since this constant depends on the location of the insn, or

b) remove the REG_EQUAL note when the insn is hoisted, for the same reason.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42511
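
To make the failure mode and the two proposed remedies a) and b) easier to follow, here is a rough toy model in Python. It is only a sketch and not GCC code: the Insn class, the helper functions and the flag names are all invented for illustration, and the "cse2-like" check is a deliberate simplification of whatever cse2 really does with REG_EQUAL notes.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Insn:
    """Toy stand-in for an RTL set; not GCC's real data structure."""
    uid: int
    dest: str
    src: Union[int, str]             # constant or opaque expression text
    reg_equal: Optional[str] = None  # toy REG_EQUAL note


def propagate_path_constant(insn: Insn, value: int, drop_note: bool) -> None:
    """Substitute a constant that is only known on one arm of a branch."""
    insn.src = value
    if drop_note:
        # Remedy a): the constant is path-dependent, so the note goes.
        insn.reg_equal = None


def hoist(insn: Insn, drop_note: bool) -> None:
    """Pretend to move the insn above the branch; remedy b) drops the note."""
    if drop_note:
        insn.reg_equal = None


def concludes_always_zero(cond: Insn, hoisted: Insn) -> bool:
    """Toy cse2-like step: matching notes plus a zero source imply zero."""
    same_note = (cond.reg_equal is not None
                 and cond.reg_equal == hoisted.reg_equal)
    return same_note and hoisted.src == 0


def demo(drop_on_propagate: bool = False, drop_on_hoist: bool = False) -> bool:
    """Return True if the toy pass wrongly decides r78 is always zero."""
    note = "zero_extend([r152:DI])"
    insn39 = Insn(39, "r78", "zero_extend(r154:QI)", note)
    insn583 = Insn(583, "r82", "zero_extend(r352:QI)", note)
    propagate_path_constant(insn583, 0, drop_on_propagate)  # cse1 step
    hoist(insn583, drop_on_hoist)                           # seen in cprop3 dump
    return concludes_always_zero(insn39, insn583)
```

With both flags left off, demo() returns True, i.e. the toy reproduces the bad inference from the stale note. Passing drop_on_propagate=True (remedy a) or drop_on_hoist=True (remedy b) makes it return False, because (insn 583) no longer claims to equal zero_extend([r152:DI]) once it sits outside the arm where that value was known to be zero.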