https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115506
--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
For the original testcase, the tree optimizers produce:
<bb 4> [local count: 114863530]:
_30 = _2 & 240;
if (_30 == 224)
goto <bb 7>; [34.00%]
else
goto <bb 5>; [66.00%]
<bb 5> [local count: 75809929]:
if (_30 <= 223)
goto <bb 6>; [50.00%]
else
goto <bb 7>; [50.00%]
and for the /* NOTE 1 */ workaround:
<bb 4> [local count: 114863530]:
_30 = _2 & 240;
if (_30 == 224)
goto <bb 7>; [34.00%]
else
goto <bb 5>; [66.00%]
<bb 5> [local count: 75809929]:
if (_30 <= 224)
goto <bb 6>; [50.00%]
else
goto <bb 7>; [50.00%]
If the tree optimizers had not over-optimized the original case and had left:
<bb 5> [local count: 75809929]:
if (_30 < 224)
goto <bb 6>; [50.00%]
else
goto <bb 7>; [50.00%]
then the RTL CSE2 pass would be able to merge:
(insn 31 30 32 4 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:QI 111 [ _30 ])
(const_int -32 [0xffffffffffffffe0]))) "pr115506.c":11:8 9 {*cmpqi_1}
(nil))
and
(insn 36 33 37 5 (set (reg:CC 17 flags)
(compare:CC (reg:QI 111 [ _30 ])
(const_int -33 [0xffffffffffffffdf]))) "pr115506.c":14:15 9 {*cmpqi_1}
(expr_list:REG_DEAD (reg:QI 111 [ _30 ])
(nil)))
Is there a way to avoid this over-optimization in the tree optimizers? The RTL
part has no way to update the user of the flags register during the CSE2 pass:
(jump_insn 37 36 38 5 (set (pc)
(if_then_else (gtu (reg:CC 17 flags)
(const_int 0 [0]))
(label_ref:DI 90)
(pc))) "pr115506.c":14:15 1130 {*jcc}
(expr_list:REG_DEAD (reg:CC 17 flags)
(int_list:REG_BR_PROB 536870916 (nil)))
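
The arithmetic behind the two compares can be checked with a small C sketch. It assumes _30 is an unsigned 8-bit value (the gtu in the jump suggests an unsigned compare) and that QImode constants are printed sign-extended, as in the dumps above. The helper names are hypothetical and are not taken from pr115506.c. The sketch shows that "<= 223" and "< 224" select the same values, so only the second form shares its constant (-32) with the preceding "== 224" compare, while the first form needs -33.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of pr115506.c. */

/* Form of the test as emitted by the tree optimizers: _30 <= 223. */
static int le_223(uint8_t v) { return v <= 223; }

/* Form that would reuse the constant of the "== 224" compare: _30 < 224. */
static int lt_224(uint8_t v) { return v < 224; }

/* Reinterpret an 8-bit constant as a signed value, which is how the RTL
   dumps above print QImode constants (224 -> -32, 223 -> -33). */
static int qi_signed(unsigned c)
{
  c &= 0xffu;
  return c >= 128u ? (int) c - 256 : (int) c;
}

/* Count inputs for which the two forms disagree when applied to
   _30 = _2 & 240, over every possible 8-bit _2. */
static int count_mismatches(void)
{
  int mismatches = 0;
  for (unsigned x = 0; x < 256; x++)
    {
      uint8_t v = (uint8_t) (x & 240u);
      if (le_223(v) != lt_224(v))
        mismatches++;
    }
  return mismatches;
}
```

Calling count_mismatches() from a test driver should report no disagreement, since the two forms are equivalent for integers. The only difference is which constant ends up in the compare insn.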