https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66573
Martin Sebor <msebor at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|c++                         |rtl-optimization
--- Comment #3 from Martin Sebor <msebor at gcc dot gnu.org> ---
Since this isn't C++-specific but rather the doing of the RTL optimizer, I
changed the component to rtl-optimization.
On powerpc64, which historically has implemented the same static branch
prediction strategy, and where IBM XLC emits the same code as Clang, GCC emits
the following at -O1:
foo:
        ...
        cmpdi 7,3,0
        beq 7,.L2
        bl bar1
        ...
        b .L1
.L2:
        bl bar2
        ...
.L1:
        ...
        blr
while the following at -O2:
foo:
        ...
        cmpdi 7,3,0
        ...
        bne 7,.L6
        bl bar2
        ...
        blr
.L6:
        bl bar1
        ...
        blr
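For reference, a function of roughly the following shape would produce the
two-way call structure seen in the listings above. This is only a guess at a
reduction, not the test case attached to the bug: the parameter type (long,
chosen because cmpdi is a 64-bit compare against zero) and the stub bodies of
bar1 and bar2 are assumptions made so the snippet is self-contained.

```c
/* Illustrative only: records which helper ran last. */
int last_called = 0;

/* Stub bodies are assumptions. In a real test these would be extern
   declarations, so the compiler cannot inline the calls away.  */
void bar1(void) { last_called = 1; }
void bar2(void) { last_called = 2; }

/* Hypothetical reduction: nonzero x calls bar1, zero x calls bar2,
   matching "beq 7,.L2" branching to the bar2 call at -O1.  */
void foo(long x)
{
    if (x)
        bar1();
    else
        bar2();
}
```

With such a function, the -O1 layout keeps the bar1 call on the fall-through
path, while the -O2 layout puts the bar2 call there instead.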
Comparing the RTL dumps between -O1 and -O2, it looks like the change is
introduced in the basic block reordering (bbro) pass, which only runs at -O2.
There, the if_then_else eq instruction
(jump_insn 7 6 8 2 (set (pc)
        (if_then_else (eq (reg:CC 75 7 [156])
                (const_int 0 [0]))
            (label_ref 12)
            (pc))) t.c:5 693 {*rs6000.md:12637}
     (expr_list:REG_DEAD (reg:CC 75 7 [156])
        (int_list:REG_BR_PROB 6100 (nil)))
 -> 12)
...
(call_insn 14 13 17 3 (parallel [
            (call (mem:SI (symbol_ref:DI ("bar2") [flags 0x41] <function_decl 0x3fff84018c10 bar2>) [0 bar2 S4 A8])
is replaced with
(jump_insn 7 6 13 2 (set (pc)
        (if_then_else (ne (reg:CC 75 7 [156])
                (const_int 0 [0]))
            (label_ref:DI 53)
            (pc))) t.c:5 693 {*rs6000.md:12637}
     (expr_list:REG_DEAD (reg:CC 75 7 [156])
        (int_list:REG_BR_PROB 3900 (nil)))
 -> 53)
...
(call_insn 14 13 34 3 (parallel [
            (call (mem:SI (symbol_ref:DI ("bar2") [flags 0x41] <function_decl 0x3fff84018c10 bar2>) [0 bar2 S4 A8])
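Note that the REG_BR_PROB values in the two dumps are complementary: when eq
is inverted to ne, 6100 becomes 3900. A minimal sketch of that arithmetic,
assuming the notes are scaled to a base of 10000 (which, as far as I know, is
what REG_BR_PROB_BASE was in GCC at the time); the names below are
illustrative and are not GCC functions:

```c
/* Assumed to mirror GCC's REG_BR_PROB_BASE; illustrative name. */
#define BR_PROB_BASE 10000

/* Probability of the branch being taken once its condition is inverted:
   the complement of the original probability with respect to the base.  */
int invert_br_prob(int prob)
{
    return BR_PROB_BASE - prob;
}
```

Read this way, the eq branch to the bar2 call is estimated as taken 61% of the
time, and the inverted ne branch to the bar1 call 39% of the time.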
I would expect this to then be corrected, if necessary, according to the
processor's static branch prediction strategy, but that clearly doesn't happen
for powerpc64 or, apparently, x86_64.