https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79534
--- Comment #8 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> ---
Before Honza's patch, corrupt profile information leads to a branch being
marked as 100% taken. After Honza's patch, the branch is instead seen as
95.6% taken:
(jump_insn 1916 1915 1922 309 (set (pc)
        (if_then_else (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (label_ref 1905)
            (pc))) "foo.cpp":59 9 {condjump}
     (expr_list:REG_DEAD (reg:CC 66 cc)
        (int_list:REG_BR_PROB 10000 (nil)))
 -> 1905)
;;  succ:       227 [95.6%]
;;              226 [4.4%]  (FALLTHRU)
That's enough for GCC to consider the branch unpredictable, which in turn
causes GCC to use the "unpredictable" number for BRANCH_COST when setting the
maximum cost for the transform. When tuning for Cortex-A57, BRANCH_COST is 1
for predictable branches (not high enough to trigger the transform) and 3 for
unpredictable branches (high enough to trigger the transform). That explains
why we don't see the performance difference for -mcpu=generic, where
BRANCH_COST always returns 2, which is always high enough to trigger this
if-conversion.
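To make the interaction concrete, here is a small model of the decision as I
understand it. This is a sketch, not GCC source: the struct, the function
names and the threshold constant are all hypothetical. The threshold of 2 is
only inferred from the numbers above (1 does not trigger the transform, 2 and
3 do).

```cpp
// Hypothetical model of the BRANCH_COST / if-conversion interaction
// described in this comment. Names and threshold are illustrative only.
struct Tuning {
  int predictable_cost;    // BRANCH_COST for predictable branches
  int unpredictable_cost;  // BRANCH_COST for unpredictable branches
};

// Values quoted in this comment.
constexpr Tuning cortex_a57{1, 3};
constexpr Tuning generic{2, 2};

constexpr int branch_cost(const Tuning &t, bool predictable) {
  return predictable ? t.predictable_cost : t.unpredictable_cost;
}

// Assumed cutoff: the smallest branch cost at which this particular
// if-conversion fires, inferred from the behaviour observed above.
constexpr int kMinCostToConvert = 2;

constexpr bool if_converts(const Tuning &t, bool predictable) {
  return branch_cost(t, predictable) >= kMinCostToConvert;
}

// Cortex-A57 flips with predictability; generic never does.
static_assert(!if_converts(cortex_a57, true), "predictable: no transform");
static_assert(if_converts(cortex_a57, false), "unpredictable: transform");
static_assert(if_converts(generic, true) && if_converts(generic, false),
              "generic always transforms");
```

Under this model the only configuration whose outcome depends on the
predictability flag is Cortex-A57, which matches where the regression shows
up.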
The cost model looks reasonable; this is clearly a borderline case for the
heuristic. The only thing I found surprising in my analysis of this regression
is that GCC considers a 95.6% taken branch as unpredictable.
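For illustration, a predictability test of the following shape would produce
exactly this result. This is a sketch under assumptions, not the actual
implementation: the function name is hypothetical, the probability scale of
10000 is taken from the REG_BR_PROB note in the dump, and the 2% cutoff is my
assumption about the default threshold (I believe it is controlled by a
param, but treat the exact value as unverified).

```cpp
// Hypothetical sketch of an edge-predictability check.
constexpr int kProbBase = 10000;                // scale used by REG_BR_PROB
constexpr int kPredictableOutcomePercent = 2;   // ASSUMED cutoff, in percent

// An edge counts as predictable only if it is almost never taken or
// almost always taken, i.e. within the cutoff of 0% or 100%.
constexpr bool edge_predictable(int prob) {
  return prob <= kPredictableOutcomePercent * kProbBase / 100
      || kProbBase - prob <= kPredictableOutcomePercent * kProbBase / 100;
}

static_assert(edge_predictable(10000), "100% taken is predictable");
static_assert(!edge_predictable(9560), "95.6% taken is not, under a 2% cutoff");
```

With a cutoff that tight, 95.6% falls outside the "predictable" band, so the
corrected profile moves the branch from one BRANCH_COST bucket to the other.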
I'm not sure what the correct course for fixing this is - nothing in the
compiler seems to be broken, we're just on an unlucky side of the static
prediction engine and the ifcvt heuristics.