[Bug tree-optimization/79534] [7 Regression] tree-ifcombine aarch64 performance regression with trunk@245151

jgreenhalgh at gcc dot gnu.org Wed, 19 Apr 2017 11:22:27 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79534


--- Comment #8 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> ---
Before Honza's patch, corrupt profile information led to a branch being
marked as 100% taken. After Honza's patch, the branch is instead seen as
95.6% taken:

(jump_insn 1916 1915 1922 309 (set (pc)
        (if_then_else (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (label_ref 1905)
            (pc))) "foo.cpp":59 9 {condjump}
     (expr_list:REG_DEAD (reg:CC 66 cc)
        (int_list:REG_BR_PROB 10000 (nil)))
 -> 1905)
;;  succ:       227 [95.6%] 
;;              226 [4.4%]  (FALLTHRU)

That's enough for GCC to consider the branch unpredictable, which in turn
causes GCC to use the "unpredictable" number for BRANCH_COST when setting the
maximum cost it will accept for the transform. When tuning for Cortex-A57,
BRANCH_COST is 1 for predictable branches (not high enough to trigger the
transform) and 3 for unpredictable branches (high enough to trigger the
transform). That explains why we don't see the performance difference for
-mcpu=generic, where BRANCH_COST always returns 2, which is always high enough
to trigger this if-conversion.
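
The interaction between the two tunings can be sketched as a small model. This
is only an illustration, not GCC's code: the struct, the function names, and
the threshold of 2 are hypothetical, with the threshold inferred solely from
the observation above that a cost of 1 does not trigger the transform while
costs of 2 and 3 do.

```c
#include <stdbool.h>

/* Hypothetical container for per-tuning branch costs (not a GCC type). */
struct branch_costs {
    int predictable;    /* cost used when the branch is seen as predictable */
    int unpredictable;  /* cost used when it is seen as unpredictable */
};

/* Cost values as quoted in the comment above. */
static const struct branch_costs cortex_a57_costs = { 1, 3 };
static const struct branch_costs generic_costs = { 2, 2 };

/* Assumed threshold: 1 is "not high enough", 2 and 3 are "high enough". */
enum { MIN_COST_FOR_TRANSFORM = 2 };

/* Select the cost for a branch, mirroring the role BRANCH_COST plays here. */
static int branch_cost(const struct branch_costs *costs, bool predictable)
{
    return predictable ? costs->predictable : costs->unpredictable;
}

/* True if, under this model, the if-conversion would be attempted. */
static bool transform_triggers(const struct branch_costs *costs,
                               bool predictable)
{
    return branch_cost(costs, predictable) >= MIN_COST_FOR_TRANSFORM;
}
```

Under these assumptions, only the Cortex-A57 tuning changes behaviour when the
branch flips from predictable to unpredictable; the generic tuning triggers
the transform either way.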

The cost model looks reasonable; this is clearly a borderline case for the
heuristic. The only thing I found surprising in my analysis of this regression
is that GCC considers a 95.6% taken branch to be unpredictable.
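
One plausible explanation is a narrow predictability window. The sketch below
assumes a branch counts as predictable only when one outcome has at most a
2% estimated probability, which I believe matches the default of GCC's
predictable-branch-outcome parameter, but treat that value as an assumption.
The function name is hypothetical; probabilities are scaled to 10000, as in
the REG_BR_PROB note in the dump above.

```c
#include <stdbool.h>

/* Probabilities are expressed out of 10000, as in the RTL dump. */
enum { REG_BR_PROB_BASE = 10000 };

/* Assumed cutoff (percent) for the less likely outcome of a branch. */
enum { PREDICTABLE_OUTCOME_PERCENT = 2 };

/* Hypothetical model: predictable if either outcome is within the cutoff. */
static bool edge_looks_predictable(int taken_prob)
{
    int limit = PREDICTABLE_OUTCOME_PERCENT * REG_BR_PROB_BASE / 100;

    return taken_prob <= limit
           || REG_BR_PROB_BASE - taken_prob <= limit;
}
```

With that cutoff, a probability of 10000 (100% taken) is classed as
predictable, while 9560 (95.6% taken) leaves 4.4% for the other outcome and so
falls outside the window.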

I'm not sure what the correct course for fixing this is. Nothing in the
compiler seems to be broken; we're just on the unlucky side of the static
prediction engine and the ifcvt heuristics.
