https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69564
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
Would be still good to get rid of.  Not sure why we key off in_gimple_form
either.

What we get is branches done in different directions and thus BB reorder
entered with a different BB order.  Edge frequencies seem to match but
BB frequencies are off, somehow differently scaled.

-;; basic block 21, loop depth 2, count 0, freq 655, maybe hot
-;; Invalid sum of incoming frequencies 985, should be 655
+;; basic block 21, loop depth 2, count 0, freq 3, maybe hot
...
 ;; basic block 23, loop depth 2, count 0, freq 720, maybe hot
+;; basic block 23, loop depth 2, count 0, freq 2, maybe hot

etc. (- is C, + is C++)

very few edge frequency differences, but those on backedges for example:

 ;; pred: 25 [100.0%] (FALLTHRU,EXECUTABLE)
-;; 26 [91.0%] (TRUE_VALUE,EXECUTABLE)
- # ivtmp.72_13 = PHI <0(25), ivtmp.72_14(26)>
- # ivtmp.75_226 = PHI <0(25), ivtmp.75_192(26)>
...
+;; 26 [80.0%] (FALSE_VALUE,EXECUTABLE)
+ # ivtmp.73_13 = PHI <0(25), ivtmp.73_14(26)>
+ # ivtmp.76_226 = PHI <0(25), ivtmp.76_192(26)>

it's already that at profile-estimate time (for LU_factor) which also has
2 more basic blocks and 3 more edges for C++ (with the folding thing "fixed").

Didn't check performance with the folding fixed.