Hi, this patch fixes false incosistent profile error message seen when building SPEC with -fprofile-use -fdump-ipa-profile. The problem is that with dumping tree_esitmate_probability is run in dry run mode to report success rates of heuristics. It however runs determine_unlikely_bbs which ovewrites some counts to profile_count::zero and later value profiling sees the mismatch.
In sane profile determine_unlikely_bbs should be almost always no-op since it should only drop to 0 things that are known to be unlikely executed. What happens here is that there is a comdat where profile is lost and we see a call with non-zero count calling function with zero count and "fix" the profile by making the call to have zero count, too. I also extended unlikely prediates to avoid tampering with predictions when prediciton is believed to be reliable. This also avoids us from dropping all EH regions to 0 count as tested by the testcase. Jeff, I think we can use same logic as in isolate-paths and detect statements which will trigger undefined effect or errornerous termination in unlikely_executed_stmt_p. Are you OK with exporting stmt_uses_0_or_null_in_undefined_way and using it in unlikely_executed_stmt_p? Do we have other ways to detect that given STMT will very likely not be executed? Bootsrapped/regtested x86_64-linux, will commit it shortly. gcc/ChangeLog: * predict.cc (unlikely_executed_edge_p): Ignore EDGE_EH if profile is reliable. (unlikely_executed_stmt_p): special case builtin_trap/unreachable and ignore other heuristics for reliable profiles. (tree_estimate_probability): Disable unlikely bb detection when doing dry run gcc/testsuite/ChangeLog: * g++.dg/tree-prof/eh1.C: New test. diff --git a/gcc/predict.cc b/gcc/predict.cc index ef31c48bfe2..eefff5908f9 100644 --- a/gcc/predict.cc +++ b/gcc/predict.cc @@ -245,7 +245,10 @@ unlikely_executed_edge_p (edge e) { return (e->src->count == profile_count::zero () || e->probability == profile_probability::never ()) - || (e->flags & (EDGE_EH | EDGE_FAKE)); + || (e->flags & EDGE_FAKE) + /* If we read profile and know EH edge is executed, trust it. + Otherwise we consider EH edges never executed. */ + || ((e->flags & EDGE_EH) && !e->probability.reliable_p ()); } /* Return true if edge E of function FUN is probably never executed. */ @@ -830,6 +833,26 @@ unlikely_executed_stmt_p (gimple *stmt) { if (!is_gimple_call (stmt)) return false; + + /* Those calls are inserted by optimizers when code is known to be + unreachable or undefined. */ + if (gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE) + || gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE_TRAP) + || gimple_call_builtin_p (stmt, BUILT_IN_TRAP)) + return false; + + /* Checks below do not need to be fully reliable. Cold attribute may be + misplaced by user and in the presence of comdat we may result in call to + function with 0 profile having non-zero profile. + + We later detect that profile is lost and will drop the profile of the + comdat. + + So if we think profile count is reliable, do not try to apply these + heuristics. */ + if (gimple_bb (stmt)->count.reliable_p () + && gimple_bb (stmt)->count.nonzero_p ()) + return gimple_bb (stmt)->count == profile_count::zero (); /* NORETURN attribute alone is not strong enough: exit() may be quite likely executed once during program run. */ if (gimple_call_fntype (stmt) @@ -3269,7 +3292,8 @@ tree_estimate_probability (bool dry_run) calculate_dominance_info (CDI_POST_DOMINATORS); /* Decide which edges are known to be unlikely. This improves later branch prediction. */ - determine_unlikely_bbs (); + if (!dry_run) + determine_unlikely_bbs (); bb_predictions = new hash_map<const_basic_block, edge_prediction *>; ssa_expected_value = new hash_map<int_hash<unsigned, 0>, expected_value>; diff --git a/gcc/testsuite/g++.dg/tree-prof/eh1.C b/gcc/testsuite/g++.dg/tree-prof/eh1.C new file mode 100644 index 00000000000..10a35968dc4 --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-prof/eh1.C @@ -0,0 +1,34 @@ +/* { dg-options "-O3 -fdump-ipa-profile-details -fno-inline -fdump-tree-fixup_cfg3-details -fdump-tree-optimized-details" } */ +char a[10000]; +char b[10000]; +int sz = 1000; + +__attribute__((noipa)) + void test2 () +{ + throw (sz); +} +void +test () +{ + try + { + test2 (); + } + catch (int v) + { + __builtin_memcpy (b, a, v); + } +} +int +main () +{ + for (int i = 0; i < 100000; i++) + test (); +} +/* { dg-final-use-not-autofdo { scan-ipa-dump-times "Average value sum:100000000" 2 "profile" } } */ +/* 1 zero count for resx block. */ +/* { dg-final-use-not-autofdo { scan-tree-dump-times "count: 0" 1 "fixup_cfg3" } } */ +/* 2 zero count for resx block and return block since return gets duplicated by tracer. */ +/* { dg-final-use-not-autofdo { scan-tree-dump-times "count: 0" 2 "optimized" } } */ +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Average value sum:100000000" 1 "optimized" } } */