Hi,
On 21 June 2017 at 14:50, Martin Liška <mli...@suse.cz> wrote: > On 06/21/2017 10:26 AM, Jan Hubicka wrote: >>> >>> Ok, so I fixed that in the described way. There's one remaining fallout of: >>> gcc/testsuite/gcc.dg/tree-ssa/ipa-split-5.c >>> >>> Where a fnsplit is properly done, but then it's again inlined: >>> >>> Considering split_me.part.0/5 with 23 size >>> to be inlined into test/2 in unknown:0 >>> Estimated badness is -0.000001, frequency 0.33. >>> Inlined split_me.part.0 into test which now has time 50.300000 and size >>> 44, net change of +17. >>> >>> Considering split_me.part.0/5 with 23 size >>> to be inlined into test/2 in unknown:0 >>> Estimated badness is -0.000001, frequency 0.33. >>> Inlined split_me.part.0 into test which now has time 70.760000 and size >>> 61, net change of +17. >>> >>> Considering split_me.part.0/5 with 23 size >>> to be inlined into test/2 in unknown:0 >>> Estimated badness is -0.000001, frequency 0.33. >>> Inlined split_me.part.0 into test which now has time 91.220000 and size >>> 78, net change of +17. >>> >>> Considering split_me.part.0/5 with 23 size >>> to be inlined into test/2 in unknown:0 >>> Estimated badness is -0.000001, frequency 0.33. >>> Inlined split_me.part.0 into test which now has time 111.680000 and size >>> 95, net change of +17. >>> Unit growth for small function inlining: 61->129 (111%) >>> >>> ... >>> >>> Any hint how to block the IPA inlining? >> >> I guess you only want to make cold part of split_me bigger or >> just use --param to reduce growth for auto inlining. >> >> How the patch reduces split_me_part considerably? >> Honza > > Well, I probably overlooked test results, test works fine. > > I'm going to install the patch. > Since this commit (r249450), I have noticed a regression: FAIL: gcc.dg/tree-ssa/ipa-split-5.c scan-tree-dump optimized "part" on aarch64/arm Christophe > Martin > >>> >>> Sending new version of patch. >>> Martin >>> >>>> >>>> I would just move pass_strip_predict_hints pre-IPA and not worry about >>>> them chaining. >>>> >>>> There is problem that after inlining the prediction may expand its scope >>>> and predict branch that it outside of the original function body, >>>> but I do not see very easy solution for that besides not re-doing >>>> prediction (we could also copy probabilities from the inlined function >>>> when they exists and honnor them in the outer function. I am not sure >>>> that is going to improve prediction quality though - extra context >>>> is probably useful) >>>> >>>> Thanks, >>>> Honza >>>>> >>>>> Thanks, >>>>> Martin >>>>> >>>>>> >>>>>> Where did you found this case? >>>>>> Honza >>>>>>> >>>>>>> /* Create a new deep copy of the statement. */ >>>>>>> copy = gimple_copy (stmt); >>>>>>> -- >>>>>>> 2.13.0 >>>>>>> >>> >> >>> >From 84625a782add6ae2ed29630815b61b34a052770a Mon Sep 17 00:00:00 2001 >>> From: marxin <mli...@suse.cz> >>> Date: Tue, 6 Jun 2017 10:55:18 +0200 >>> Subject: [PATCH 1/2] Make early return predictor more precise. >>> >>> gcc/ChangeLog: >>> >>> 2017-05-26 Martin Liska <mli...@suse.cz> >>> >>> PR tree-optimization/79489 >>> * gimplify.c (maybe_add_early_return_predict_stmt): New >>> function. >>> (gimplify_return_expr): Call the function. >>> * predict.c (tree_estimate_probability_bb): Remove handling >>> of early return. >>> * predict.def: Update comment about early return predictor. >>> * gimple-predict.h (is_gimple_predict): New function. >>> * predict.def: Change default value of early return to 66. >>> * tree-tailcall.c (find_tail_calls): Skip GIMPLE_PREDICT >>> statements. >>> * passes.def: Put pass_strip_predict_hints to the beginning of >>> IPA passes. >>> --- >>> gcc/gimple-low.c | 2 ++ >>> gcc/gimple-predict.h | 8 ++++++++ >>> gcc/gimplify.c | 16 ++++++++++++++++ >>> gcc/passes.def | 1 + >>> gcc/predict.c | 41 ----------------------------------------- >>> gcc/predict.def | 15 +++------------ >>> gcc/tree-tailcall.c | 2 ++ >>> 7 files changed, 32 insertions(+), 53 deletions(-) >>> >>> diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c >>> index 619b9d7bfb1..4ea6c3532f3 100644 >>> --- a/gcc/gimple-low.c >>> +++ b/gcc/gimple-low.c >>> @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3. If not see >>> #include "calls.h" >>> #include "gimple-iterator.h" >>> #include "gimple-low.h" >>> +#include "predict.h" >>> +#include "gimple-predict.h" >>> >>> /* The differences between High GIMPLE and Low GIMPLE are the >>> following: >>> diff --git a/gcc/gimple-predict.h b/gcc/gimple-predict.h >>> index ba58e12e9e9..0e6c2e1ea01 100644 >>> --- a/gcc/gimple-predict.h >>> +++ b/gcc/gimple-predict.h >>> @@ -80,4 +80,12 @@ gimple_build_predict (enum br_predictor predictor, enum >>> prediction outcome) >>> return p; >>> } >>> >>> +/* Return true if GS is a GIMPLE_PREDICT statement. */ >>> + >>> +static inline bool >>> +is_gimple_predict (const gimple *gs) >>> +{ >>> + return gimple_code (gs) == GIMPLE_PREDICT; >>> +} >>> + >>> #endif /* GCC_GIMPLE_PREDICT_H */ >>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c >>> index 9af95a28704..1c6e1591953 100644 >>> --- a/gcc/gimplify.c >>> +++ b/gcc/gimplify.c >>> @@ -1428,6 +1428,20 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p) >>> return GS_ALL_DONE; >>> } >>> >>> +/* Maybe add early return predict statement to PRE_P sequence. */ >>> + >>> +static void >>> +maybe_add_early_return_predict_stmt (gimple_seq *pre_p) >>> +{ >>> + /* If we are not in a conditional context, add PREDICT statement. */ >>> + if (gimple_conditional_context ()) >>> + { >>> + gimple *predict = gimple_build_predict (PRED_TREE_EARLY_RETURN, >>> + NOT_TAKEN); >>> + gimplify_seq_add_stmt (pre_p, predict); >>> + } >>> +} >>> + >>> /* Gimplify a RETURN_EXPR. If the expression to be returned is not a >>> GIMPLE value, it is assigned to a new temporary and the statement is >>> re-written to return the temporary. >>> @@ -1458,6 +1472,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p) >>> || TREE_CODE (ret_expr) == RESULT_DECL >>> || ret_expr == error_mark_node) >>> { >>> + maybe_add_early_return_predict_stmt (pre_p); >>> greturn *ret = gimple_build_return (ret_expr); >>> gimple_set_no_warning (ret, TREE_NO_WARNING (stmt)); >>> gimplify_seq_add_stmt (pre_p, ret); >>> @@ -1525,6 +1540,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p) >>> >>> gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p); >>> >>> + maybe_add_early_return_predict_stmt (pre_p); >>> ret = gimple_build_return (result); >>> gimple_set_no_warning (ret, TREE_NO_WARNING (stmt)); >>> gimplify_seq_add_stmt (pre_p, ret); >>> diff --git a/gcc/passes.def b/gcc/passes.def >>> index c14f6b9f63c..316e19d12e3 100644 >>> --- a/gcc/passes.def >>> +++ b/gcc/passes.def >>> @@ -107,6 +107,7 @@ along with GCC; see the file COPYING3. If not see >>> early optimizations again. It is thus good idea to do this >>> late. */ >>> NEXT_PASS (pass_split_functions); >>> + NEXT_PASS (pass_strip_predict_hints); >>> POP_INSERT_PASSES () >>> NEXT_PASS (pass_release_ssa_names); >>> NEXT_PASS (pass_rebuild_cgraph_edges); >>> diff --git a/gcc/predict.c b/gcc/predict.c >>> index 60d1a094d10..790be9fbf16 100644 >>> --- a/gcc/predict.c >>> +++ b/gcc/predict.c >>> @@ -2739,7 +2739,6 @@ tree_estimate_probability_bb (basic_block bb, bool >>> local_only) >>> { >>> edge e; >>> edge_iterator ei; >>> - gimple *last; >>> >>> FOR_EACH_EDGE (e, ei, bb->succs) >>> { >>> @@ -2766,46 +2765,6 @@ tree_estimate_probability_bb (basic_block bb, bool >>> local_only) >>> } >>> } >>> >>> - /* Predict early returns to be probable, as we've already taken >>> - care for error returns and other cases are often used for >>> - fast paths through function. >>> - >>> - Since we've already removed the return statements, we are >>> - looking for CFG like: >>> - >>> - if (conditional) >>> - { >>> - .. >>> - goto return_block >>> - } >>> - some other blocks >>> - return_block: >>> - return_stmt. */ >>> - if (e->dest != bb->next_bb >>> - && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) >>> - && single_succ_p (e->dest) >>> - && single_succ_edge (e->dest)->dest == EXIT_BLOCK_PTR_FOR_FN (cfun) >>> - && (last = last_stmt (e->dest)) != NULL >>> - && gimple_code (last) == GIMPLE_RETURN) >>> - { >>> - edge e1; >>> - edge_iterator ei1; >>> - >>> - if (single_succ_p (bb)) >>> - { >>> - FOR_EACH_EDGE (e1, ei1, bb->preds) >>> - if (!predicted_by_p (e1->src, PRED_NULL_RETURN) >>> - && !predicted_by_p (e1->src, PRED_CONST_RETURN) >>> - && !predicted_by_p (e1->src, PRED_NEGATIVE_RETURN)) >>> - predict_edge_def (e1, PRED_TREE_EARLY_RETURN, NOT_TAKEN); >>> - } >>> - else >>> - if (!predicted_by_p (e->src, PRED_NULL_RETURN) >>> - && !predicted_by_p (e->src, PRED_CONST_RETURN) >>> - && !predicted_by_p (e->src, PRED_NEGATIVE_RETURN)) >>> - predict_edge_def (e, PRED_TREE_EARLY_RETURN, NOT_TAKEN); >>> - } >>> - >>> /* Look for block we are guarding (ie we dominate it, >>> but it doesn't postdominate us). */ >>> if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && e->dest != bb >>> diff --git a/gcc/predict.def b/gcc/predict.def >>> index fcda6c48f11..f7b2bf7738c 100644 >>> --- a/gcc/predict.def >>> +++ b/gcc/predict.def >>> @@ -128,18 +128,9 @@ DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic >>> call", HITRATE (59), 0) >>> indefinitely. */ >>> DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0) >>> >>> -/* Branch causing function to terminate is probably not taken. >>> - FIXME: early return currently predicts code: >>> - int foo (int a) >>> - { >>> - if (a) >>> - bar(); >>> - else >>> - bar2(); >>> - } >>> - even though there is no return statement involved. We probably want to >>> track >>> - this from FE or retire the predictor. */ >>> -DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE >>> (54), 0) >>> +/* Branch causing function to terminate is probably not taken. */ >>> +DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE >>> (66), >>> + 0) >>> >>> /* Branch containing goto is probably not taken. >>> FIXME: Currently not used. */ >>> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c >>> index b7053387e91..6aa9a56462e 100644 >>> --- a/gcc/tree-tailcall.c >>> +++ b/gcc/tree-tailcall.c >>> @@ -421,6 +421,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret) >>> if (gimple_code (stmt) == GIMPLE_LABEL >>> || gimple_code (stmt) == GIMPLE_RETURN >>> || gimple_code (stmt) == GIMPLE_NOP >>> + || gimple_code (stmt) == GIMPLE_PREDICT >>> || gimple_clobber_p (stmt) >>> || is_gimple_debug (stmt)) >>> continue; >>> @@ -555,6 +556,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret) >>> >>> if (gimple_code (stmt) == GIMPLE_LABEL >>> || gimple_code (stmt) == GIMPLE_NOP >>> + || gimple_code (stmt) == GIMPLE_PREDICT >>> || gimple_clobber_p (stmt) >>> || is_gimple_debug (stmt)) >>> continue; >>> -- >>> 2.13.1 >>> >> >