Hi, Michael, Since I was involved in other tasks, it is a little bit late to reply you. Sorry for that. I composed a new one with your suggestions. Please review that when you are in convenience.
> Generally I do like the idea of the transformation, and the basic building > blocks seem to be sound. But I dislike it being a separate pass, so > please integrate the code you have written into the existing loop split > pass. Most building blocks can be used as is, except the main driver. This new transformation was integrated into the pass of original loop split. >> +@item max-cond-loop-split-insns >> +The maximum number of insns to be increased due to loop split on >> +semi-invariant condition statement. > "to be increased" --> "to be generated" (or "added") Done. >> +@item min-cond-loop-split-prob >> +The minimum threshold for probability of semi-invaraint condition >> +statement to trigger loop split. > typo, semi-invariant Done. > I think somewhere in the docs your definition of semi-invariant needs > to be stated in some form (can be short, doesn't need to reproduce the > diagram or such), so don't just replicate the short info from the > params.def file. Done. >> +DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB, >> + "min-cond-loop-split-prob", >> + "The minimum threshold for probability of semi-invaraint condition " >> + "statement to trigger loop split.", > Same typo: "semi-invariant". Done. >> -/* This file implements loop splitting, i.e. transformation of loops like >> +/* This file implements two kind of loop splitting. > kind_s_, plural Done. >> +/* Another transformation of loops like: >> + >> + for (i = INIT (); CHECK (i); i = NEXT ()) >> + { >> + if (expr (a_1, a_2, ..., a_n)) >> + a_j = ...; // change at least one a_j >> + else >> + S; // not change any a_j >> + } > You should mention that 'expr' needs to be pure, i.e. once it > becomes false and the inputs don't change, that it remains false. Done. >> +static bool >> +branch_removable_p (basic_block branch_bb) >> +{ >> + if (single_pred_p (branch_bb)) >> + return true; >> + >> + edge e; >> + edge_iterator ei; >> + >> + FOR_EACH_EDGE (e, ei, branch_bb->preds) >> + { >> + if (dominated_by_p (CDI_DOMINATORS, e->src, branch_bb)) >> + continue; >> + >> + if (dominated_by_p (CDI_DOMINATORS, branch_bb, e->src)) >> + continue; > My gut feeling is surprised by this. So one of the predecessors of > branch_bb dominates it. Why should that indicate that branch_bb > can be safely removed? > > Think about something like this: > > esrc --> cond_bb --> branch_bb > '-------------------^ If all predecessors of branch_bb dominate it, these predecessors should also be in dominating relationship among them, and the conditional statement must be branch_bb's immediate dominator, and branch_bb is removable. In your example. For "esrc", loop is continued, nothing is impacted. But in the next iteration, we encounter "cond_bb", it does not dominate "branch_bb", so the function return false in the following return statement. > (cond_bb is the callers bb of the cond statement in question). Now esrc > dominates branch_bb but still you can't simply remove it, even if > the cond_bb->branch_bb edge becomes unexecutable. >> +static int >> +get_cond_invariant_branch (struct loop *loop, gcond *cond) > Please return an edge here, not an edge index (which removes the using of > -1). I think the name (and comment) should consistently talk about > semi-invariant, not invariant. For when you really need an edge index > later, you could use "EDGE_SUCC(bb, 0) != edge". But you probably don't > really need it, e.g. instead of using the gimple pass-local-flag on a > statement you can just as well also store the edge in your info structure. Done. >> +static bool >> +is_cond_in_hidden_loop (const struct loop *loop, basic_block cond_bb, >> + int branch) > With above change in get_cond_invariant_branch, this also should > take an edge, not a bb+edge-index. Done. >> +static int >> +compute_added_num_insns (struct loop *loop, const_basic_block cond_bb, >> + int branch) > This should take an edge as well. Done. >> + for (unsigned i = 0; i < loop->num_nodes; i++) >> + { >> + /* Do no count basic blocks only in opposite branch. */ >> + if (dominated_by_p (CDI_DOMINATORS, bbs[i], targ_bb_var)) >> + continue; >> + >> + for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p >> (gsi); >> + gsi_next (&gsi)) >> + num += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights); > Replace the loop by > estimate_num_insn_seq (bb_seq (bbs[i]), &eni_size_weights); Done. >> + /* Add a threshold for increased code size to disable loop split. */ >> + if (compute_added_num_insns (loop, cond_bb, branch) > >> + PARAM_VALUE (PARAM_MAX_COND_LOOP_SPLIT_INSNS)) > Operator should be first on next line, not last on previous line. Done. >> + /* Skip conditional statement that is inside a nested unrecognized loop. >> */ >> + if (is_cond_in_hidden_loop (loop, cond_bb, branch)) >> + return false; > This part (as well as is_cond_in_hidden_loop implementation) had me > confused for a while, because "unrecognized" loops seems strange. I think > I now know what you try to do here, but I wonder if there's an easier way, > or at least about which situations you stumbled into that made you write > this code. Use BB_IRREDUCIBLE_LOOP flag to check that, for tree-loop-init pass requires LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS which marks irreducible loops. >> + >> + /* Temporarily keep branch index in conditional statement. */ >> + gimple_set_plf (cond, GF_PLF_1, branch); > i.e. here, store the edge in your info structure. Done. >> +/* Main entry point to perform loop splitting for suitable if-conditions >> + in all loops. */ >> + >> +static unsigned int >> +tree_ssa_split_loops_for_cond (void) > So, from here on the code should be integrated into the existing code > of the file (which might need changes as well for this integration to look > good). Done. Thanks, Feng