On Tue, Oct 21, 2014 at 3:20 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: > Richard, > > Yes, This patch does not make sense since phi node predication for bb > with critical incoming edges only performs another function which is > absent (predicate_extended_scalar_phi). > > BTW I see that commit_edge_insertions() is used for rtx instructions > only but you propose to use it for tree also. > Did I miss something?
Ah, it's gsi_commit_edge_inserts () (or gsi_commit_one_edge_insert if you want easy access to the newly created basic block to push the predicate to - see gsi_commit_edge_inserts implementation). Richard. > Thanks ahead. > > > 2014-10-21 16:44 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >> On Tue, Oct 21, 2014 at 2:25 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Richard, >>> >>> I did some changes in patch and ChangeLog to mark that support for >>> if-convert of blocks with only critical incoming edges will be added >>> in the future (more precise in patch.4). >> >> But the same reasoning applies to this version of the patch when >> flag_force_vectorize is true!? (insertion point and invalid SSA form) >> >> Which means the patch doesn't make sense in isolation? >> >> Btw, I think for the case you should simply do gsi_insert_on_edge () >> and commit_edge_insertions () before the call to combine_blocks >> (pushing the edge predicate to the newly created block). >> >> Richard. >> >>> Could you please review it. >>> >>> Thanks. >>> >>> ChangeLog: >>> >>> 2014-10-21 Yuri Rumyantsev <ysrum...@gmail.com> >>> >>> (flag_force_vectorize): New variable. >>> (edge_predicate): New function. >>> (set_edge_predicate): New function. >>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list >>> if destination block of edge is not always executed. Set-up predicate >>> for critical edge. >>> (if_convertible_phi_p): Accept phi nodes with more than two args >>> if FLAG_FORCE_VECTORIZE was set-up. >>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>> (if_convertible_stmt_p): Fix up pre-function comments. >>> (all_preds_critical_p): New function. >>> (if_convertible_bb_p): Use call of all_preds_critical_p >>> to reject temporarily block if-conversion with incoming critical edges >>> if FLAG_FORCE_VECTORIZE was not set-up. This restriction will be deleted >>> after adding support for extended predication. >>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>> to compute predicate instead of fold_build2_loc. >>> Add zeroing of edge 'aux' field. >>> (find_phi_replacement_condition): Extend function interface: >>> it returns NULL if given phi node must be handled by means of >>> extended phi node predication. If number of predecessors of phi-block >>> is equal 2 and at least one incoming edge is not critical original >>> algorithm is used. >>> (tree_if_conversion): Temporarily set-up FLAG_FORCE_VECTORIZE to false. >>> Nullify 'aux' field of edges for blocks with two successors. >>> >>> 2014-10-20 17:55 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>> Richard, >>>> >>>> Thanks for your answer! >>>> >>>> In current implementation phi node conversion assume that one of >>>> incoming edge to bb containing given phi has at least one non-critical >>>> edge and choose it to insert predicated code. But if we choose >>>> critical edge we need to determine insert point and insertion >>>> direction (before/after) since in other case we can get invalid ssa >>>> form (use before def). This is done by my new function which is not in >>>> current patch ( I will present this patch later). SO I assume that we >>>> need to leave this patch as it is to not introduce new bugs. >>>> >>>> Thanks. >>>> Yuri. >>>> >>>> 2014-10-20 12:00 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>> On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>> wrote: >>>>>> Richard, >>>>>> >>>>>> I reworked the patch as you proposed, but I didn't understand what >>>>>> did you mean by: >>>>>> >>>>>>>So please rework the patch so critical edges are always handled >>>>>>>correctly. >>>>>> >>>>>> In current patch flag_force_vectorize is used (1) to reject phi nodes >>>>>> with more than 2 arguments; (2) to reject basic blocks with only >>>>>> critical incoming edges since support for extended predication of phi >>>>>> nodes will be in next patch. >>>>> >>>>> I mean that (2) should not be rejected dependent on flag_force_vectorize. >>>>> It was rejected because if-cvt couldn't handle it correctly before but >>>>> with >>>>> this patch this is fixed. I see no reason to still reject this then even >>>>> for !flag_force_vectorize. >>>>> >>>>> Rejecting PHIs with more than two arguments with flag_force_vectorize >>>>> is ok. >>>>> >>>>> Richard. >>>>> >>>>>> Could you please clarify your statement. >>>>>> >>>>>> I attached modified patch. >>>>>> >>>>>> ChangeLog: >>>>>> >>>>>> 2014-10-17 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> >>>>>> (flag_force_vectorize): New variable. >>>>>> (edge_predicate): New function. >>>>>> (set_edge_predicate): New function. >>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list >>>>>> if destination block of edge is not always executed. Set-up predicate >>>>>> for critical edge. >>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>> (all_edges_are_critical): New function. >>>>>> (if_convertible_bb_p): Use call of all_preds_critical_p >>>>>> to reject block if-conversion with incoming critical edges only if >>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>> to compute predicate instead of fold_build2_loc. >>>>>> Add zeroing of edge 'aux' field. >>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>> it returns NULL if given phi node must be handled by means of >>>>>> extended phi node predication. If number of predecessors of phi-block >>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>> algorithm is used. >>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false. >>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> 2014-10-17 13:09 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> wrote: >>>>>>>> Richard, >>>>>>>> >>>>>>>> Here is reduced patch as you requested. All your remarks have been >>>>>>>> fixed. >>>>>>>> Could you please look at it ( I have already sent the patch with >>>>>>>> changes in add_to_predicate_list for review). >>>>>>> >>>>>>> + if (dump_file && (dump_flags & TDF_DETAILS)) >>>>>>> + fprintf (dump_file, "More than two phi node args.\n"); >>>>>>> + return false; >>>>>>> + } >>>>>>> + >>>>>>> + } >>>>>>> >>>>>>> Excess vertical space. >>>>>>> >>>>>>> >>>>>>> +/* Assumes that BB has more than 2 predecessors. >>>>>>> >>>>>>> More than 1 predecessor? >>>>>>> >>>>>>> + Returns false if at least one successor is not on critical edge >>>>>>> + and true otherwise. */ >>>>>>> + >>>>>>> +static inline bool >>>>>>> +all_edges_are_critical (basic_block bb) >>>>>>> +{ >>>>>>> >>>>>>> "all_preds_critical_p" would be a better name >>>>>>> >>>>>>> + if (EDGE_COUNT (bb->preds) > 2) >>>>>>> + { >>>>>>> + if (!flag_force_vectorize) >>>>>>> + return false; >>>>>>> + } >>>>>>> >>>>>>> as I said in the last review I don't think we should restrict edge >>>>>>> predicates to flag_force_vectorize. At least I can't see how >>>>>>> if-conversion is magically more expensive for that case? >>>>>>> >>>>>>> So please rework the patch so critical edges are always handled >>>>>>> correctly. >>>>>>> >>>>>>> Ok with that and the above suggested changes. >>>>>>> >>>>>>> Thanks, >>>>>>> Richard. >>>>>>> >>>>>>> >>>>>>>> Thanks. >>>>>>>> Yuri. >>>>>>>> ChangeLog >>>>>>>> 2014-10-16 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>> >>>>>>>> (flag_force_vectorize): New variable. >>>>>>>> (edge_predicate): New function. >>>>>>>> (set_edge_predicate): New function. >>>>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list >>>>>>>> if destination block of edge is not always executed. Set-up predicate >>>>>>>> for critical edge. >>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>> (all_edges_are_critical): New function. >>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical >>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>>>> to compute predicate instead of fold_build2_loc. >>>>>>>> Add zeroing of edge 'aux' field. >>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>> extended phi node predication. If number of predecessors of phi-block >>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>> algorithm is used. >>>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false. >>>>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 2014-10-15 13:50 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>>> On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev >>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>> Richard, >>>>>>>>>> >>>>>>>>>> Here is updated patch (part1) for extended if conversion. >>>>>>>>>> >>>>>>>>>> Second part of patch will be sent later. >>>>>>>>> >>>>>>>>> Ok, I'm starting to look at this. I'd still like you to split things >>>>>>>>> up >>>>>>>>> more. >>>>>>>>> >>>>>>>>> static inline void >>>>>>>>> add_to_predicate_list (struct loop *loop, basic_block bb, tree nc) >>>>>>>>> { >>>>>>>>> ... >>>>>>>>> >>>>>>>>> + /* We use notion of cd equivalence to get simplier predicate >>>>>>>>> for >>>>>>>>> + join block, e.g. if join block has 2 predecessors with >>>>>>>>> predicates >>>>>>>>> + p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of >>>>>>>>> + p1 & p2 | p1 & !p2. */ >>>>>>>>> + if (dom_bb != loop->header >>>>>>>>> + && get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) == >>>>>>>>> bb) >>>>>>>>> + { >>>>>>>>> + gcc_assert (flow_bb_inside_loop_p (loop, dom_bb)); >>>>>>>>> + bc = bb_predicate (dom_bb); >>>>>>>>> + gcc_assert (!is_true_predicate (bc)); >>>>>>>>> >>>>>>>>> these changes look worthwhile even for !flag_force_vectorize. So >>>>>>>>> please >>>>>>>>> split the change to add_to_predicate_list out and compute >>>>>>>>> post-dominators >>>>>>>>> unconditionally. Note that you should call free_dominance_info >>>>>>>>> (CDI_POST_DOMINATORS) at the end of if-conversion. >>>>>>>>> >>>>>>>>> + if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest)) >>>>>>>>> + add_to_predicate_list (loop, e->dest, cond); >>>>>>>>> + >>>>>>>>> + /* If edge E is critical save predicate on it. */ >>>>>>>>> + if (EDGE_COUNT (e->dest->preds) >= 2) >>>>>>>>> + set_edge_predicate (e, cond); >>>>>>>>> >>>>>>>>> how do we know the edge is critical by this simple check? Why not >>>>>>>>> simply always save edge predicates (well, you kind of do but omit >>>>>>>>> the case where e->src dominates e->dest). >>>>>>>>> >>>>>>>>> Btw, you can rely on edge->aux being NULL at the start of the >>>>>>>>> pass but need to clear it at the end (best use clear_aux_for_edges () >>>>>>>>> for that). So stuff like >>>>>>>>> >>>>>>>>> + extract_true_false_edges_from_block (bb, &true_edge, >>>>>>>>> &false_edge); >>>>>>>>> + if (flag_force_vectorize) >>>>>>>>> + true_edge->aux = false_edge->aux = NULL; >>>>>>>>> >>>>>>>>> shouldn't be necessary. >>>>>>>>> >>>>>>>>> I think the edge predicate handling should also be unconditionally >>>>>>>>> and not depend on flag_force_vectorize. >>>>>>>>> >>>>>>>>> + /* The loop latch and loop exit block are always executed and >>>>>>>>> + have no extra conditions to be processed: skip them. */ >>>>>>>>> + if (bb == loop->latch >>>>>>>>> + || bb_with_exit_edge_p (loop, bb)) >>>>>>>>> >>>>>>>>> I don't think the edge stuff is true - given you still only reset the >>>>>>>>> loop->latch bb predicate the change looks broken. >>>>>>>>> >>>>>>>>> + /* Fold_build2 can produce bool conversion which is not >>>>>>>>> + supported by vectorizer, so re-build it without folding. >>>>>>>>> + For example, such conversion is generated for sequence: >>>>>>>>> + _Bool _7, _8, _9; >>>>>>>>> + _7 = _6 != 13; _8 = _6 != 0; _9 = _8 & _9; >>>>>>>>> + if (_9 != 0) --> (bool)_9. */ >>>>>>>>> + >>>>>>>>> + if (CONVERT_EXPR_P (c) >>>>>>>>> + && TREE_CODE_CLASS (code) == tcc_comparison) >>>>>>>>> >>>>>>>>> I think you should simply use canonicalize_cond_expr_cond on the >>>>>>>>> folding result. Or rather _not_ fold at all - we are taking the >>>>>>>>> operands from the GIMPLE condition unmodified after all. >>>>>>>>> >>>>>>>>> - add_to_dst_predicate_list (loop, false_edge, >>>>>>>>> - unshare_expr (cond), c2); >>>>>>>>> + add_to_dst_predicate_list (loop, false_edge, unshare_expr >>>>>>>>> (cond), >>>>>>>>> + unshare_expr (c2)); >>>>>>>>> >>>>>>>>> why is it necessary to unshare c2? >>>>>>>>> >>>>>>>>> Please split out the PHI-with-multi-arg handling (I have not looked >>>>>>>>> at >>>>>>>>> that in detail). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>> >>>>>>>>>> Changelog. >>>>>>>>>> >>>>>>>>>> 2014-10-13 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>> >>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function >>>>>>>>>> clone. >>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>> (edge_predicate): New function. >>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always >>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block >>>>>>>>>> for join blocks if it exists. >>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if >>>>>>>>>> destination block of edge is not always executed. Set-up predicate >>>>>>>>>> for critical edge. >>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical >>>>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if >>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using >>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under >>>>>>>>>> FLAG_FORCE_VECTORIZE. >>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's. >>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>> extended phi node predication. If number of predecessors of phi-block >>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>> algorithm is used. >>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>> (predicate_arbitrary_scalar_phi): New function. >>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE. >>>>>>>>>> Invoke find_insertion_point to initialize gsi and >>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals >>>>>>>>>> that extended predication must be applied). >>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic >>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current >>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop versioning >>>>>>>>>> for innermost loop marked with pragma omp simd and >>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of >>>>>>>>>> edges >>>>>>>>>> for blocks with two successors. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>>>>>>>> Richard, >>>>>>>>>>> >>>>>>>>>>> here is reduced patch (part.1) which was reduced almost twice. >>>>>>>>>>> Let's me also answer on your comments. >>>>>>>>>>> >>>>>>>>>>> 1. I really use edge field 'aux' to keep predicate for critical >>>>>>>>>>> edges. >>>>>>>>>>> My previous code was not correct and now it looks like: >>>>>>>>>>> >>>>>>>>>>> if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == >>>>>>>>>>> 1) >>>>>>>>>>> /* Edge E is not critical, use predicate of edge source bb. */ >>>>>>>>>>> c = bb_predicate (b); >>>>>>>>>>> else >>>>>>>>>>> /* Edge E is critical and its aux field contains predicate. */ >>>>>>>>>>> c = edge_predicate (e); >>>>>>>>>>> >>>>>>>>>>> 2. I completely delete all code related to creation of conditional >>>>>>>>>>> expressions and completely rely on bool pattern recognition in >>>>>>>>>>> vectorizer. But we need to delete all dead predicate computations >>>>>>>>>>> which are not used since they prevent vectorization. I will add this >>>>>>>>>>> local-dce function in next patch. >>>>>>>>>>> 3. I also did not include in this patch recognition of general >>>>>>>>>>> phi-nodes with two arguments only for which conversion of >>>>>>>>>>> conditional >>>>>>>>>>> scalar reduction can be applied also. >>>>>>>>>>> Note that all these changes are applied for loop marked with pragma >>>>>>>>>>> omp simd only. >>>>>>>>>>> >>>>>>>>>>> 2014-09-22 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>> >>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function >>>>>>>>>>> clone. >>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always >>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block >>>>>>>>>>> for join blocks if it exists. >>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if >>>>>>>>>>> destination block of edge is not always executed. Set-up predicate >>>>>>>>>>> for critical edge. >>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical >>>>>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if >>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using >>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under >>>>>>>>>>> FLAG_FORCE_VECTORIZE. >>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's. >>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>> phi-block >>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>>> algorithm is used. >>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function. >>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE. >>>>>>>>>>> Invoke find_insertion_point to initialize gsi and >>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals >>>>>>>>>>> that extended predication must be applied). >>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic >>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current >>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>> versioning >>>>>>>>>>> for innermost loop marked with pragma omp simd and >>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of >>>>>>>>>>> edges >>>>>>>>>>> for blocks with two successors. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2014-09-08 17:10 GMT+04:00 Richard Biener >>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev >>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>> Richard! >>>>>>>>>>>>> Here is updated patch with the following changes: >>>>>>>>>>>>> >>>>>>>>>>>>> 1. Any restrictions on phi-function were eliminated for extended >>>>>>>>>>>>> conversion. >>>>>>>>>>>>> 2. Put predicate for critical edges to 'aux' field of edge, i.e. >>>>>>>>>>>>> negate_predicate was deleted. >>>>>>>>>>>>> 3. Deleted splitting of critical edges, i.e. both outgoing edges >>>>>>>>>>>>> can >>>>>>>>>>>>> be critical. >>>>>>>>>>>>> 4. Use notion of cd-equivalence to set-up predicate for join basic >>>>>>>>>>>>> blocks to simplify it. >>>>>>>>>>>>> 5. I decided to not design pre-pass since it will lead generating >>>>>>>>>>>>> chain of cond expressions for phi-node if conversion, whereas for >>>>>>>>>>>>> phi >>>>>>>>>>>>> of kind >>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)> >>>>>>>>>>>>> only one cond expression is required and this is considered as >>>>>>>>>>>>> simple >>>>>>>>>>>>> optimization for arbitrary phi-function. More precise, >>>>>>>>>>>>> if phi-function have only two different arguments and one of them >>>>>>>>>>>>> has >>>>>>>>>>>>> single occurrence, if- conversion is performed as if phi have >>>>>>>>>>>>> only 2 >>>>>>>>>>>>> arguments. >>>>>>>>>>>>> For arbitrary phi function a chain of cond expressions is >>>>>>>>>>>>> produced. >>>>>>>>>>>>> >>>>>>>>>>>>> Updated patch is attached. >>>>>>>>>>>>> >>>>>>>>>>>>> Any comments will be appreciated. >>>>>>>>>>>> >>>>>>>>>>>> The patch is still very big and does multiple things at once which >>>>>>>>>>>> makes >>>>>>>>>>>> it hard to review. >>>>>>>>>>>> >>>>>>>>>>>> In addition to that it changes function singatures without updating >>>>>>>>>>>> the function comments. For example what is the convert_bool >>>>>>>>>>>> argument doing to add_to_dst_predicate_list? Why do we need >>>>>>>>>>>> all this added logic. >>>>>>>>>>>> >>>>>>>>>>>> You duplicate operand_equal_for_phi_arg_p. >>>>>>>>>>>> >>>>>>>>>>>> I think the code handling PHIs with more than two operands but >>>>>>>>>>>> only two unequal operands is useful generally, so that's an obvious >>>>>>>>>>>> candidate for splitting out into a separate patch. >>>>>>>>>>>> >>>>>>>>>>>> + CONVERT_BOOL argument was added to convert bool predicate >>>>>>>>>>>> computations >>>>>>>>>>>> + which is not supported by vectorizer to int type through >>>>>>>>>>>> creating of >>>>>>>>>>>> + conditional expressions. */ >>>>>>>>>>>> >>>>>>>>>>>> Example? The vectorizer has patterns for bool predicate >>>>>>>>>>>> computations. >>>>>>>>>>>> This seems to be another feature that needs splitting out. >>>>>>>>>>>> >>>>>>>>>>>> The way you get around the critical edge parts looks awkward to me. >>>>>>>>>>>> Please either do _all_ predicates as edge predicates or simply >>>>>>>>>>>> split critical edges (of the respective loop body). >>>>>>>>>>>> >>>>>>>>>>>> I still think that an utility doing same PHI arg merging by >>>>>>>>>>>> introducing >>>>>>>>>>>> forwarder blocks would be nicer to have. >>>>>>>>>>>> >>>>>>>>>>>> I'd restructure the main tree_if_conversion function to apply these >>>>>>>>>>>> CFG pre-transforms when we are going to version the loop >>>>>>>>>>>> for if conversion (eventually transitioning to always doing that). >>>>>>>>>>>> >>>>>>>>>>>> So - please split up the patch. It's way too big. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Richard. >>>>>>>>>>>> >>>>>>>>>>>>> 2014-08-15 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>> >>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function >>>>>>>>>>>>> clone. >>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function. >>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate field. >>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE. >>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>> (get_type_for_cond): New function. >>>>>>>>>>>>> (convert_bool_predicate): New function. >>>>>>>>>>>>> (predicate_disjunction): New function. >>>>>>>>>>>>> (predicate_conjunction): New function. >>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument. >>>>>>>>>>>>> Use predicate of cd-equivalent block if convert_bool is true and >>>>>>>>>>>>> such bb exists; save it in static variable for further possible >>>>>>>>>>>>> use. >>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is >>>>>>>>>>>>> true. >>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument. >>>>>>>>>>>>> Add early function exit if edge target block is always executed. >>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is >>>>>>>>>>>>> true. >>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list. >>>>>>>>>>>>> Set-up predicate for crritical edge if convert_bool is true. >>>>>>>>>>>>> (equal_phi_args): New function. >>>>>>>>>>>>> (phi_has_two_different_args): New function. >>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>>>> if flag_force_vectorize wa set-up. >>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize. >>>>>>>>>>>>> (if_convertible_stmt_p): Allow calls of function clones if >>>>>>>>>>>>> flag_force_vectorize was set-up. >>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>>>>>> flag_force_vectorize was set-up. Use call of >>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only if >>>>>>>>>>>>> flag_force_vectorize was not set-up. >>>>>>>>>>>>> (walk_cond_tree): New function. >>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function. >>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument which is used to >>>>>>>>>>>>> transform >>>>>>>>>>>>> comparison expressions of boolean type into conditional >>>>>>>>>>>>> expressions >>>>>>>>>>>>> with integral operands. If convert_bool argument was set-up and >>>>>>>>>>>>> vect bool pattern can be appied perform the following >>>>>>>>>>>>> transformation: >>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0; >>>>>>>>>>>>> Add check that if fold_build2 produces bool conversion if >>>>>>>>>>>>> convert_bool >>>>>>>>>>>>> was set-up, recompute predicate using build2_loc. Additional >>>>>>>>>>>>> argument >>>>>>>>>>>>> 'convert_bool" is passed to add_to_dst_predicate_list and >>>>>>>>>>>>> add_to_predicate_list. >>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>>>> flag_force_vectorize was set-up to calculate cd equivalent bb's. >>>>>>>>>>>>> Call predicate_bbs with additional argument equal to false. >>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>> phi-block >>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals >>>>>>>>>>>>> that >>>>>>>>>>>>> phi arguments must be evaluated through >>>>>>>>>>>>> phi_has_two_different_args. >>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp if >>>>>>>>>>>>> cond >>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of >>>>>>>>>>>>> is_cond_scalar_reduction. >>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>> (predicate_arbitrary_phi): New function. >>>>>>>>>>>>> (predicate_extended_scalar_phi): New function. >>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement >>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion. >>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic >>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended >>>>>>>>>>>>> predication to build mask. >>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs. >>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current >>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>>>> versioning >>>>>>>>>>>>> for innermost loop marked with pragma omp simd. >>>>>>>>>>>>> >>>>>>>>>>>>> 2014-08-01 13:40 GMT+04:00 Richard Biener >>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev >>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We implemented additional support for pragma omp simd in part of >>>>>>>>>>>>>>> extended if-conversion loops with such pragma. These extensions >>>>>>>>>>>>>>> include: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. All extensions are performed only if considered loop or its >>>>>>>>>>>>>>> outer >>>>>>>>>>>>>>> loop was marked with pragma omp simd (force_vectorize); For >>>>>>>>>>>>>>> ordinary >>>>>>>>>>>>>>> loops behavior was not changed. >>>>>>>>>>>>>>> 2. Took off cfg restriction on basic block which can have more >>>>>>>>>>>>>>> than 2 >>>>>>>>>>>>>>> predecessors. >>>>>>>>>>>>>>> 3. Put additional restriction on phi nodes which was missed in >>>>>>>>>>>>>>> current design: >>>>>>>>>>>>>>> all phi nodes must be in non-predicated basic block to >>>>>>>>>>>>>>> conform >>>>>>>>>>>>>>> semantic of COND_EXPR which is used for transformation. >>>>>>>>>>>>>> >>>>>>>>>>>>>> How is that so? If the PHI is predicated then its result will >>>>>>>>>>>>>> be used >>>>>>>>>>>>>> in a PHI node again and thus we'd create a sequence of >>>>>>>>>>>>>> COND_EXPRs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> No? >>>>>>>>>>>>>> >>>>>>>>>>>>>>> 4. Extend predication of phi nodes: phi may have more than 2 >>>>>>>>>>>>>>> arguments >>>>>>>>>>>>>>> with some limitations: >>>>>>>>>>>>>>> - for phi nodes which have more than 2 arguments, but only >>>>>>>>>>>>>>> two >>>>>>>>>>>>>>> arguments are different and one of them has the only >>>>>>>>>>>>>>> occurence, >>>>>>>>>>>>>>> transformation to single COND_EXPR can be done. >>>>>>>>>>>>>>> - if phi node has more different arguments and all edge >>>>>>>>>>>>>>> predicates >>>>>>>>>>>>>>> correspondent to phi-arguments are disjoint, a chain of >>>>>>>>>>>>>>> COND_EXPR >>>>>>>>>>>>>>> will be generated for it. In current design very simple >>>>>>>>>>>>>>> check is used: >>>>>>>>>>>>>>> check starting from end that two edges correspondent to >>>>>>>>>>>>>>> neighbor >>>>>>>>>>>>>>> arguments have common predecessor which is used for further >>>>>>>>>>>>>>> check >>>>>>>>>>>>>>> with next edge. >>>>>>>>>>>>>>> These guarantee that phi predication will produce the correct >>>>>>>>>>>>>>> result. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Btw, you can think of these extensions as unfactoring a PHI node >>>>>>>>>>>>>> by >>>>>>>>>>>>>> inserting forwarder blocks. Thus >>>>>>>>>>>>>> >>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)> >>>>>>>>>>>>>> >>>>>>>>>>>>>> becomes >>>>>>>>>>>>>> >>>>>>>>>>>>>> bb 5: <forwarder-from(2)-and(3)> >>>>>>>>>>>>>> >>>>>>>>>>>>>> x = PHI <1(5), 2(4)> >>>>>>>>>>>>>> >>>>>>>>>>>>>> and >>>>>>>>>>>>>> >>>>>>>>>>>>>> x = PHI <1(2), 2(3), 3(4)> >>>>>>>>>>>>>> >>>>>>>>>>>>>> becomes >>>>>>>>>>>>>> >>>>>>>>>>>>>> bb 5: >>>>>>>>>>>>>> x' = PHI <1(2), 2(3)> >>>>>>>>>>>>>> >>>>>>>>>>>>>> b = PHI<x'(5), 3(4)> >>>>>>>>>>>>>> >>>>>>>>>>>>>> which means that 3) has to work. Note that we want this kind of >>>>>>>>>>>>>> PHI transforms for out-of-SSA as well to reduce the number of >>>>>>>>>>>>>> copies we need to insert on edges. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thus it would be nice if you implemented 4) in terms of a >>>>>>>>>>>>>> pre-pass >>>>>>>>>>>>>> over the force_vect loops PHI nodes, applying that CFG transform. >>>>>>>>>>>>>> And make 3) work properly if it doesn't already. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It looks like you introduce a "negate predicate" to work around >>>>>>>>>>>>>> the >>>>>>>>>>>>>> critical edge limitation? Please instead change if-conversion to >>>>>>>>>>>>>> work with edge predicates (as opposed to BB predicates). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Here is example of such extended predication (compile with >>>>>>>>>>>>>>> -march=core-avx2): >>>>>>>>>>>>>>> #pragma omp simd safelen(8) >>>>>>>>>>>>>>> for (i=0; i<512; i++) >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> float t = a[i]; >>>>>>>>>>>>>>> if (t > 0 & t < 1.0e+17f) >>>>>>>>>>>>>>> if (c[i] != 0) >>>>>>>>>>>>>>> res += 1; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> <bb 4>: >>>>>>>>>>>>>>> # res_15 = PHI <res_1(5), 0(3)> >>>>>>>>>>>>>>> # i_16 = PHI <i_11(5), 0(3)> >>>>>>>>>>>>>>> # ivtmp_17 = PHI <ivtmp_14(5), 512(3)> >>>>>>>>>>>>>>> t_5 = a[i_16]; >>>>>>>>>>>>>>> _6 = t_5 > 0.0; >>>>>>>>>>>>>>> _7 = t_5 < 9.9999998430674944e+16; >>>>>>>>>>>>>>> _8 = _7 & _6; >>>>>>>>>>>>>>> _ifc__28 = (unsigned int) _8; >>>>>>>>>>>>>>> _10 = &c[i_16]; >>>>>>>>>>>>>>> _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0; >>>>>>>>>>>>>>> _9 = MASK_LOAD (_10, 0B, _ifc__36); >>>>>>>>>>>>>>> _ifc__29 = _ifc__28 != 0 ? 1 : 0; >>>>>>>>>>>>>>> _ifc__30 = (int) _ifc__29; >>>>>>>>>>>>>>> _ifc__31 = _9 != 0 ? _ifc__30 : 0; >>>>>>>>>>>>>>> _ifc__32 = _ifc__28 != 0 ? 1 : 0; >>>>>>>>>>>>>>> _ifc__33 = (int) _ifc__32; >>>>>>>>>>>>>>> _ifc__34 = _9 == 0 ? _ifc__33 : 0; >>>>>>>>>>>>>>> _ifc__35 = _ifc__31 != 0 ? 1 : 0; >>>>>>>>>>>>>>> res_1 = res_15 + _ifc__35; >>>>>>>>>>>>>>> i_11 = i_16 + 1; >>>>>>>>>>>>>>> ivtmp_14 = ivtmp_17 - 1; >>>>>>>>>>>>>>> if (ivtmp_14 != 0) >>>>>>>>>>>>>>> goto <bb 4>; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Bootstrap and regression testing did not show any new failures. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> gcc/ChageLog >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2014-06-25 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> * tree-if-conv.c (flag_force_vectorize): New variable. >>>>>>>>>>>>>>> (struct bb_predicate_s): Add negate_predicate field. >>>>>>>>>>>>>>> (bb_negate_predicate): New function. >>>>>>>>>>>>>>> (set_bb_negate_predicate): New function. >>>>>>>>>>>>>>> (bb_copy_predicate): New function. >>>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function. >>>>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate >>>>>>>>>>>>>>> field. >>>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE. >>>>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>>>> (get_type_for_cond): New function. >>>>>>>>>>>>>>> (convert_bool_predicate): New function. >>>>>>>>>>>>>>> (predicate_disjunction): New function. >>>>>>>>>>>>>>> (predicate_conjunction): New function. >>>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is >>>>>>>>>>>>>>> true. >>>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>> Add early function exit if edge target block is always executed. >>>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is >>>>>>>>>>>>>>> true. >>>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list. >>>>>>>>>>>>>>> (equal_phi_args): New function. >>>>>>>>>>>>>>> (phi_has_two_different_args): New function. >>>>>>>>>>>>>>> (phi_args_disjoint): New function. >>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>>>>>> for loops marked with pragma omp simd. Add check that phi nodes >>>>>>>>>>>>>>> are >>>>>>>>>>>>>>> in non-predicated basic blocks. >>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize. >>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors >>>>>>>>>>>>>>> if >>>>>>>>>>>>>>> flag_force_vectorize was setup. Use call of >>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only >>>>>>>>>>>>>>> if >>>>>>>>>>>>>>> flag_force_vectorize was not setup. >>>>>>>>>>>>>>> (walk_cond_tree): New function. >>>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function. >>>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument that is used to >>>>>>>>>>>>>>> transform >>>>>>>>>>>>>>> comparison expressions of boolean type into conditional >>>>>>>>>>>>>>> expressions >>>>>>>>>>>>>>> with integral operands. If bool_conv argument is false or both >>>>>>>>>>>>>>> outgoing edges are not critical old algorithm of predicate >>>>>>>>>>>>>>> assignments >>>>>>>>>>>>>>> is used, otherwise the following code was added: check on >>>>>>>>>>>>>>> applicable >>>>>>>>>>>>>>> of vect-bool-pattern recognition and trnasformation of >>>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0; >>>>>>>>>>>>>>> compute predicates for both outgoing edges one of which is >>>>>>>>>>>>>>> critical >>>>>>>>>>>>>>> one using 'normal' edge, i.e. compute true and false predicates >>>>>>>>>>>>>>> using >>>>>>>>>>>>>>> normal outgoing edge only; evaluated predicates are stored in >>>>>>>>>>>>>>> predicate and negate_predicate fields of struct bb_predicate_s >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>> negate_predicate of normal edge conatins predicate of critical >>>>>>>>>>>>>>> edge, >>>>>>>>>>>>>>> but generated gimplified statements are stored in their >>>>>>>>>>>>>>> destination >>>>>>>>>>>>>>> block fields. Additional argument 'convert_bool" is passed to >>>>>>>>>>>>>>> add_to_dst_predicate_list and add_to_predicate_list. >>>>>>>>>>>>>>> (if_convertible_loop_p_1): Call predicate_bbs with additional >>>>>>>>>>>>>>> argument >>>>>>>>>>>>>>> equal to false. >>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>>>> phi-block >>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>> original >>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which >>>>>>>>>>>>>>> signals that >>>>>>>>>>>>>>> both phi arguments must be evaluated through >>>>>>>>>>>>>>> phi_has_two_different_args. >>>>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp >>>>>>>>>>>>>>> if cond >>>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of >>>>>>>>>>>>>>> is_cond_scalar_reduction. >>>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>>> (predicate_phi_disjoint_args): New function. >>>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function. >>>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement >>>>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion. >>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated >>>>>>>>>>>>>>> basic >>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended >>>>>>>>>>>>>>> predication to build mask. >>>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs. >>>>>>>>>>>>>>> (split_crit_edge): New function. >>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from >>>>>>>>>>>>>>> current >>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare). Invoke >>>>>>>>>>>>>>> split_crit_edge for extended predication. Do loop versioning for >>>>>>>>>>>>>>> innermost loop marked with pragma omp simd.