Richard, Did you have a chance to look at it?
Thanks. Yuri. 2014-10-24 14:21 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>: > Richard, > > Patch containing new core related to extended predication is attached. > Here is few comments which explain a main goal of design. > > 1. I don't want to insert any critical edge splitting since it may > lead to less efficient binaries (I remember some performance issue > when we designed lazy code motion algorithm in SPARC compiler). > 2. One special case of extended PHI node predication was introduced > when #arguments is more than 2 but only two arguments are different > and one argument has the only occurrence. For such PHI conditional > scalar reduction is applied. > This is correspondent to the following: > if (q1 && q2 && q3) var++ > New function phi_has_two_different_args was introduced to detect such phi. > 3. Original algorithm for PHI predication used assumption that at > least one incoming edge for blocks containing PHI is not critical - it > guarantees that all computations related to predicate of normal edge > are already inserted above this block and > core related to PHI predication can be inserted at the beginning of > block. But this is not true for critical edges for which predicate > computations are in the block where code for phi predication must be > inserted. So new function find_insertion_point is introduced which is > simply found out the last statement in block defining predicates > correspondent to all incoming edges and insert phi predication code > after it (with some minor exceptions). > > If you need more comments or something unclear will let me know. > > Thanks. > Yuri. > > ChangeLog: > > 2014-10-24 Yuri Rumyantsev <ysrum...@gmail.com> > > * tree-if-conv.c (ifcvt_can_use_mask_load_store): Use > FLAG_FORCE_VECTORIZE instead of loop flag. > (if_convertible_bb_p): Allow bb has more than 2 predecessors if > FLAG_FORCE_VECTORIZE is true. > (if_convertible_bb_p): Delete check that bb has at least one > non-critical incoming edge. > (phi_has_two_different_args): New function. > (is_cond_scalar_reduction): Add argument EXTENDED to choose access > to phi arguments. Invoke phi_has_two_different_args to get phi > arguments if EXTENDED is true. Change check that block > containing reduction statement candidate is predecessor > of phi-block since phi may have more than two arguments. > (convert_scalar_cond_reduction): Add argument BEFORE to insert > statement before/after gsi point. > (predicate_scalar_phi): Add argument false (which means non-extended > predication) to call of is_cond_scalar_reduction. Add argument > true (which correspondent to argument BEFORE) to call of > convert_scalar_cond_reduction. > (get_predicate_for_edge): New function. > (predicate_arbitrary_scalar_phi): New function. > (predicate_extended_scalar_phi): New function. > (find_insertion_point): New function. > (predicate_all_scalar_phis): Add two boolean variables EXTENDED and > BEFORE. Initialize EXTENDED to true if BB containing phi has more > than 2 predecessors or both incoming edges are critical. Invoke > find_phi_replacement_condition and predicate_scalar_phi or > find_insertion_point and predicate_extended_scalar_phi depending on > EXTENDED value. > (insert_gimplified_predicates): Add check that non-predicated block > may have statements to insert. Insert predicate of BB just after label > if FLAG_FORCE_VECTORIZE is true. > (tree_if_conversion): Add initialization of FLAG_FORCE_VECTORIZE which > is copy of inner or outer loop field force_vectorize. > > > 2014-10-24 13:12 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >> On Tue, Oct 21, 2014 at 4:34 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Richard, >>> >>> In my initial design I did such splitting but before start real >>> if-conversion but I decided to not perform it since code size for >>> if-converted loop is growing (number of phi nodes is increased). It is >>> worth noting also that for phi with #nodes > 2 we need to get all >>> predicates (except one) to do phi-predication and it means that block >>> containing such phi can have only 1 critical edge. >> >> Can you point me to the patch with the special insertion code then? >> I definitely want to avoid the mess we ran into with the reassoc >> code "clever" insertion code. >> >> Richard. >> >>> Thanks. >>> Yuri. >>> >>> 2014-10-21 18:19 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>> On Tue, Oct 21, 2014 at 4:09 PM, Richard Biener >>>> <richard.guent...@gmail.com> wrote: >>>>> On Tue, Oct 21, 2014 at 3:58 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>> wrote: >>>>>> Richard, >>>>>> >>>>>> I saw the sources of these functions, but I can't understand why I >>>>>> should use something else? Note that all predicate computations are >>>>>> located in basic blocks ( by design of if-conv) and there is special >>>>>> function that put these computations in bb >>>>>> (insert_gimplified_predicates). Edge contains only predicate not its >>>>>> computations. New function - find_insertion_point() does very simple >>>>>> search - it finds out the latest (in current bb) operand def-stmt of >>>>>> predicates taken from all incoming edges. >>>>>> In original algorithm the predicate of non-critical edge is taken to >>>>>> perform phi-node predication since for critical edge it does not work >>>>>> properly. >>>>>> >>>>>> My question is: does your comments mean that I should re-design my >>>>>> extensions? >>>>> >>>>> Well, we have infrastructure for inserting code on edges and you've >>>>> made critical edges predicated correctly. So why re-invent the wheel? >>>>> I realize this is very similar to my initial suggestion to simply split >>>>> critical edges in loops you want to if-convert but delays splitting >>>>> until it turns out to be necessary (which might be good for the >>>>> !force_vect case). >>>>> >>>>> For edge predicates you simply can emit their computation on the >>>>> edge, no? >>>>> >>>>> Btw, I very originally suggested to rework if-conversion to only >>>>> record edge predicates - having both block and edge predicates >>>>> somewhat complicates the code and makes it harder to >>>>> maintain (thus also the suggestion to simply split critical edges >>>>> if necessary to make BB predicates work always). >>>>> >>>>> Your patches add a lot of code and to me it seems we can avoid >>>>> doing so much special casing. >>>> >>>> For example attacking the critical edge issue by a simple >>>> >>>> Index: tree-if-conv.c >>>> =================================================================== >>>> --- tree-if-conv.c (revision 216508) >>>> +++ tree-if-conv.c (working copy) >>>> @@ -980,11 +980,7 @@ if_convertible_bb_p (struct loop *loop, >>>> if (EDGE_COUNT (e->src->succs) == 1) >>>> found = true; >>>> if (!found) >>>> - { >>>> - if (dump_file && (dump_flags & TDF_DETAILS)) >>>> - fprintf (dump_file, "only critical predecessors\n"); >>>> - return false; >>>> - } >>>> + split_edge (EDGE_PRED (bb, 0)); >>>> } >>>> >>>> return true; >>>> >>>> it changes the number of blocks in the loop, so >>>> get_loop_body_in_if_conv_order should probably be re-done with the >>>> above eventually signalling that it created a new block. Or the above >>>> should populate a vector of edges to split and do that after the >>>> loop calling if_convertible_bb_p. >>>> >>>> Richard. >>>> >>>>> Richard. >>>>> >>>>>> Thanks. >>>>>> Yuri. >>>>>> >>>>>> BTW Jeff did initial review of my changes related to predicate >>>>>> computation for join blocks. I presented him updated patch with >>>>>> test-case and some minor changes in patch. But still did not get any >>>>>> feedback on it. Could you please take a look also on it? >>>>>> >>>>>> >>>>>> 2014-10-21 17:38 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>> On Tue, Oct 21, 2014 at 3:20 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> wrote: >>>>>>>> Richard, >>>>>>>> >>>>>>>> Yes, This patch does not make sense since phi node predication for bb >>>>>>>> with critical incoming edges only performs another function which is >>>>>>>> absent (predicate_extended_scalar_phi). >>>>>>>> >>>>>>>> BTW I see that commit_edge_insertions() is used for rtx instructions >>>>>>>> only but you propose to use it for tree also. >>>>>>>> Did I miss something? >>>>>>> >>>>>>> Ah, it's gsi_commit_edge_inserts () (or gsi_commit_one_edge_insert >>>>>>> if you want easy access to the newly created basic block to push >>>>>>> the predicate to - see gsi_commit_edge_inserts implementation). >>>>>>> >>>>>>> Richard. >>>>>>> >>>>>>>> Thanks ahead. >>>>>>>> >>>>>>>> >>>>>>>> 2014-10-21 16:44 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>>> On Tue, Oct 21, 2014 at 2:25 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> Richard, >>>>>>>>>> >>>>>>>>>> I did some changes in patch and ChangeLog to mark that support for >>>>>>>>>> if-convert of blocks with only critical incoming edges will be added >>>>>>>>>> in the future (more precise in patch.4). >>>>>>>>> >>>>>>>>> But the same reasoning applies to this version of the patch when >>>>>>>>> flag_force_vectorize is true!? (insertion point and invalid SSA form) >>>>>>>>> >>>>>>>>> Which means the patch doesn't make sense in isolation? >>>>>>>>> >>>>>>>>> Btw, I think for the case you should simply do gsi_insert_on_edge () >>>>>>>>> and commit_edge_insertions () before the call to combine_blocks >>>>>>>>> (pushing the edge predicate to the newly created block). >>>>>>>>> >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>>> Could you please review it. >>>>>>>>>> >>>>>>>>>> Thanks. >>>>>>>>>> >>>>>>>>>> ChangeLog: >>>>>>>>>> >>>>>>>>>> 2014-10-21 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>> >>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>> (edge_predicate): New function. >>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>> (add_to_dst_predicate_list): Conditionally invoke >>>>>>>>>> add_to_predicate_list >>>>>>>>>> if destination block of edge is not always executed. Set-up predicate >>>>>>>>>> for critical edge. >>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>> (all_preds_critical_p): New function. >>>>>>>>>> (if_convertible_bb_p): Use call of all_preds_critical_p >>>>>>>>>> to reject temporarily block if-conversion with incoming critical >>>>>>>>>> edges >>>>>>>>>> if FLAG_FORCE_VECTORIZE was not set-up. This restriction will be >>>>>>>>>> deleted >>>>>>>>>> after adding support for extended predication. >>>>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>>>>>> to compute predicate instead of fold_build2_loc. >>>>>>>>>> Add zeroing of edge 'aux' field. >>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>> extended phi node predication. If number of predecessors of phi-block >>>>>>>>>> is equal 2 and at least one incoming edge is not critical original >>>>>>>>>> algorithm is used. >>>>>>>>>> (tree_if_conversion): Temporarily set-up FLAG_FORCE_VECTORIZE to >>>>>>>>>> false. >>>>>>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>>>>>> >>>>>>>>>> 2014-10-20 17:55 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>>>>>>>> Richard, >>>>>>>>>>> >>>>>>>>>>> Thanks for your answer! >>>>>>>>>>> >>>>>>>>>>> In current implementation phi node conversion assume that one of >>>>>>>>>>> incoming edge to bb containing given phi has at least one >>>>>>>>>>> non-critical >>>>>>>>>>> edge and choose it to insert predicated code. But if we choose >>>>>>>>>>> critical edge we need to determine insert point and insertion >>>>>>>>>>> direction (before/after) since in other case we can get invalid ssa >>>>>>>>>>> form (use before def). This is done by my new function which is not >>>>>>>>>>> in >>>>>>>>>>> current patch ( I will present this patch later). SO I assume that >>>>>>>>>>> we >>>>>>>>>>> need to leave this patch as it is to not introduce new bugs. >>>>>>>>>>> >>>>>>>>>>> Thanks. >>>>>>>>>>> Yuri. >>>>>>>>>>> >>>>>>>>>>> 2014-10-20 12:00 GMT+04:00 Richard Biener >>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>> On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev >>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>> Richard, >>>>>>>>>>>>> >>>>>>>>>>>>> I reworked the patch as you proposed, but I didn't understand what >>>>>>>>>>>>> did you mean by: >>>>>>>>>>>>> >>>>>>>>>>>>>>So please rework the patch so critical edges are always handled >>>>>>>>>>>>>>correctly. >>>>>>>>>>>>> >>>>>>>>>>>>> In current patch flag_force_vectorize is used (1) to reject phi >>>>>>>>>>>>> nodes >>>>>>>>>>>>> with more than 2 arguments; (2) to reject basic blocks with only >>>>>>>>>>>>> critical incoming edges since support for extended predication of >>>>>>>>>>>>> phi >>>>>>>>>>>>> nodes will be in next patch. >>>>>>>>>>>> >>>>>>>>>>>> I mean that (2) should not be rejected dependent on >>>>>>>>>>>> flag_force_vectorize. >>>>>>>>>>>> It was rejected because if-cvt couldn't handle it correctly before >>>>>>>>>>>> but with >>>>>>>>>>>> this patch this is fixed. I see no reason to still reject this >>>>>>>>>>>> then even >>>>>>>>>>>> for !flag_force_vectorize. >>>>>>>>>>>> >>>>>>>>>>>> Rejecting PHIs with more than two arguments with >>>>>>>>>>>> flag_force_vectorize >>>>>>>>>>>> is ok. >>>>>>>>>>>> >>>>>>>>>>>> Richard. >>>>>>>>>>>> >>>>>>>>>>>>> Could you please clarify your statement. >>>>>>>>>>>>> >>>>>>>>>>>>> I attached modified patch. >>>>>>>>>>>>> >>>>>>>>>>>>> ChangeLog: >>>>>>>>>>>>> >>>>>>>>>>>>> 2014-10-17 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>> >>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>> (add_to_dst_predicate_list): Conditionally invoke >>>>>>>>>>>>> add_to_predicate_list >>>>>>>>>>>>> if destination block of edge is not always executed. Set-up >>>>>>>>>>>>> predicate >>>>>>>>>>>>> for critical edge. >>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>> (if_convertible_bb_p): Use call of all_preds_critical_p >>>>>>>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>>>>>>>>> to compute predicate instead of fold_build2_loc. >>>>>>>>>>>>> Add zeroing of edge 'aux' field. >>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>> phi-block >>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to >>>>>>>>>>>>> false. >>>>>>>>>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2014-10-17 13:09 GMT+04:00 Richard Biener >>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev >>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Here is reduced patch as you requested. All your remarks have >>>>>>>>>>>>>>> been fixed. >>>>>>>>>>>>>>> Could you please look at it ( I have already sent the patch with >>>>>>>>>>>>>>> changes in add_to_predicate_list for review). >>>>>>>>>>>>>> >>>>>>>>>>>>>> + if (dump_file && (dump_flags & TDF_DETAILS)) >>>>>>>>>>>>>> + fprintf (dump_file, "More than two phi node >>>>>>>>>>>>>> args.\n"); >>>>>>>>>>>>>> + return false; >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> + >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> >>>>>>>>>>>>>> Excess vertical space. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> +/* Assumes that BB has more than 2 predecessors. >>>>>>>>>>>>>> >>>>>>>>>>>>>> More than 1 predecessor? >>>>>>>>>>>>>> >>>>>>>>>>>>>> + Returns false if at least one successor is not on critical >>>>>>>>>>>>>> edge >>>>>>>>>>>>>> + and true otherwise. */ >>>>>>>>>>>>>> + >>>>>>>>>>>>>> +static inline bool >>>>>>>>>>>>>> +all_edges_are_critical (basic_block bb) >>>>>>>>>>>>>> +{ >>>>>>>>>>>>>> >>>>>>>>>>>>>> "all_preds_critical_p" would be a better name >>>>>>>>>>>>>> >>>>>>>>>>>>>> + if (EDGE_COUNT (bb->preds) > 2) >>>>>>>>>>>>>> + { >>>>>>>>>>>>>> + if (!flag_force_vectorize) >>>>>>>>>>>>>> + return false; >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> >>>>>>>>>>>>>> as I said in the last review I don't think we should restrict >>>>>>>>>>>>>> edge >>>>>>>>>>>>>> predicates to flag_force_vectorize. At least I can't see how >>>>>>>>>>>>>> if-conversion is magically more expensive for that case? >>>>>>>>>>>>>> >>>>>>>>>>>>>> So please rework the patch so critical edges are always handled >>>>>>>>>>>>>> correctly. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok with that and the above suggested changes. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>> Yuri. >>>>>>>>>>>>>>> ChangeLog >>>>>>>>>>>>>>> 2014-10-16 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>>>> (add_to_dst_predicate_list): Conditionally invoke >>>>>>>>>>>>>>> add_to_predicate_list >>>>>>>>>>>>>>> if destination block of edge is not always executed. Set-up >>>>>>>>>>>>>>> predicate >>>>>>>>>>>>>>> for critical edge. >>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors >>>>>>>>>>>>>>> if >>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of >>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>> to reject block if-conversion with incoming critical edges only >>>>>>>>>>>>>>> if >>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>>>>>>>>>>> to compute predicate instead of fold_build2_loc. >>>>>>>>>>>>>>> Add zeroing of edge 'aux' field. >>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>>>> phi-block >>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>> original >>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to >>>>>>>>>>>>>>> false. >>>>>>>>>>>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2014-10-15 13:50 GMT+04:00 Richard Biener >>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>> On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev >>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Here is updated patch (part1) for extended if conversion. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Second part of patch will be sent later. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Ok, I'm starting to look at this. I'd still like you to split >>>>>>>>>>>>>>>> things up >>>>>>>>>>>>>>>> more. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> static inline void >>>>>>>>>>>>>>>> add_to_predicate_list (struct loop *loop, basic_block bb, >>>>>>>>>>>>>>>> tree nc) >>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> + /* We use notion of cd equivalence to get simplier >>>>>>>>>>>>>>>> predicate for >>>>>>>>>>>>>>>> + join block, e.g. if join block has 2 predecessors >>>>>>>>>>>>>>>> with predicates >>>>>>>>>>>>>>>> + p1 & p2 and p1 & !p2, we'd like to get p1 for it >>>>>>>>>>>>>>>> instead of >>>>>>>>>>>>>>>> + p1 & p2 | p1 & !p2. */ >>>>>>>>>>>>>>>> + if (dom_bb != loop->header >>>>>>>>>>>>>>>> + && get_immediate_dominator (CDI_POST_DOMINATORS, >>>>>>>>>>>>>>>> dom_bb) == bb) >>>>>>>>>>>>>>>> + { >>>>>>>>>>>>>>>> + gcc_assert (flow_bb_inside_loop_p (loop, dom_bb)); >>>>>>>>>>>>>>>> + bc = bb_predicate (dom_bb); >>>>>>>>>>>>>>>> + gcc_assert (!is_true_predicate (bc)); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> these changes look worthwhile even for !flag_force_vectorize. >>>>>>>>>>>>>>>> So please >>>>>>>>>>>>>>>> split the change to add_to_predicate_list out and compute >>>>>>>>>>>>>>>> post-dominators >>>>>>>>>>>>>>>> unconditionally. Note that you should call free_dominance_info >>>>>>>>>>>>>>>> (CDI_POST_DOMINATORS) at the end of if-conversion. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> + if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest)) >>>>>>>>>>>>>>>> + add_to_predicate_list (loop, e->dest, cond); >>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>> + /* If edge E is critical save predicate on it. */ >>>>>>>>>>>>>>>> + if (EDGE_COUNT (e->dest->preds) >= 2) >>>>>>>>>>>>>>>> + set_edge_predicate (e, cond); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> how do we know the edge is critical by this simple check? Why >>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>> simply always save edge predicates (well, you kind of do but >>>>>>>>>>>>>>>> omit >>>>>>>>>>>>>>>> the case where e->src dominates e->dest). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Btw, you can rely on edge->aux being NULL at the start of the >>>>>>>>>>>>>>>> pass but need to clear it at the end (best use >>>>>>>>>>>>>>>> clear_aux_for_edges () >>>>>>>>>>>>>>>> for that). So stuff like >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> + extract_true_false_edges_from_block (bb, &true_edge, >>>>>>>>>>>>>>>> &false_edge); >>>>>>>>>>>>>>>> + if (flag_force_vectorize) >>>>>>>>>>>>>>>> + true_edge->aux = false_edge->aux = NULL; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> shouldn't be necessary. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I think the edge predicate handling should also be >>>>>>>>>>>>>>>> unconditionally >>>>>>>>>>>>>>>> and not depend on flag_force_vectorize. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> + /* The loop latch and loop exit block are always >>>>>>>>>>>>>>>> executed and >>>>>>>>>>>>>>>> + have no extra conditions to be processed: skip them. >>>>>>>>>>>>>>>> */ >>>>>>>>>>>>>>>> + if (bb == loop->latch >>>>>>>>>>>>>>>> + || bb_with_exit_edge_p (loop, bb)) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't think the edge stuff is true - given you still only >>>>>>>>>>>>>>>> reset the >>>>>>>>>>>>>>>> loop->latch bb predicate the change looks broken. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> + /* Fold_build2 can produce bool conversion which is >>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>> + supported by vectorizer, so re-build it without >>>>>>>>>>>>>>>> folding. >>>>>>>>>>>>>>>> + For example, such conversion is generated for >>>>>>>>>>>>>>>> sequence: >>>>>>>>>>>>>>>> + _Bool _7, _8, _9; >>>>>>>>>>>>>>>> + _7 = _6 != 13; _8 = _6 != 0; _9 = _8 & _9; >>>>>>>>>>>>>>>> + if (_9 != 0) --> (bool)_9. */ >>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>> + if (CONVERT_EXPR_P (c) >>>>>>>>>>>>>>>> + && TREE_CODE_CLASS (code) == tcc_comparison) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I think you should simply use canonicalize_cond_expr_cond on >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> folding result. Or rather _not_ fold at all - we are taking >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> operands from the GIMPLE condition unmodified after all. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - add_to_dst_predicate_list (loop, false_edge, >>>>>>>>>>>>>>>> - unshare_expr (cond), c2); >>>>>>>>>>>>>>>> + add_to_dst_predicate_list (loop, false_edge, >>>>>>>>>>>>>>>> unshare_expr (cond), >>>>>>>>>>>>>>>> + unshare_expr (c2)); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> why is it necessary to unshare c2? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please split out the PHI-with-multi-arg handling (I have not >>>>>>>>>>>>>>>> looked at >>>>>>>>>>>>>>>> that in detail). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Changelog. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2014-10-13 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect >>>>>>>>>>>>>>>>> function clone. >>>>>>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is >>>>>>>>>>>>>>>>> always >>>>>>>>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block >>>>>>>>>>>>>>>>> for join blocks if it exists. >>>>>>>>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if >>>>>>>>>>>>>>>>> destination block of edge is not always executed. Set-up >>>>>>>>>>>>>>>>> predicate >>>>>>>>>>>>>>>>> for critical edge. >>>>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two >>>>>>>>>>>>>>>>> args >>>>>>>>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two >>>>>>>>>>>>>>>>> predecessors if >>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of >>>>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>>>> to reject block if-conversion with incoming critical edges >>>>>>>>>>>>>>>>> only if >>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if >>>>>>>>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate >>>>>>>>>>>>>>>>> using >>>>>>>>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under >>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE. >>>>>>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent >>>>>>>>>>>>>>>>> bb's. >>>>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>>>>>> phi-block >>>>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>>>> original >>>>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function. >>>>>>>>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE. >>>>>>>>>>>>>>>>> Invoke find_insertion_point to initialize gsi and >>>>>>>>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals >>>>>>>>>>>>>>>>> that extended predication must be applied). >>>>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated >>>>>>>>>>>>>>>>> basic >>>>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. >>>>>>>>>>>>>>>>> Insert >>>>>>>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from >>>>>>>>>>>>>>>>> current >>>>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>>>>>>>> versioning >>>>>>>>>>>>>>>>> for innermost loop marked with pragma omp simd and >>>>>>>>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' >>>>>>>>>>>>>>>>> field of edges >>>>>>>>>>>>>>>>> for blocks with two successors. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev >>>>>>>>>>>>>>>>> <ysrum...@gmail.com>: >>>>>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> here is reduced patch (part.1) which was reduced almost >>>>>>>>>>>>>>>>>> twice. >>>>>>>>>>>>>>>>>> Let's me also answer on your comments. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. I really use edge field 'aux' to keep predicate for >>>>>>>>>>>>>>>>>> critical edges. >>>>>>>>>>>>>>>>>> My previous code was not correct and now it looks like: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT >>>>>>>>>>>>>>>>>> (e->dest->preds) == 1) >>>>>>>>>>>>>>>>>> /* Edge E is not critical, use predicate of edge source >>>>>>>>>>>>>>>>>> bb. */ >>>>>>>>>>>>>>>>>> c = bb_predicate (b); >>>>>>>>>>>>>>>>>> else >>>>>>>>>>>>>>>>>> /* Edge E is critical and its aux field contains >>>>>>>>>>>>>>>>>> predicate. */ >>>>>>>>>>>>>>>>>> c = edge_predicate (e); >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2. I completely delete all code related to creation of >>>>>>>>>>>>>>>>>> conditional >>>>>>>>>>>>>>>>>> expressions and completely rely on bool pattern recognition >>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>> vectorizer. But we need to delete all dead predicate >>>>>>>>>>>>>>>>>> computations >>>>>>>>>>>>>>>>>> which are not used since they prevent vectorization. I will >>>>>>>>>>>>>>>>>> add this >>>>>>>>>>>>>>>>>> local-dce function in next patch. >>>>>>>>>>>>>>>>>> 3. I also did not include in this patch recognition of >>>>>>>>>>>>>>>>>> general >>>>>>>>>>>>>>>>>> phi-nodes with two arguments only for which conversion of >>>>>>>>>>>>>>>>>> conditional >>>>>>>>>>>>>>>>>> scalar reduction can be applied also. >>>>>>>>>>>>>>>>>> Note that all these changes are applied for loop marked with >>>>>>>>>>>>>>>>>> pragma >>>>>>>>>>>>>>>>>> omp simd only. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2014-09-22 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect >>>>>>>>>>>>>>>>>> function clone. >>>>>>>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is >>>>>>>>>>>>>>>>>> always >>>>>>>>>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block >>>>>>>>>>>>>>>>>> for join blocks if it exists. >>>>>>>>>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if >>>>>>>>>>>>>>>>>> destination block of edge is not always executed. Set-up >>>>>>>>>>>>>>>>>> predicate >>>>>>>>>>>>>>>>>> for critical edge. >>>>>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two >>>>>>>>>>>>>>>>>> args >>>>>>>>>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two >>>>>>>>>>>>>>>>>> predecessors if >>>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of >>>>>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>>>>> to reject block if-conversion with incoming critical edges >>>>>>>>>>>>>>>>>> only if >>>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if >>>>>>>>>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate >>>>>>>>>>>>>>>>>> using >>>>>>>>>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under >>>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE. >>>>>>>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent >>>>>>>>>>>>>>>>>> bb's. >>>>>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>>>>>>> phi-block >>>>>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>>>>> original >>>>>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function. >>>>>>>>>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE. >>>>>>>>>>>>>>>>>> Invoke find_insertion_point to initialize gsi and >>>>>>>>>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it >>>>>>>>>>>>>>>>>> signals >>>>>>>>>>>>>>>>>> that extended predication must be applied). >>>>>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated >>>>>>>>>>>>>>>>>> basic >>>>>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. >>>>>>>>>>>>>>>>>> Insert >>>>>>>>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from >>>>>>>>>>>>>>>>>> current >>>>>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>>>>>>>>> versioning >>>>>>>>>>>>>>>>>> for innermost loop marked with pragma omp simd and >>>>>>>>>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' >>>>>>>>>>>>>>>>>> field of edges >>>>>>>>>>>>>>>>>> for blocks with two successors. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2014-09-08 17:10 GMT+04:00 Richard Biener >>>>>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>>>>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev >>>>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>> Richard! >>>>>>>>>>>>>>>>>>>> Here is updated patch with the following changes: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 1. Any restrictions on phi-function were eliminated for >>>>>>>>>>>>>>>>>>>> extended conversion. >>>>>>>>>>>>>>>>>>>> 2. Put predicate for critical edges to 'aux' field of >>>>>>>>>>>>>>>>>>>> edge, i.e. >>>>>>>>>>>>>>>>>>>> negate_predicate was deleted. >>>>>>>>>>>>>>>>>>>> 3. Deleted splitting of critical edges, i.e. both outgoing >>>>>>>>>>>>>>>>>>>> edges can >>>>>>>>>>>>>>>>>>>> be critical. >>>>>>>>>>>>>>>>>>>> 4. Use notion of cd-equivalence to set-up predicate for >>>>>>>>>>>>>>>>>>>> join basic >>>>>>>>>>>>>>>>>>>> blocks to simplify it. >>>>>>>>>>>>>>>>>>>> 5. I decided to not design pre-pass since it will lead >>>>>>>>>>>>>>>>>>>> generating >>>>>>>>>>>>>>>>>>>> chain of cond expressions for phi-node if conversion, >>>>>>>>>>>>>>>>>>>> whereas for phi >>>>>>>>>>>>>>>>>>>> of kind >>>>>>>>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)> >>>>>>>>>>>>>>>>>>>> only one cond expression is required and this is >>>>>>>>>>>>>>>>>>>> considered as simple >>>>>>>>>>>>>>>>>>>> optimization for arbitrary phi-function. More precise, >>>>>>>>>>>>>>>>>>>> if phi-function have only two different arguments and one >>>>>>>>>>>>>>>>>>>> of them has >>>>>>>>>>>>>>>>>>>> single occurrence, if- conversion is performed as if phi >>>>>>>>>>>>>>>>>>>> have only 2 >>>>>>>>>>>>>>>>>>>> arguments. >>>>>>>>>>>>>>>>>>>> For arbitrary phi function a chain of cond expressions is >>>>>>>>>>>>>>>>>>>> produced. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Updated patch is attached. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Any comments will be appreciated. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The patch is still very big and does multiple things at >>>>>>>>>>>>>>>>>>> once which makes >>>>>>>>>>>>>>>>>>> it hard to review. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In addition to that it changes function singatures without >>>>>>>>>>>>>>>>>>> updating >>>>>>>>>>>>>>>>>>> the function comments. For example what is the convert_bool >>>>>>>>>>>>>>>>>>> argument doing to add_to_dst_predicate_list? Why do we need >>>>>>>>>>>>>>>>>>> all this added logic. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> You duplicate operand_equal_for_phi_arg_p. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I think the code handling PHIs with more than two operands >>>>>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>>>>> only two unequal operands is useful generally, so that's an >>>>>>>>>>>>>>>>>>> obvious >>>>>>>>>>>>>>>>>>> candidate for splitting out into a separate patch. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> + CONVERT_BOOL argument was added to convert bool >>>>>>>>>>>>>>>>>>> predicate computations >>>>>>>>>>>>>>>>>>> + which is not supported by vectorizer to int type >>>>>>>>>>>>>>>>>>> through creating of >>>>>>>>>>>>>>>>>>> + conditional expressions. */ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Example? The vectorizer has patterns for bool predicate >>>>>>>>>>>>>>>>>>> computations. >>>>>>>>>>>>>>>>>>> This seems to be another feature that needs splitting out. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The way you get around the critical edge parts looks >>>>>>>>>>>>>>>>>>> awkward to me. >>>>>>>>>>>>>>>>>>> Please either do _all_ predicates as edge predicates or >>>>>>>>>>>>>>>>>>> simply >>>>>>>>>>>>>>>>>>> split critical edges (of the respective loop body). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I still think that an utility doing same PHI arg merging by >>>>>>>>>>>>>>>>>>> introducing >>>>>>>>>>>>>>>>>>> forwarder blocks would be nicer to have. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I'd restructure the main tree_if_conversion function to >>>>>>>>>>>>>>>>>>> apply these >>>>>>>>>>>>>>>>>>> CFG pre-transforms when we are going to version the loop >>>>>>>>>>>>>>>>>>> for if conversion (eventually transitioning to always doing >>>>>>>>>>>>>>>>>>> that). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So - please split up the patch. It's way too big. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 2014-08-15 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect >>>>>>>>>>>>>>>>>>>> function clone. >>>>>>>>>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function. >>>>>>>>>>>>>>>>>>>> (init_bb_predicate): Add initialization of >>>>>>>>>>>>>>>>>>>> negate_predicate field. >>>>>>>>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE. >>>>>>>>>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>>>>>>>>> (get_type_for_cond): New function. >>>>>>>>>>>>>>>>>>>> (convert_bool_predicate): New function. >>>>>>>>>>>>>>>>>>>> (predicate_disjunction): New function. >>>>>>>>>>>>>>>>>>>> (predicate_conjunction): New function. >>>>>>>>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>>>>>>> Use predicate of cd-equivalent block if convert_bool is >>>>>>>>>>>>>>>>>>>> true and >>>>>>>>>>>>>>>>>>>> such bb exists; save it in static variable for further >>>>>>>>>>>>>>>>>>>> possible use. >>>>>>>>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument >>>>>>>>>>>>>>>>>>>> is true. >>>>>>>>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>>>>>>> Add early function exit if edge target block is always >>>>>>>>>>>>>>>>>>>> executed. >>>>>>>>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument >>>>>>>>>>>>>>>>>>>> is true. >>>>>>>>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list. >>>>>>>>>>>>>>>>>>>> Set-up predicate for crritical edge if convert_bool is >>>>>>>>>>>>>>>>>>>> true. >>>>>>>>>>>>>>>>>>>> (equal_phi_args): New function. >>>>>>>>>>>>>>>>>>>> (phi_has_two_different_args): New function. >>>>>>>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than >>>>>>>>>>>>>>>>>>>> two args >>>>>>>>>>>>>>>>>>>> if flag_force_vectorize wa set-up. >>>>>>>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Add test on >>>>>>>>>>>>>>>>>>>> flag_force_vectorize. >>>>>>>>>>>>>>>>>>>> (if_convertible_stmt_p): Allow calls of function clones if >>>>>>>>>>>>>>>>>>>> flag_force_vectorize was set-up. >>>>>>>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two >>>>>>>>>>>>>>>>>>>> predecessors if >>>>>>>>>>>>>>>>>>>> flag_force_vectorize was set-up. Use call of >>>>>>>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges >>>>>>>>>>>>>>>>>>>> only if >>>>>>>>>>>>>>>>>>>> flag_force_vectorize was not set-up. >>>>>>>>>>>>>>>>>>>> (walk_cond_tree): New function. >>>>>>>>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function. >>>>>>>>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument which is used >>>>>>>>>>>>>>>>>>>> to transform >>>>>>>>>>>>>>>>>>>> comparison expressions of boolean type into conditional >>>>>>>>>>>>>>>>>>>> expressions >>>>>>>>>>>>>>>>>>>> with integral operands. If convert_bool argument was >>>>>>>>>>>>>>>>>>>> set-up and >>>>>>>>>>>>>>>>>>>> vect bool pattern can be appied perform the following >>>>>>>>>>>>>>>>>>>> transformation: >>>>>>>>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0; >>>>>>>>>>>>>>>>>>>> Add check that if fold_build2 produces bool conversion if >>>>>>>>>>>>>>>>>>>> convert_bool >>>>>>>>>>>>>>>>>>>> was set-up, recompute predicate using build2_loc. >>>>>>>>>>>>>>>>>>>> Additional argument >>>>>>>>>>>>>>>>>>>> 'convert_bool" is passed to add_to_dst_predicate_list and >>>>>>>>>>>>>>>>>>>> add_to_predicate_list. >>>>>>>>>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>>>>>>>>>>> flag_force_vectorize was set-up to calculate cd equivalent >>>>>>>>>>>>>>>>>>>> bb's. >>>>>>>>>>>>>>>>>>>> Call predicate_bbs with additional argument equal to false. >>>>>>>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function >>>>>>>>>>>>>>>>>>>> interface: >>>>>>>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means >>>>>>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>> extended phi node predication. If number of predecessors >>>>>>>>>>>>>>>>>>>> of phi-block >>>>>>>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>>>>>>> original >>>>>>>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which >>>>>>>>>>>>>>>>>>>> signals that >>>>>>>>>>>>>>>>>>>> phi arguments must be evaluated through >>>>>>>>>>>>>>>>>>>> phi_has_two_different_args. >>>>>>>>>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of >>>>>>>>>>>>>>>>>>>> convert_name_to_cmp if cond >>>>>>>>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of >>>>>>>>>>>>>>>>>>>> is_cond_scalar_reduction. >>>>>>>>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>>>>>>>> (predicate_arbitrary_phi): New function. >>>>>>>>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function. >>>>>>>>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple >>>>>>>>>>>>>>>>>>>> statement >>>>>>>>>>>>>>>>>>>> iterator for predication of extended scalar phi's for >>>>>>>>>>>>>>>>>>>> insertion. >>>>>>>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for >>>>>>>>>>>>>>>>>>>> non-predicated basic >>>>>>>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. >>>>>>>>>>>>>>>>>>>> Insert >>>>>>>>>>>>>>>>>>>> predicates at the block begining for extended >>>>>>>>>>>>>>>>>>>> if-conversion. >>>>>>>>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for >>>>>>>>>>>>>>>>>>>> extended >>>>>>>>>>>>>>>>>>>> predication to build mask. >>>>>>>>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to >>>>>>>>>>>>>>>>>>>> predicate_bbs. >>>>>>>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from >>>>>>>>>>>>>>>>>>>> current >>>>>>>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>>>>>>>>>>> versioning >>>>>>>>>>>>>>>>>>>> for innermost loop marked with pragma omp simd. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 2014-08-01 13:40 GMT+04:00 Richard Biener >>>>>>>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>>>>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev >>>>>>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> We implemented additional support for pragma omp simd in >>>>>>>>>>>>>>>>>>>>>> part of >>>>>>>>>>>>>>>>>>>>>> extended if-conversion loops with such pragma. These >>>>>>>>>>>>>>>>>>>>>> extensions >>>>>>>>>>>>>>>>>>>>>> include: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1. All extensions are performed only if considered loop >>>>>>>>>>>>>>>>>>>>>> or its outer >>>>>>>>>>>>>>>>>>>>>> loop was marked with pragma omp simd >>>>>>>>>>>>>>>>>>>>>> (force_vectorize); For ordinary >>>>>>>>>>>>>>>>>>>>>> loops behavior was not changed. >>>>>>>>>>>>>>>>>>>>>> 2. Took off cfg restriction on basic block which can >>>>>>>>>>>>>>>>>>>>>> have more than 2 >>>>>>>>>>>>>>>>>>>>>> predecessors. >>>>>>>>>>>>>>>>>>>>>> 3. Put additional restriction on phi nodes which was >>>>>>>>>>>>>>>>>>>>>> missed in current design: >>>>>>>>>>>>>>>>>>>>>> all phi nodes must be in non-predicated basic block >>>>>>>>>>>>>>>>>>>>>> to conform >>>>>>>>>>>>>>>>>>>>>> semantic of COND_EXPR which is used for >>>>>>>>>>>>>>>>>>>>>> transformation. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> How is that so? If the PHI is predicated then its result >>>>>>>>>>>>>>>>>>>>> will be used >>>>>>>>>>>>>>>>>>>>> in a PHI node again and thus we'd create a sequence of >>>>>>>>>>>>>>>>>>>>> COND_EXPRs. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> No? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 4. Extend predication of phi nodes: phi may have more >>>>>>>>>>>>>>>>>>>>>> than 2 arguments >>>>>>>>>>>>>>>>>>>>>> with some limitations: >>>>>>>>>>>>>>>>>>>>>> - for phi nodes which have more than 2 arguments, but >>>>>>>>>>>>>>>>>>>>>> only two >>>>>>>>>>>>>>>>>>>>>> arguments are different and one of them has the only >>>>>>>>>>>>>>>>>>>>>> occurence, >>>>>>>>>>>>>>>>>>>>>> transformation to single COND_EXPR can be done. >>>>>>>>>>>>>>>>>>>>>> - if phi node has more different arguments and all >>>>>>>>>>>>>>>>>>>>>> edge predicates >>>>>>>>>>>>>>>>>>>>>> correspondent to phi-arguments are disjoint, a chain >>>>>>>>>>>>>>>>>>>>>> of COND_EXPR >>>>>>>>>>>>>>>>>>>>>> will be generated for it. In current design very >>>>>>>>>>>>>>>>>>>>>> simple check is used: >>>>>>>>>>>>>>>>>>>>>> check starting from end that two edges correspondent >>>>>>>>>>>>>>>>>>>>>> to neighbor >>>>>>>>>>>>>>>>>>>>>> arguments have common predecessor which is used for >>>>>>>>>>>>>>>>>>>>>> further check >>>>>>>>>>>>>>>>>>>>>> with next edge. >>>>>>>>>>>>>>>>>>>>>> These guarantee that phi predication will produce the >>>>>>>>>>>>>>>>>>>>>> correct result. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Btw, you can think of these extensions as unfactoring a >>>>>>>>>>>>>>>>>>>>> PHI node by >>>>>>>>>>>>>>>>>>>>> inserting forwarder blocks. Thus >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> becomes >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> bb 5: <forwarder-from(2)-and(3)> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> x = PHI <1(5), 2(4)> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> x = PHI <1(2), 2(3), 3(4)> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> becomes >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> bb 5: >>>>>>>>>>>>>>>>>>>>> x' = PHI <1(2), 2(3)> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> b = PHI<x'(5), 3(4)> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> which means that 3) has to work. Note that we want this >>>>>>>>>>>>>>>>>>>>> kind of >>>>>>>>>>>>>>>>>>>>> PHI transforms for out-of-SSA as well to reduce the >>>>>>>>>>>>>>>>>>>>> number of >>>>>>>>>>>>>>>>>>>>> copies we need to insert on edges. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thus it would be nice if you implemented 4) in terms of a >>>>>>>>>>>>>>>>>>>>> pre-pass >>>>>>>>>>>>>>>>>>>>> over the force_vect loops PHI nodes, applying that CFG >>>>>>>>>>>>>>>>>>>>> transform. >>>>>>>>>>>>>>>>>>>>> And make 3) work properly if it doesn't already. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> It looks like you introduce a "negate predicate" to work >>>>>>>>>>>>>>>>>>>>> around the >>>>>>>>>>>>>>>>>>>>> critical edge limitation? Please instead change >>>>>>>>>>>>>>>>>>>>> if-conversion to >>>>>>>>>>>>>>>>>>>>> work with edge predicates (as opposed to BB predicates). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Here is example of such extended predication (compile >>>>>>>>>>>>>>>>>>>>>> with -march=core-avx2): >>>>>>>>>>>>>>>>>>>>>> #pragma omp simd safelen(8) >>>>>>>>>>>>>>>>>>>>>> for (i=0; i<512; i++) >>>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>> float t = a[i]; >>>>>>>>>>>>>>>>>>>>>> if (t > 0 & t < 1.0e+17f) >>>>>>>>>>>>>>>>>>>>>> if (c[i] != 0) >>>>>>>>>>>>>>>>>>>>>> res += 1; >>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>> <bb 4>: >>>>>>>>>>>>>>>>>>>>>> # res_15 = PHI <res_1(5), 0(3)> >>>>>>>>>>>>>>>>>>>>>> # i_16 = PHI <i_11(5), 0(3)> >>>>>>>>>>>>>>>>>>>>>> # ivtmp_17 = PHI <ivtmp_14(5), 512(3)> >>>>>>>>>>>>>>>>>>>>>> t_5 = a[i_16]; >>>>>>>>>>>>>>>>>>>>>> _6 = t_5 > 0.0; >>>>>>>>>>>>>>>>>>>>>> _7 = t_5 < 9.9999998430674944e+16; >>>>>>>>>>>>>>>>>>>>>> _8 = _7 & _6; >>>>>>>>>>>>>>>>>>>>>> _ifc__28 = (unsigned int) _8; >>>>>>>>>>>>>>>>>>>>>> _10 = &c[i_16]; >>>>>>>>>>>>>>>>>>>>>> _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0; >>>>>>>>>>>>>>>>>>>>>> _9 = MASK_LOAD (_10, 0B, _ifc__36); >>>>>>>>>>>>>>>>>>>>>> _ifc__29 = _ifc__28 != 0 ? 1 : 0; >>>>>>>>>>>>>>>>>>>>>> _ifc__30 = (int) _ifc__29; >>>>>>>>>>>>>>>>>>>>>> _ifc__31 = _9 != 0 ? _ifc__30 : 0; >>>>>>>>>>>>>>>>>>>>>> _ifc__32 = _ifc__28 != 0 ? 1 : 0; >>>>>>>>>>>>>>>>>>>>>> _ifc__33 = (int) _ifc__32; >>>>>>>>>>>>>>>>>>>>>> _ifc__34 = _9 == 0 ? _ifc__33 : 0; >>>>>>>>>>>>>>>>>>>>>> _ifc__35 = _ifc__31 != 0 ? 1 : 0; >>>>>>>>>>>>>>>>>>>>>> res_1 = res_15 + _ifc__35; >>>>>>>>>>>>>>>>>>>>>> i_11 = i_16 + 1; >>>>>>>>>>>>>>>>>>>>>> ivtmp_14 = ivtmp_17 - 1; >>>>>>>>>>>>>>>>>>>>>> if (ivtmp_14 != 0) >>>>>>>>>>>>>>>>>>>>>> goto <bb 4>; >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Bootstrap and regression testing did not show any new >>>>>>>>>>>>>>>>>>>>>> failures. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> gcc/ChageLog >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2014-06-25 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> * tree-if-conv.c (flag_force_vectorize): New variable. >>>>>>>>>>>>>>>>>>>>>> (struct bb_predicate_s): Add negate_predicate field. >>>>>>>>>>>>>>>>>>>>>> (bb_negate_predicate): New function. >>>>>>>>>>>>>>>>>>>>>> (set_bb_negate_predicate): New function. >>>>>>>>>>>>>>>>>>>>>> (bb_copy_predicate): New function. >>>>>>>>>>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New >>>>>>>>>>>>>>>>>>>>>> function. >>>>>>>>>>>>>>>>>>>>>> (init_bb_predicate): Add initialization of >>>>>>>>>>>>>>>>>>>>>> negate_predicate field. >>>>>>>>>>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to >>>>>>>>>>>>>>>>>>>>>> NULL_TREE. >>>>>>>>>>>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>>>>>>>>>>> (get_type_for_cond): New function. >>>>>>>>>>>>>>>>>>>>>> (convert_bool_predicate): New function. >>>>>>>>>>>>>>>>>>>>>> (predicate_disjunction): New function. >>>>>>>>>>>>>>>>>>>>>> (predicate_conjunction): New function. >>>>>>>>>>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool >>>>>>>>>>>>>>>>>>>>>> argument is true. >>>>>>>>>>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>>>>>>>>> Add early function exit if edge target block is always >>>>>>>>>>>>>>>>>>>>>> executed. >>>>>>>>>>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool >>>>>>>>>>>>>>>>>>>>>> argument is true. >>>>>>>>>>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list. >>>>>>>>>>>>>>>>>>>>>> (equal_phi_args): New function. >>>>>>>>>>>>>>>>>>>>>> (phi_has_two_different_args): New function. >>>>>>>>>>>>>>>>>>>>>> (phi_args_disjoint): New function. >>>>>>>>>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than >>>>>>>>>>>>>>>>>>>>>> two args >>>>>>>>>>>>>>>>>>>>>> for loops marked with pragma omp simd. Add check that >>>>>>>>>>>>>>>>>>>>>> phi nodes are >>>>>>>>>>>>>>>>>>>>>> in non-predicated basic blocks. >>>>>>>>>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use >>>>>>>>>>>>>>>>>>>>>> flag_force_vectorize. >>>>>>>>>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two >>>>>>>>>>>>>>>>>>>>>> predecessors if >>>>>>>>>>>>>>>>>>>>>> flag_force_vectorize was setup. Use call of >>>>>>>>>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>>>>>>>>> to reject block if-conversion with imcoming critical >>>>>>>>>>>>>>>>>>>>>> edges only if >>>>>>>>>>>>>>>>>>>>>> flag_force_vectorize was not setup. >>>>>>>>>>>>>>>>>>>>>> (walk_cond_tree): New function. >>>>>>>>>>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function. >>>>>>>>>>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument that is used >>>>>>>>>>>>>>>>>>>>>> to transform >>>>>>>>>>>>>>>>>>>>>> comparison expressions of boolean type into conditional >>>>>>>>>>>>>>>>>>>>>> expressions >>>>>>>>>>>>>>>>>>>>>> with integral operands. If bool_conv argument is false >>>>>>>>>>>>>>>>>>>>>> or both >>>>>>>>>>>>>>>>>>>>>> outgoing edges are not critical old algorithm of >>>>>>>>>>>>>>>>>>>>>> predicate assignments >>>>>>>>>>>>>>>>>>>>>> is used, otherwise the following code was added: check >>>>>>>>>>>>>>>>>>>>>> on applicable >>>>>>>>>>>>>>>>>>>>>> of vect-bool-pattern recognition and trnasformation of >>>>>>>>>>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0; >>>>>>>>>>>>>>>>>>>>>> compute predicates for both outgoing edges one of which >>>>>>>>>>>>>>>>>>>>>> is critical >>>>>>>>>>>>>>>>>>>>>> one using 'normal' edge, i.e. compute true and false >>>>>>>>>>>>>>>>>>>>>> predicates using >>>>>>>>>>>>>>>>>>>>>> normal outgoing edge only; evaluated predicates are >>>>>>>>>>>>>>>>>>>>>> stored in >>>>>>>>>>>>>>>>>>>>>> predicate and negate_predicate fields of struct >>>>>>>>>>>>>>>>>>>>>> bb_predicate_s and >>>>>>>>>>>>>>>>>>>>>> negate_predicate of normal edge conatins predicate of >>>>>>>>>>>>>>>>>>>>>> critical edge, >>>>>>>>>>>>>>>>>>>>>> but generated gimplified statements are stored in their >>>>>>>>>>>>>>>>>>>>>> destination >>>>>>>>>>>>>>>>>>>>>> block fields. Additional argument 'convert_bool" is >>>>>>>>>>>>>>>>>>>>>> passed to >>>>>>>>>>>>>>>>>>>>>> add_to_dst_predicate_list and add_to_predicate_list. >>>>>>>>>>>>>>>>>>>>>> (if_convertible_loop_p_1): Call predicate_bbs with >>>>>>>>>>>>>>>>>>>>>> additional argument >>>>>>>>>>>>>>>>>>>>>> equal to false. >>>>>>>>>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function >>>>>>>>>>>>>>>>>>>>>> interface: >>>>>>>>>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by >>>>>>>>>>>>>>>>>>>>>> means of >>>>>>>>>>>>>>>>>>>>>> extended phi node predication. If number of predecessors >>>>>>>>>>>>>>>>>>>>>> of phi-block >>>>>>>>>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>>>>>>>>> original >>>>>>>>>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument >>>>>>>>>>>>>>>>>>>>>> which signals that >>>>>>>>>>>>>>>>>>>>>> both phi arguments must be evaluated through >>>>>>>>>>>>>>>>>>>>>> phi_has_two_different_args. >>>>>>>>>>>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of >>>>>>>>>>>>>>>>>>>>>> convert_name_to_cmp if cond >>>>>>>>>>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of >>>>>>>>>>>>>>>>>>>>>> is_cond_scalar_reduction. >>>>>>>>>>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>>>>>>>>>> (predicate_phi_disjoint_args): New function. >>>>>>>>>>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function. >>>>>>>>>>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple >>>>>>>>>>>>>>>>>>>>>> statement >>>>>>>>>>>>>>>>>>>>>> iterator for predication of extended scalar phi's for >>>>>>>>>>>>>>>>>>>>>> insertion. >>>>>>>>>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for >>>>>>>>>>>>>>>>>>>>>> non-predicated basic >>>>>>>>>>>>>>>>>>>>>> blocks that there are no gimplified statements to >>>>>>>>>>>>>>>>>>>>>> insert. Insert >>>>>>>>>>>>>>>>>>>>>> predicates at the block begining for extended >>>>>>>>>>>>>>>>>>>>>> if-conversion. >>>>>>>>>>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for >>>>>>>>>>>>>>>>>>>>>> extended >>>>>>>>>>>>>>>>>>>>>> predication to build mask. >>>>>>>>>>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to >>>>>>>>>>>>>>>>>>>>>> predicate_bbs. >>>>>>>>>>>>>>>>>>>>>> (split_crit_edge): New function. >>>>>>>>>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize >>>>>>>>>>>>>>>>>>>>>> from current >>>>>>>>>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare). >>>>>>>>>>>>>>>>>>>>>> Invoke >>>>>>>>>>>>>>>>>>>>>> split_crit_edge for extended predication. Do loop >>>>>>>>>>>>>>>>>>>>>> versioning for >>>>>>>>>>>>>>>>>>>>>> innermost loop marked with pragma omp simd.