Richard,

Yes, This patch does not make sense since phi node predication for bb
with critical incoming edges only performs another function which is
absent (predicate_extended_scalar_phi).

BTW I see that commit_edge_insertions() is used for rtx instructions
only but you propose to use it for tree also.
Did I miss something?

Thanks ahead.


2014-10-21 16:44 GMT+04:00 Richard Biener <richard.guent...@gmail.com>:
> On Tue, Oct 21, 2014 at 2:25 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Richard,
>>
>> I did some changes in patch and ChangeLog to mark that support for
>> if-convert of blocks with only critical incoming edges will be added
>> in the future (more precise in patch.4).
>
> But the same reasoning applies to this version of the patch when
> flag_force_vectorize is true!?  (insertion point and invalid SSA form)
>
> Which means the patch doesn't make sense in isolation?
>
> Btw, I think for the case you should simply do gsi_insert_on_edge ()
> and commit_edge_insertions () before the call to combine_blocks
> (pushing the edge predicate to the newly created block).
>
> Richard.
>
>> Could you please review it.
>>
>> Thanks.
>>
>> ChangeLog:
>>
>> 2014-10-21  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> (flag_force_vectorize): New variable.
>> (edge_predicate): New function.
>> (set_edge_predicate): New function.
>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>> if destination block of edge is not always executed. Set-up predicate
>> for critical edge.
>> (if_convertible_phi_p): Accept phi nodes with more than two args
>> if FLAG_FORCE_VECTORIZE was set-up.
>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>> (if_convertible_stmt_p): Fix up pre-function comments.
>> (all_preds_critical_p): New function.
>> (if_convertible_bb_p): Use call of all_preds_critical_p
>> to reject temporarily block if-conversion with incoming critical edges
>> if FLAG_FORCE_VECTORIZE was not set-up. This restriction will be deleted
>> after adding support for extended predication.
>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc
>> to compute predicate instead of fold_build2_loc.
>> Add zeroing of edge 'aux' field.
>> (find_phi_replacement_condition): Extend function interface:
>> it returns NULL if given phi node must be handled by means of
>> extended phi node predication. If number of predecessors of phi-block
>> is equal 2 and at least one incoming edge is not critical original
>> algorithm is used.
>> (tree_if_conversion): Temporarily set-up FLAG_FORCE_VECTORIZE to false.
>> Nullify 'aux' field of edges for blocks with two successors.
>>
>> 2014-10-20 17:55 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>:
>>> Richard,
>>>
>>> Thanks for your answer!
>>>
>>> In current implementation phi node conversion assume that one of
>>> incoming edge to bb containing given phi has at least one non-critical
>>> edge and choose it to insert predicated code. But if we choose
>>> critical edge we need to determine insert point and insertion
>>> direction (before/after) since in other case we can get invalid ssa
>>> form (use before def). This is done by my new function which is not in
>>> current patch ( I will present this patch later). SO I assume that we
>>> need to leave this patch as it is to not introduce new bugs.
>>>
>>> Thanks.
>>> Yuri.
>>>
>>> 2014-10-20 12:00 GMT+04:00 Richard Biener <richard.guent...@gmail.com>:
>>>> On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>> wrote:
>>>>> Richard,
>>>>>
>>>>> I reworked the patch as you proposed, but I didn't understand what
>>>>> did you mean by:
>>>>>
>>>>>>So please rework the patch so critical edges are always handled
>>>>>>correctly.
>>>>>
>>>>> In current patch flag_force_vectorize is used (1) to reject phi nodes
>>>>> with more than 2 arguments; (2) to reject basic blocks with only
>>>>> critical incoming edges since support for extended predication of phi
>>>>> nodes will be in next patch.
>>>>
>>>> I mean that (2) should not be rejected dependent on flag_force_vectorize.
>>>> It was rejected because if-cvt couldn't handle it correctly before but with
>>>> this patch this is fixed.  I see no reason to still reject this then even
>>>> for !flag_force_vectorize.
>>>>
>>>> Rejecting PHIs with more than two arguments with flag_force_vectorize
>>>> is ok.
>>>>
>>>> Richard.
>>>>
>>>>> Could you please clarify your statement.
>>>>>
>>>>> I attached modified patch.
>>>>>
>>>>> ChangeLog:
>>>>>
>>>>> 2014-10-17  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>
>>>>> (flag_force_vectorize): New variable.
>>>>> (edge_predicate): New function.
>>>>> (set_edge_predicate): New function.
>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>>>> if destination block of edge is not always executed. Set-up predicate
>>>>> for critical edge.
>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>> (all_edges_are_critical): New function.
>>>>> (if_convertible_bb_p): Use call of all_preds_critical_p
>>>>> to reject block if-conversion with incoming critical edges only if
>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc
>>>>> to compute predicate instead of fold_build2_loc.
>>>>> Add zeroing of edge 'aux' field.
>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>> it returns NULL if given phi node must be handled by means of
>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>> algorithm is used.
>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
>>>>> Nullify 'aux' field of edges for blocks with two successors.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-10-17 13:09 GMT+04:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> Here is reduced patch as you requested. All your remarks have been 
>>>>>>> fixed.
>>>>>>> Could you please look at it ( I have already sent the patch with
>>>>>>> changes in add_to_predicate_list for review).
>>>>>>
>>>>>> +             if (dump_file && (dump_flags & TDF_DETAILS))
>>>>>> +               fprintf (dump_file, "More than two phi node args.\n");
>>>>>> +             return false;
>>>>>> +           }
>>>>>> +
>>>>>> +        }
>>>>>>
>>>>>> Excess vertical space.
>>>>>>
>>>>>>
>>>>>> +/* Assumes that BB has more than 2 predecessors.
>>>>>>
>>>>>> More than 1 predecessor?
>>>>>>
>>>>>> +   Returns false if at least one successor is not on critical edge
>>>>>> +   and true otherwise.  */
>>>>>> +
>>>>>> +static inline bool
>>>>>> +all_edges_are_critical (basic_block bb)
>>>>>> +{
>>>>>>
>>>>>> "all_preds_critical_p" would be a better name
>>>>>>
>>>>>> +  if (EDGE_COUNT (bb->preds) > 2)
>>>>>> +    {
>>>>>> +      if (!flag_force_vectorize)
>>>>>> +       return false;
>>>>>> +    }
>>>>>>
>>>>>> as I said in the last review I don't think we should restrict edge
>>>>>> predicates to flag_force_vectorize.  At least I can't see how
>>>>>> if-conversion is magically more expensive for that case?
>>>>>>
>>>>>> So please rework the patch so critical edges are always handled
>>>>>> correctly.
>>>>>>
>>>>>> Ok with that and the above suggested changes.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>
>>>>>>> Thanks.
>>>>>>> Yuri.
>>>>>>> ChangeLog
>>>>>>> 2014-10-16  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>
>>>>>>> (flag_force_vectorize): New variable.
>>>>>>> (edge_predicate): New function.
>>>>>>> (set_edge_predicate): New function.
>>>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>>>>>> if destination block of edge is not always executed. Set-up predicate
>>>>>>> for critical edge.
>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>>> (all_edges_are_critical): New function.
>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc
>>>>>>> to compute predicate instead of fold_build2_loc.
>>>>>>> Add zeroing of edge 'aux' field.
>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>> algorithm is used.
>>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
>>>>>>> Nullify 'aux' field of edges for blocks with two successors.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2014-10-15 13:50 GMT+04:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>>>> On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>> Richard,
>>>>>>>>>
>>>>>>>>> Here is updated patch (part1) for extended if conversion.
>>>>>>>>>
>>>>>>>>> Second part of patch will be sent later.
>>>>>>>>
>>>>>>>> Ok, I'm starting to look at this.  I'd still like you to split things 
>>>>>>>> up
>>>>>>>> more.
>>>>>>>>
>>>>>>>>  static inline void
>>>>>>>>  add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
>>>>>>>>  {
>>>>>>>> ...
>>>>>>>>
>>>>>>>> +      /* We use notion of cd equivalence to get simplier predicate for
>>>>>>>> +        join block, e.g. if join block has 2 predecessors with 
>>>>>>>> predicates
>>>>>>>> +        p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of
>>>>>>>> +        p1 & p2 | p1 & !p2.  */
>>>>>>>> +      if (dom_bb != loop->header
>>>>>>>> +         && get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) == 
>>>>>>>> bb)
>>>>>>>> +       {
>>>>>>>> +         gcc_assert (flow_bb_inside_loop_p (loop, dom_bb));
>>>>>>>> +         bc = bb_predicate (dom_bb);
>>>>>>>> +         gcc_assert (!is_true_predicate (bc));
>>>>>>>>
>>>>>>>> these changes look worthwhile even for !flag_force_vectorize.  So 
>>>>>>>> please
>>>>>>>> split the change to add_to_predicate_list out and compute 
>>>>>>>> post-dominators
>>>>>>>> unconditionally.  Note that you should call free_dominance_info
>>>>>>>> (CDI_POST_DOMINATORS) at the end of if-conversion.
>>>>>>>>
>>>>>>>> +  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest))
>>>>>>>> +    add_to_predicate_list (loop, e->dest, cond);
>>>>>>>> +
>>>>>>>> +  /* If edge E is critical save predicate on it.  */
>>>>>>>> +  if (EDGE_COUNT (e->dest->preds) >= 2)
>>>>>>>> +    set_edge_predicate (e, cond);
>>>>>>>>
>>>>>>>> how do we know the edge is critical by this simple check?  Why not
>>>>>>>> simply always save edge predicates (well, you kind of do but omit
>>>>>>>> the case where e->src dominates e->dest).
>>>>>>>>
>>>>>>>> Btw, you can rely on edge->aux being NULL at the start of the
>>>>>>>> pass but need to clear it at the end (best use clear_aux_for_edges ()
>>>>>>>> for that).  So stuff like
>>>>>>>>
>>>>>>>> +         extract_true_false_edges_from_block (bb, &true_edge, 
>>>>>>>> &false_edge);
>>>>>>>> +         if (flag_force_vectorize)
>>>>>>>> +           true_edge->aux = false_edge->aux = NULL;
>>>>>>>>
>>>>>>>> shouldn't be necessary.
>>>>>>>>
>>>>>>>> I think the edge predicate handling should also be unconditionally
>>>>>>>> and not depend on flag_force_vectorize.
>>>>>>>>
>>>>>>>> +      /* The loop latch and loop exit block are always executed and
>>>>>>>> +        have no extra conditions to be processed: skip them.  */
>>>>>>>> +      if (bb == loop->latch
>>>>>>>> +         || bb_with_exit_edge_p (loop, bb))
>>>>>>>>
>>>>>>>> I don't think the edge stuff is true - given you still only reset the
>>>>>>>> loop->latch bb predicate the change looks broken.
>>>>>>>>
>>>>>>>> +         /* Fold_build2 can produce bool conversion which is not
>>>>>>>> +             supported by vectorizer, so re-build it without folding.
>>>>>>>> +            For example, such conversion is generated for sequence:
>>>>>>>> +               _Bool _7, _8, _9;
>>>>>>>> +               _7 = _6 != 13; _8 = _6 != 0; _9 = _8 & _9;
>>>>>>>> +               if (_9 != 0)  --> (bool)_9.  */
>>>>>>>> +
>>>>>>>> +         if (CONVERT_EXPR_P (c)
>>>>>>>> +             && TREE_CODE_CLASS (code) == tcc_comparison)
>>>>>>>>
>>>>>>>> I think you should simply use canonicalize_cond_expr_cond on the
>>>>>>>> folding result.  Or rather _not_ fold at all - we are taking the
>>>>>>>> operands from the GIMPLE condition unmodified after all.
>>>>>>>>
>>>>>>>> -         add_to_dst_predicate_list (loop, false_edge,
>>>>>>>> -                                    unshare_expr (cond), c2);
>>>>>>>> +         add_to_dst_predicate_list (loop, false_edge, unshare_expr 
>>>>>>>> (cond),
>>>>>>>> +                                    unshare_expr (c2));
>>>>>>>>
>>>>>>>> why is it necessary to unshare c2?
>>>>>>>>
>>>>>>>> Please split out the PHI-with-multi-arg handling  (I have not looked at
>>>>>>>> that in detail).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Changelog.
>>>>>>>>>
>>>>>>>>> 2014-10-13  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>
>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function 
>>>>>>>>> clone.
>>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>>> (edge_predicate): New function.
>>>>>>>>> (set_edge_predicate): New function.
>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always
>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block
>>>>>>>>> for join blocks if it exists.
>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
>>>>>>>>> destination block of edge is not always executed. Set-up predicate
>>>>>>>>> for critical edge.
>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>>>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if
>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using
>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under 
>>>>>>>>> FLAG_FORCE_VECTORIZE.
>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>> algorithm is used.
>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>> (predicate_arbitrary_scalar_phi): New function.
>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE.
>>>>>>>>> Invoke find_insertion_point to initialize gsi and
>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals
>>>>>>>>> that extended predication must be applied).
>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop versioning
>>>>>>>>> for innermost loop marked with pragma omp simd and
>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of 
>>>>>>>>> edges
>>>>>>>>> for blocks with two successors.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>:
>>>>>>>>>> Richard,
>>>>>>>>>>
>>>>>>>>>> here is reduced patch (part.1) which was reduced almost twice.
>>>>>>>>>> Let's me also answer on your comments.
>>>>>>>>>>
>>>>>>>>>> 1. I really use edge field 'aux' to keep predicate for critical 
>>>>>>>>>> edges.
>>>>>>>>>> My previous code was not correct and now it looks like:
>>>>>>>>>>
>>>>>>>>>>   if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == 1)
>>>>>>>>>>     /* Edge E is not critical,  use predicate of edge source bb. */
>>>>>>>>>>     c = bb_predicate (b);
>>>>>>>>>>   else
>>>>>>>>>>     /* Edge E is critical and its aux field contains predicate.  */
>>>>>>>>>>     c = edge_predicate (e);
>>>>>>>>>>
>>>>>>>>>> 2. I completely delete all code related to creation of conditional
>>>>>>>>>> expressions and completely rely on bool pattern recognition in
>>>>>>>>>> vectorizer. But we need to delete all dead predicate computations
>>>>>>>>>> which are not used since they prevent vectorization. I will add this
>>>>>>>>>> local-dce function in next patch.
>>>>>>>>>> 3. I also did not include in this patch recognition of general
>>>>>>>>>> phi-nodes with two arguments only for which conversion of conditional
>>>>>>>>>> scalar reduction can be applied also.
>>>>>>>>>> Note that all these changes are applied for loop marked with pragma
>>>>>>>>>> omp simd only.
>>>>>>>>>>
>>>>>>>>>> 2014-09-22  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>
>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function 
>>>>>>>>>> clone.
>>>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>>>> (edge_predicate): New function.
>>>>>>>>>> (set_edge_predicate): New function.
>>>>>>>>>> (convert_name_to_cmp): New function.
>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always
>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block
>>>>>>>>>> for join blocks if it exists.
>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
>>>>>>>>>> destination block of edge is not always executed. Set-up predicate
>>>>>>>>>> for critical edge.
>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>>>>>>>>> to reject block if-conversion with incoming critical edges only if
>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up.
>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if
>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using
>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under 
>>>>>>>>>> FLAG_FORCE_VECTORIZE.
>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>> algorithm is used.
>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function.
>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE.
>>>>>>>>>> Invoke find_insertion_point to initialize gsi and
>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals
>>>>>>>>>> that extended predication must be applied).
>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop versioning
>>>>>>>>>> for innermost loop marked with pragma omp simd and
>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of 
>>>>>>>>>> edges
>>>>>>>>>> for blocks with two successors.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2014-09-08 17:10 GMT+04:00 Richard Biener 
>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev 
>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>> Richard!
>>>>>>>>>>>> Here is updated patch with the following changes:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Any restrictions on phi-function were eliminated for extended 
>>>>>>>>>>>> conversion.
>>>>>>>>>>>> 2.  Put predicate for critical edges to 'aux' field of edge, i.e.
>>>>>>>>>>>> negate_predicate was deleted.
>>>>>>>>>>>> 3. Deleted splitting of critical edges, i.e. both outgoing edges 
>>>>>>>>>>>> can
>>>>>>>>>>>> be critical.
>>>>>>>>>>>> 4. Use notion of cd-equivalence to set-up predicate for join basic
>>>>>>>>>>>> blocks to simplify it.
>>>>>>>>>>>> 5. I decided to not design pre-pass since it will lead generating
>>>>>>>>>>>> chain of cond expressions for phi-node if conversion, whereas for 
>>>>>>>>>>>> phi
>>>>>>>>>>>> of kind
>>>>>>>>>>>>   x = PHI <1(2), 1(3), 2(4)>
>>>>>>>>>>>> only one cond expression is required and this is considered as 
>>>>>>>>>>>> simple
>>>>>>>>>>>> optimization for arbitrary phi-function. More precise,
>>>>>>>>>>>> if phi-function have only two different arguments and one of them 
>>>>>>>>>>>> has
>>>>>>>>>>>> single occurrence, if- conversion is performed as if phi have only 
>>>>>>>>>>>> 2
>>>>>>>>>>>> arguments.
>>>>>>>>>>>> For arbitrary phi function a chain of cond expressions is produced.
>>>>>>>>>>>>
>>>>>>>>>>>> Updated patch is attached.
>>>>>>>>>>>>
>>>>>>>>>>>> Any comments will be appreciated.
>>>>>>>>>>>
>>>>>>>>>>> The patch is still very big and does multiple things at once which 
>>>>>>>>>>> makes
>>>>>>>>>>> it hard to review.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that it changes function singatures without updating
>>>>>>>>>>> the function comments.  For example what is the convert_bool
>>>>>>>>>>> argument doing to add_to_dst_predicate_list?  Why do we need
>>>>>>>>>>> all this added logic.
>>>>>>>>>>>
>>>>>>>>>>> You duplicate operand_equal_for_phi_arg_p.
>>>>>>>>>>>
>>>>>>>>>>> I think the code handling PHIs with more than two operands but
>>>>>>>>>>> only two unequal operands is useful generally, so that's an obvious
>>>>>>>>>>> candidate for splitting out into a separate patch.
>>>>>>>>>>>
>>>>>>>>>>> +   CONVERT_BOOL argument was added to convert bool predicate 
>>>>>>>>>>> computations
>>>>>>>>>>> +   which is not supported by vectorizer to int type through 
>>>>>>>>>>> creating of
>>>>>>>>>>> +   conditional expressions.  */
>>>>>>>>>>>
>>>>>>>>>>> Example?  The vectorizer has patterns for bool predicate 
>>>>>>>>>>> computations.
>>>>>>>>>>> This seems to be another feature that needs splitting out.
>>>>>>>>>>>
>>>>>>>>>>> The way you get around the critical edge parts looks awkward to me.
>>>>>>>>>>> Please either do _all_ predicates as edge predicates or simply
>>>>>>>>>>> split critical edges (of the respective loop body).
>>>>>>>>>>>
>>>>>>>>>>> I still think that an utility doing same PHI arg merging by 
>>>>>>>>>>> introducing
>>>>>>>>>>> forwarder blocks would be nicer to have.
>>>>>>>>>>>
>>>>>>>>>>> I'd restructure the main tree_if_conversion function to apply these
>>>>>>>>>>> CFG pre-transforms when we are going to version the loop
>>>>>>>>>>> for if conversion (eventually transitioning to always doing that).
>>>>>>>>>>>
>>>>>>>>>>> So - please split up the patch.  It's way too big.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Richard.
>>>>>>>>>>>
>>>>>>>>>>>> 2014-08-15  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>>>
>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function 
>>>>>>>>>>>> clone.
>>>>>>>>>>>> (flag_force_vectorize): New variable.
>>>>>>>>>>>> (edge_predicate): New function.
>>>>>>>>>>>> (set_edge_predicate): New function.
>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>>>>>>>>>> (convert_name_to_cmp): New function.
>>>>>>>>>>>> (get_type_for_cond): New function.
>>>>>>>>>>>> (convert_bool_predicate): New function.
>>>>>>>>>>>> (predicate_disjunction): New function.
>>>>>>>>>>>> (predicate_conjunction): New function.
>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument.
>>>>>>>>>>>> Use predicate of cd-equivalent block if convert_bool is true and
>>>>>>>>>>>> such bb exists; save it in static variable for further possible 
>>>>>>>>>>>> use.
>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is true.
>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>>>>>>>>>> Add early function exit if edge target block is always executed.
>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is true.
>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list.
>>>>>>>>>>>> Set-up predicate for crritical edge if convert_bool is true.
>>>>>>>>>>>> (equal_phi_args): New function.
>>>>>>>>>>>> (phi_has_two_different_args): New function.
>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>>>> if flag_force_vectorize wa set-up.
>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize.
>>>>>>>>>>>> (if_convertible_stmt_p): Allow calls of function clones if
>>>>>>>>>>>> flag_force_vectorize was set-up.
>>>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>>>> flag_force_vectorize was set-up. Use call of all_edges_are_critical
>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only if
>>>>>>>>>>>> flag_force_vectorize was not set-up.
>>>>>>>>>>>> (walk_cond_tree): New function.
>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function.
>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument which is used to 
>>>>>>>>>>>> transform
>>>>>>>>>>>> comparison expressions of boolean type into conditional expressions
>>>>>>>>>>>> with integral operands. If convert_bool argument was set-up and
>>>>>>>>>>>> vect bool pattern can be appied perform the following 
>>>>>>>>>>>> transformation:
>>>>>>>>>>>> (bool) x != 0  --> y = (int) x; x != 0;
>>>>>>>>>>>> Add check that if fold_build2 produces bool conversion if 
>>>>>>>>>>>> convert_bool
>>>>>>>>>>>> was set-up, recompute predicate using build2_loc. Additional 
>>>>>>>>>>>> argument
>>>>>>>>>>>> 'convert_bool" is passed to add_to_dst_predicate_list and
>>>>>>>>>>>> add_to_predicate_list.
>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>>>>>>>>>>> flag_force_vectorize was set-up to calculate cd equivalent bb's.
>>>>>>>>>>>> Call predicate_bbs with additional argument equal to false.
>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>>>> extended phi node predication. If number of predecessors of 
>>>>>>>>>>>> phi-block
>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>>>> algorithm is used.
>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals 
>>>>>>>>>>>> that
>>>>>>>>>>>> phi arguments must be evaluated through phi_has_two_different_args.
>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp if 
>>>>>>>>>>>> cond
>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of 
>>>>>>>>>>>> is_cond_scalar_reduction.
>>>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>>>> (predicate_arbitrary_phi): New function.
>>>>>>>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion.
>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>>>>>>>>>> predication to build mask.
>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop 
>>>>>>>>>>>> versioning
>>>>>>>>>>>> for innermost loop marked with pragma omp simd.
>>>>>>>>>>>>
>>>>>>>>>>>> 2014-08-01 13:40 GMT+04:00 Richard Biener 
>>>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev 
>>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We implemented additional support for pragma omp simd in part of
>>>>>>>>>>>>>> extended if-conversion loops with such pragma. These extensions
>>>>>>>>>>>>>> include:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. All extensions are performed only if considered loop or its 
>>>>>>>>>>>>>> outer
>>>>>>>>>>>>>>    loop was marked with pragma omp simd (force_vectorize); For 
>>>>>>>>>>>>>> ordinary
>>>>>>>>>>>>>>    loops behavior was not changed.
>>>>>>>>>>>>>> 2. Took off cfg restriction on basic block which can have more 
>>>>>>>>>>>>>> than 2
>>>>>>>>>>>>>>    predecessors.
>>>>>>>>>>>>>> 3. Put additional restriction on phi nodes which was missed in 
>>>>>>>>>>>>>> current design:
>>>>>>>>>>>>>>    all phi nodes must be in non-predicated basic block to conform
>>>>>>>>>>>>>>    semantic of COND_EXPR which is used for transformation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> How is that so?  If the PHI is predicated then its result will be 
>>>>>>>>>>>>> used
>>>>>>>>>>>>> in a PHI node again and thus we'd create a sequence of COND_EXPRs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> No?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 4. Extend predication of phi nodes: phi may have more than 2 
>>>>>>>>>>>>>> arguments
>>>>>>>>>>>>>> with some limitations:
>>>>>>>>>>>>>>    - for phi nodes which have more than 2 arguments, but only two
>>>>>>>>>>>>>>    arguments are different and one of them has the only 
>>>>>>>>>>>>>> occurence,
>>>>>>>>>>>>>> transformation to  single COND_EXPR can be done.
>>>>>>>>>>>>>>    - if phi node has more different arguments and all edge 
>>>>>>>>>>>>>> predicates
>>>>>>>>>>>>>>    correspondent to phi-arguments are disjoint, a chain of 
>>>>>>>>>>>>>> COND_EXPR
>>>>>>>>>>>>>>    will be generated for it. In current design very simple check 
>>>>>>>>>>>>>> is used:
>>>>>>>>>>>>>>    check starting from end that two edges correspondent to 
>>>>>>>>>>>>>> neighbor
>>>>>>>>>>>>>> arguments have common predecessor which is used for further check
>>>>>>>>>>>>>> with next edge.
>>>>>>>>>>>>>>  These guarantee that phi predication will produce the correct 
>>>>>>>>>>>>>> result.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Btw, you can think of these extensions as unfactoring a PHI node 
>>>>>>>>>>>>> by
>>>>>>>>>>>>> inserting forwarder blocks.  Thus
>>>>>>>>>>>>>
>>>>>>>>>>>>>    x = PHI <1(2), 1(3), 2(4)>
>>>>>>>>>>>>>
>>>>>>>>>>>>> becomes
>>>>>>>>>>>>>
>>>>>>>>>>>>>   bb 5: <forwarder-from(2)-and(3)>
>>>>>>>>>>>>>
>>>>>>>>>>>>>   x = PHI <1(5), 2(4)>
>>>>>>>>>>>>>
>>>>>>>>>>>>> and
>>>>>>>>>>>>>
>>>>>>>>>>>>>   x = PHI <1(2), 2(3), 3(4)>
>>>>>>>>>>>>>
>>>>>>>>>>>>> becomes
>>>>>>>>>>>>>
>>>>>>>>>>>>>   bb 5:
>>>>>>>>>>>>>   x' = PHI <1(2), 2(3)>
>>>>>>>>>>>>>
>>>>>>>>>>>>>   b = PHI<x'(5), 3(4)>
>>>>>>>>>>>>>
>>>>>>>>>>>>> which means that 3) has to work.  Note that we want this kind of
>>>>>>>>>>>>> PHI transforms for out-of-SSA as well to reduce the number of
>>>>>>>>>>>>> copies we need to insert on edges.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thus it would be nice if you implemented 4) in terms of a pre-pass
>>>>>>>>>>>>> over the force_vect loops PHI nodes, applying that CFG transform.
>>>>>>>>>>>>> And make 3) work properly if it doesn't already.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks like you introduce a "negate predicate" to work around 
>>>>>>>>>>>>> the
>>>>>>>>>>>>> critical edge limitation?  Please instead change if-conversion to
>>>>>>>>>>>>> work with edge predicates (as opposed to BB predicates).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is example of such extended predication (compile with 
>>>>>>>>>>>>>> -march=core-avx2):
>>>>>>>>>>>>>> #pragma omp simd safelen(8)
>>>>>>>>>>>>>>   for (i=0; i<512; i++)
>>>>>>>>>>>>>>   {
>>>>>>>>>>>>>>     float t = a[i];
>>>>>>>>>>>>>>     if (t > 0 & t < 1.0e+17f)
>>>>>>>>>>>>>>       if (c[i] != 0)
>>>>>>>>>>>>>> res += 1;
>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>   <bb 4>:
>>>>>>>>>>>>>>   # res_15 = PHI <res_1(5), 0(3)>
>>>>>>>>>>>>>>   # i_16 = PHI <i_11(5), 0(3)>
>>>>>>>>>>>>>>   # ivtmp_17 = PHI <ivtmp_14(5), 512(3)>
>>>>>>>>>>>>>>   t_5 = a[i_16];
>>>>>>>>>>>>>>   _6 = t_5 > 0.0;
>>>>>>>>>>>>>>   _7 = t_5 < 9.9999998430674944e+16;
>>>>>>>>>>>>>>   _8 = _7 & _6;
>>>>>>>>>>>>>>   _ifc__28 = (unsigned int) _8;
>>>>>>>>>>>>>>   _10 = &c[i_16];
>>>>>>>>>>>>>>   _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0;
>>>>>>>>>>>>>>   _9 = MASK_LOAD (_10, 0B, _ifc__36);
>>>>>>>>>>>>>>   _ifc__29 = _ifc__28 != 0 ? 1 : 0;
>>>>>>>>>>>>>>   _ifc__30 = (int) _ifc__29;
>>>>>>>>>>>>>>   _ifc__31 = _9 != 0 ? _ifc__30 : 0;
>>>>>>>>>>>>>>   _ifc__32 = _ifc__28 != 0 ? 1 : 0;
>>>>>>>>>>>>>>   _ifc__33 = (int) _ifc__32;
>>>>>>>>>>>>>>   _ifc__34 = _9 == 0 ? _ifc__33 : 0;
>>>>>>>>>>>>>>   _ifc__35 = _ifc__31 != 0 ? 1 : 0;
>>>>>>>>>>>>>>   res_1 = res_15 + _ifc__35;
>>>>>>>>>>>>>>   i_11 = i_16 + 1;
>>>>>>>>>>>>>>   ivtmp_14 = ivtmp_17 - 1;
>>>>>>>>>>>>>>   if (ivtmp_14 != 0)
>>>>>>>>>>>>>>     goto <bb 4>;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> gcc/ChageLog
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2014-06-25  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * tree-if-conv.c (flag_force_vectorize): New variable.
>>>>>>>>>>>>>> (struct bb_predicate_s): Add negate_predicate field.
>>>>>>>>>>>>>> (bb_negate_predicate): New function.
>>>>>>>>>>>>>> (set_bb_negate_predicate): New function.
>>>>>>>>>>>>>> (bb_copy_predicate): New function.
>>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate 
>>>>>>>>>>>>>> field.
>>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>>>>>>>>>>>> (convert_name_to_cmp): New function.
>>>>>>>>>>>>>> (get_type_for_cond): New function.
>>>>>>>>>>>>>> (convert_bool_predicate): New function.
>>>>>>>>>>>>>> (predicate_disjunction): New function.
>>>>>>>>>>>>>> (predicate_conjunction): New function.
>>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument.
>>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is 
>>>>>>>>>>>>>> true.
>>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>>>>>>>>>>>> Add early function exit if edge target block is always executed.
>>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is 
>>>>>>>>>>>>>> true.
>>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list.
>>>>>>>>>>>>>> (equal_phi_args): New function.
>>>>>>>>>>>>>> (phi_has_two_different_args): New function.
>>>>>>>>>>>>>> (phi_args_disjoint): New function.
>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>>>>>>>>>>> for loops marked with pragma omp simd. Add check that phi nodes 
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>> in non-predicated basic blocks.
>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize.
>>>>>>>>>>>>>> (all_edges_are_critical): New function.
>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>>>>>>>>>>>>> flag_force_vectorize was setup. Use call of 
>>>>>>>>>>>>>> all_edges_are_critical
>>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only 
>>>>>>>>>>>>>> if
>>>>>>>>>>>>>> flag_force_vectorize was not setup.
>>>>>>>>>>>>>> (walk_cond_tree): New function.
>>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function.
>>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument that is used to 
>>>>>>>>>>>>>> transform
>>>>>>>>>>>>>> comparison expressions of boolean type into conditional 
>>>>>>>>>>>>>> expressions
>>>>>>>>>>>>>> with integral operands. If bool_conv argument is false or both
>>>>>>>>>>>>>> outgoing edges are not critical old algorithm of predicate 
>>>>>>>>>>>>>> assignments
>>>>>>>>>>>>>> is used, otherwise the following code was added: check on 
>>>>>>>>>>>>>> applicable
>>>>>>>>>>>>>> of vect-bool-pattern recognition and trnasformation of
>>>>>>>>>>>>>> (bool) x != 0  --> y = (int) x; x != 0;
>>>>>>>>>>>>>> compute predicates for both outgoing edges one of which is 
>>>>>>>>>>>>>> critical
>>>>>>>>>>>>>> one using 'normal' edge, i.e. compute true and false predicates 
>>>>>>>>>>>>>> using
>>>>>>>>>>>>>> normal outgoing edge only; evaluated predicates are stored in
>>>>>>>>>>>>>> predicate and negate_predicate fields of struct bb_predicate_s 
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> negate_predicate of normal edge conatins predicate of critical 
>>>>>>>>>>>>>> edge,
>>>>>>>>>>>>>> but generated gimplified statements are stored in their 
>>>>>>>>>>>>>> destination
>>>>>>>>>>>>>> block fields. Additional argument 'convert_bool" is passed to
>>>>>>>>>>>>>> add_to_dst_predicate_list and add_to_predicate_list.
>>>>>>>>>>>>>> (if_convertible_loop_p_1): Call predicate_bbs with additional 
>>>>>>>>>>>>>> argument
>>>>>>>>>>>>>> equal to false.
>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of
>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of 
>>>>>>>>>>>>>> phi-block
>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original
>>>>>>>>>>>>>> algorithm is used.
>>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which 
>>>>>>>>>>>>>> signals that
>>>>>>>>>>>>>> both phi arguments must be evaluated through 
>>>>>>>>>>>>>> phi_has_two_different_args.
>>>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp if 
>>>>>>>>>>>>>> cond
>>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of 
>>>>>>>>>>>>>> is_cond_scalar_reduction.
>>>>>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>>>>>> (predicate_phi_disjoint_args): New function.
>>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion.
>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>>>>>>>>>>> predicates at the block begining for extended if-conversion.
>>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>>>>>>>>>>>> predication to build mask.
>>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>>>>>>>>>>>> (split_crit_edge): New function.
>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from 
>>>>>>>>>>>>>> current
>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare). Invoke
>>>>>>>>>>>>>> split_crit_edge for extended predication. Do loop versioning for
>>>>>>>>>>>>>> innermost loop marked with pragma omp simd.

Reply via email to