On Wed, Oct 31, 2012 at 8:16 PM, Easwaran Raman <[email protected]> wrote:
> On Wed, Oct 31, 2012 at 4:36 AM, Richard Biener
> <[email protected]> wrote:
>> On Wed, Oct 24, 2012 at 4:02 AM, Easwaran Raman <[email protected]> wrote:
>>> On Tue, Oct 23, 2012 at 2:52 AM, Richard Biener
>>> <[email protected]> wrote:
>>>> On Mon, Oct 22, 2012 at 8:31 PM, Easwaran Raman <[email protected]> wrote:
>>>>> On Mon, Oct 22, 2012 at 12:59 AM, Richard Biener
>>>>> <[email protected]> wrote:
>>>>>> On Fri, Oct 19, 2012 at 12:36 AM, Easwaran Raman <[email protected]>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> During expression reassociation, statements are conservatively moved
>>>>>>> downwards to ensure that dependences are correctly satisfied after
>>>>>>> reassociation. This could lead to lengthening of live ranges. This
>>>>>>> patch moves statements only to the extent necessary. Bootstraps with
>>>>>>> no test regressions on x86_64/linux. OK for trunk?
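>>>>>>>
>>>>>>> To illustrate (a made-up example, not from a test case): given
>>>>>>>
>>>>>>>   t1 = a + b
>>>>>>>   <many unrelated stmts>
>>>>>>>   t2 = t1 + c
>>>>>>>
>>>>>>> the current code sinks t1 = a + b all the way down to t2 even when c
>>>>>>> is defined above t1, extending the live ranges of a and b across the
>>>>>>> unrelated statements. With this patch, t1 is moved only as far as its
>>>>>>> new operands' definitions require.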
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Easwaran
>>>>>>>
>>>>>>> 2012-10-18 Easwaran Raman <[email protected]>
>>>>>>> * tree-ssa-reassoc.c (assign_uids): New function.
>>>>>>> (assign_uids_in_relevant_bbs): Likewise.
>>>>>>> (ensure_ops_are_available): Likewise.
>>>>>>> (rewrite_expr_tree): Do not move statements beyond what is
>>>>>>> necessary. Remove call to swap_ops_for_binary_stmt...
>>>>>>> (reassociate_bb): ... and move it here.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Index: gcc/tree-ssa-reassoc.c
>>>>>>> ===================================================================
>>>>>>> --- gcc/tree-ssa-reassoc.c (revision 192487)
>>>>>>> +++ gcc/tree-ssa-reassoc.c (working copy)
>>>>>>> @@ -2250,6 +2250,128 @@ swap_ops_for_binary_stmt (VEC(operand_entry_t, hea
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> +/* Assign UIDs to statements in basic block BB. */
>>>>>>> +
>>>>>>> +static void
>>>>>>> +assign_uids (basic_block bb)
>>>>>>> +{
>>>>>>> + unsigned uid = 0;
>>>>>>> + gimple_stmt_iterator gsi;
>>>>>>> + /* First assign uids to phis. */
>>>>>>> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>>>>> + {
>>>>>>> + gimple stmt = gsi_stmt (gsi);
>>>>>>> + gimple_set_uid (stmt, uid++);
>>>>>>> + }
>>>>>>> +
>>>>>>> + /* Then assign uids to stmts. */
>>>>>>> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>>>>> + {
>>>>>>> + gimple stmt = gsi_stmt (gsi);
>>>>>>> + gimple_set_uid (stmt, uid++);
>>>>>>> + }
>>>>>>> +}
>>>>>>> +
>>>>>>> +/* For each operand in OPS, find the basic block that contains the statement
>>>>>>> + which defines the operand. For all such basic blocks, assign UIDs. */
>>>>>>> +
>>>>>>> +static void
>>>>>>> +assign_uids_in_relevant_bbs (VEC(operand_entry_t, heap) * ops)
>>>>>>> +{
>>>>>>> + operand_entry_t oe;
>>>>>>> + int i;
>>>>>>> + struct pointer_set_t *seen_bbs = pointer_set_create ();
>>>>>>> +
>>>>>>> + for (i = 0; VEC_iterate (operand_entry_t, ops, i, oe); i++)
>>>>>>> + {
>>>>>>> + gimple def_stmt;
>>>>>>> + basic_block bb;
>>>>>>> + if (TREE_CODE (oe->op) != SSA_NAME)
>>>>>>> + continue;
>>>>>>> + def_stmt = SSA_NAME_DEF_STMT (oe->op);
>>>>>>> + bb = gimple_bb (def_stmt);
>>>>>>> + if (!pointer_set_contains (seen_bbs, bb))
>>>>>>> + {
>>>>>>> + assign_uids (bb);
>>>>>>> + pointer_set_insert (seen_bbs, bb);
>>>>>>> + }
>>>>>>> + }
>>>>>>> + pointer_set_destroy (seen_bbs);
>>>>>>> +}
>>>>>>
>>>>>> Please assign UIDs once using the existing renumber_gimple_stmt_uids ().
>>>>>> You seem to call the above multiple times and thus do more than
>>>>>> O(number of basic blocks) work.
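>>>>>>
>>>>>> I.e. sth like this once at pass entry (just a sketch):
>>>>>>
>>>>>>   /* Number all stmts in the function once, in O(function size).  */
>>>>>>   renumber_gimple_stmt_uids ();
>>>>>>
>>>>>> and then keep the numbering valid instead of recomputing it per bb.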
>>>>>
>>>>> The reason I call the above multiple times is that gsi_move_before
>>>>> might get called between two calls to the above. For instance, after
>>>>> rewrite_expr_tree is called once, the following sequence of calls
>>>>> could happen:
>>>>> reassociate_bb -> linearize_expr_tree -> linearize_expr ->
>>>>> gsi_move_before. So it is not sufficient to call
>>>>> renumber_gimple_stmt_uids once per do_reassoc. Or do you want me to
>>>>> use renumber_gimple_stmt_uids_in_blocks instead of
>>>>> assign_uids_in_relevant_bbs?
>>>>
>>>> It's sufficient to call it once if you conservatively update UIDs at stmt
>>>> move / insert time (assign the same UID as the stmt before/after).
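>>>>
>>>> Sth like this at each move site (a sketch, untested):
>>>>
>>>>   /* Give the moved stmt the UID of its new neighbour so UID
>>>>      comparisons within the block stay valid.  */
>>>>   gimple_set_uid (stmt, gimple_uid (gsi_stmt (gsi_insert)));
>>>>   gsi_move_before (&gsistmt, &gsi_insert);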
>>>
>>> I was worried that that would make the code brittle, but looking at
>>> the places where a new statement is added or moved, I think it is
>>> fine. I have made the change in the new patch.
>>
>> Thanks.
>>
>>>>>
>>>>>
>>>>>>> +/* Ensure that operands in the OPS vector starting from OPINDEXth entry are
>>>>>>> + live at STMT. This is accomplished by moving STMT if needed. */
>>>>>>> +
>>>>>>> +static void
>>>>>>> +ensure_ops_are_available (gimple stmt, VEC(operand_entry_t, heap) * ops, int opindex)
>>>>>>> +{
>>>>>>> + int i;
>>>>>>> + int len = VEC_length (operand_entry_t, ops);
>>>>>>> + gimple insert_stmt = stmt;
>>>>>>> + basic_block insert_bb = gimple_bb (stmt);
>>>>>>> + gimple_stmt_iterator gsi_insert, gsistmt;
>>>>>>> + for (i = opindex; i < len; i++)
>>>>>>> + {
>>>>>>
>>>>>> Likewise you call this for each call to rewrite_expr_tree, so it seems
>>>>>> to me this is quadratic in the number of ops in the op vector.
>>>>>
>>>>> The call to ensure_ops_are_available inside rewrite_expr_tree is
>>>>> guarded by if (!moved), and I set moved = true there to ensure that
>>>>> ensure_ops_are_available is called only once per reassociation of an
>>>>> expression tree.
>>>>
>>>> It would be a lot easier to understand this function if you call it once
>>>> after rewrite_expr_tree finished.
>>> I think you meant "before rewrite_expr_tree". One thing to note here
>>> is that this function is called only when we first see a statement whose
>>> operand changes after reassociation. That's the reason the moving of
>>> statements happens inside rewrite_expr_tree.
>>
>> I meant after because rewrite_expr_tree ends up swapping things. But
>> I note you changed that in your revised patch - good ;)
>>
>> So now that rewrite_expr_tree no longer changes the order of the ops
>> vector, we can indeed move stmts before it, but backwards. Thus,
>> instead of walking forward and
>>
>> + /* Ensure that STMT1 is moved to a place where all operands in
>> + OPS[opindex + i...] are available. */
>> + ensure_ops_are_available (stmt1, ops, opindex + i);
>>
>> we can walk backward and only need to consider the immediate next operand
>> because we know all further operands are available at the place the immediate
>> next operand is defined.
>>
>> Which should remove the need for the loop in ensure_ops_are_available
>>
>> + for (i = opindex; i < len; i++)
>> + {
>>
>> Correct? (We may need to special-case the last (two?) stmts).
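>>
>> In pseudo-code (a sketch only; place_after_def and chain_stmt are
>> made-up helpers, not existing functions):
>>
>>   /* Walk the stmt chain backward.  Each stmt need only be sunk below
>>      the def of its immediate next operand; all later operands are
>>      available at that point by induction.  */
>>   for (i = len - 2; i >= opindex; i--)
>>     place_after_def (chain_stmt (i),
>>                      VEC_index (operand_entry_t, ops, i + 1));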
>
> Sorry, I am not able to understand your proposal here. The confusion
> relates to the use of walking forward vs backward. In the current
> patch, I start from the last statement (root of the expression tree)
> and follow the chain of defs using SSA_NAME_DEF_STMT (which I consider
> as walking backward). Do you want to start from the leaf of the
> expression and follow the chain in the opposite direction till the
> stmt defining the root of the tree is reached? That would indeed avoid
> comparing each statement against all statements that define the ops
> array as it is sufficient to compare against the statements that
> define the RHS operands. If that is indeed what you are proposing, I
> tried that approach and ran into issues. Doing this would mean that
> the IR is in an inconsistent state during intermediate stages. For
> instance, transforming
>
> t1 = a + b
> t2 = t1 + c
> d = ...
> t3 = t2 + d
>
> into
>
> t1 = a + d
> t2 = t1 + c
> d = ...
> t3 = t2 + b
>
> would result in a state where t2 = t1 + c appears before t1 = a + b.
> I don't know if that is the root cause, but I noticed that doing so
> resulted in verify_ssa complaining about DEBUG statements whose
> operands appeared before their defs. If this is not what you mean by
> walking backwards, could you please explain with a simple example?
Yes, that's what I mean by walking backwards. DEBUG stmts need
to be reset when they are no longer valid. reassoc suffers from
many (existing) debug issues.
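
E.g. sth like this for a debug bind USE_STMT that went stale (a sketch;
where exactly to do it depends on which moves invalidate the bind):

  if (gimple_debug_bind_p (use_stmt))
    {
      /* Drop the now-invalid value, turning it into "optimized out".  */
      gimple_debug_bind_reset_value (use_stmt);
      update_stmt (use_stmt);
    }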
Richard.
> Thanks,
> Easwaran
>
>
>>>>
>>>>>>
>>>>>> Why make this all so complicated? It seems to me that we should
>>>>>> fix up stmt order only after the whole ops vector has been materialized.
>>>>>
>>>>>
>>>>>>
>>>>>>> + operand_entry_t oe = VEC_index (operand_entry_t, ops, i);
>>>>>>> + gimple def_stmt;
>>>>>>> + basic_block def_bb;
>>>>>>> + /* Ignore constants and operands with default definitions. */
>>>>>>> + if (TREE_CODE (oe->op) != SSA_NAME
>>>>>>> + || SSA_NAME_IS_DEFAULT_DEF (oe->op))
>>>>>>> + continue;
>>>>>>> + def_stmt = SSA_NAME_DEF_STMT (oe->op);
>>>>>>> + def_bb = gimple_bb (def_stmt);
>>>>>>> + if (def_bb != insert_bb
>>>>>>> + && !dominated_by_p (CDI_DOMINATORS, insert_bb, def_bb))
>>>>>>> + {
>>>>>>> + insert_bb = def_bb;
>>>>>>> + insert_stmt = def_stmt;
>>>>>>> + }
>>>>>>> + else if (def_bb == insert_bb
>>>>>>> + && gimple_uid (insert_stmt) < gimple_uid (def_stmt))
>>>>>>> + insert_stmt = def_stmt;
>>>>>>> + }
>>>>>>> + if (insert_stmt == stmt)
>>>>>>> + return;
>>>>>>> + gsistmt = gsi_for_stmt (stmt);
>>>>>>> + /* If INSERT_STMT is a phi node, then do not insert just after that
>>>>>>> + statement. Instead, find the first non-label gimple statement in BB
>>>>>>> + and insert before that. */
>>>>>>> + if (gimple_code (insert_stmt) == GIMPLE_PHI)
>>>>>>> + {
>>>>>>> + gsi_insert = gsi_after_labels (insert_bb);
>>>>>>> + gsi_move_before (&gsistmt, &gsi_insert);
>>>>>>> + }
>>>>>>> + /* Statements marked for throw cannot be in the middle of a basic block.
>>>>>>> + So we cannot insert a statement (not marked for throw) immediately
>>>>>>> + after. */
>>>>>>> + else if (lookup_stmt_eh_lp (insert_stmt) > 0
>>>>>>
>>>>>> that's already performed by stmt_can_throw_internal
>>>>> Ok. I was hit by the "statement marked for throw in middle of block"
>>>>> error in verify_gimple_in_cfg and simply copied the sequence of
>>>>> checks that led to that.
>>>>>
>>>>>>
>>>>>>> + && stmt_can_throw_internal (insert_stmt))
>>>>>>
>>>>>> But all this should be a non-issue as re-assoc should never assign an ops
>>>>>> vector entry for such stmts (but it could have leafs defined by such
>>>>>> stmts).
>>>>>> If you only ever move definitions down, like the present code does, caring
>>>>>> about leafs should not be necessary.
>>>>>
>>>>> I added this as I got an ICE in verify_gimple_in_cfg. So it seems like
>>>>> such statements get assigned into the ops vector.
>>>>
>>>> That would be a bug. See how linearize_expr_tree checks whether
>>>> the defining stmt of an op may throw. That is, it _does_ add the
>>>> SSA def to the vector but does not recurse on its defining stmt operands.
>>>
>>> But as long as the SSA def of a throwing statement is added to the ops
>>> vector, I need to add the check, even if it doesn't recurse on its
>>> operands. Consider the following
>>>
>>> <bb1>:
>>> t1 = a + b
>>> c = ... // c might throw
>>>
>>> <bb2>:
>>> t2 = t1 + c
>>>
>>> The expression could be rewritten as (a+c) + b.
>>>
>>> Since c's definition follows t1's, t1 = a+b has to be moved after c =
>>> ... That's why I need the check.
>>
>> Hmm, indeed.
>>
>> + /* There should be exactly one normal edge out of the basic block. */
>> + FOR_EACH_EDGE (e, ei, insert_bb->succs)
>> + {
>> + if (!(e->flags & EDGE_COMPLEX))
>> + {
>> + gcc_assert (succ_edge == NULL);
>> + succ_edge = e;
>> + }
>> + }
>>
>> I think we have sth like find_fallthru_edge for this.
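>>
>> I.e. sth like (assuming it fits the throw case here):
>>
>>   /* The single non-complex successor of a throwing block is its
>>      fallthru edge.  */
>>   edge succ_edge = find_fallthru_edge (insert_bb->succs);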
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Easwaran
>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> Easwaran
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> + {
>>>>>>> + edge e, succ_edge = NULL;
>>>>>>> + edge_iterator ei;
>>>>>>> +
>>>>>>> + /* There should be exactly one normal edge out of the basic block. */
>>>>>>> + FOR_EACH_EDGE (e, ei, insert_bb->succs)
>>>>>>> + {
>>>>>>> + if (!(e->flags & EDGE_COMPLEX))
>>>>>>> + {
>>>>>>> + gcc_assert (succ_edge == NULL);
>>>>>>> + succ_edge = e;
>>>>>>> + }
>>>>>>> + }
>>>>>>> + /* Insert STMT at the beginning of the successor basic block. */
>>>>>>> + insert_bb = succ_edge->dest;
>>>>>>> + gsi_insert = gsi_after_labels (insert_bb);
>>>>>>> + gsi_move_before (&gsistmt, &gsi_insert);
>>>>>>> + }
>>>>>>> + else
>>>>>>> + {
>>>>>>> + gsi_insert = gsi_for_stmt (insert_stmt);
>>>>>>> + gsi_move_after (&gsistmt, &gsi_insert);
>>>>>>> + }
>>>>>>> +
>>>>>>> +}
>>>>>>> +
>>>>>>> /* Recursively rewrite our linearized statements so that the operators
>>>>>>> match those in OPS[OPINDEX], putting the computation in rank
>>>>>>> order. */
>>>>>>> @@ -2262,11 +2384,6 @@ rewrite_expr_tree (gimple stmt, unsigned int opind
>>>>>>> tree rhs2 = gimple_assign_rhs2 (stmt);
>>>>>>> operand_entry_t oe;
>>>>>>>
>>>>>>> - /* If we have three operands left, then we want to make sure the ones
>>>>>>> - that get the double binary op are chosen wisely. */
>>>>>>> - if (opindex + 3 == VEC_length (operand_entry_t, ops))
>>>>>>> - swap_ops_for_binary_stmt (ops, opindex, stmt);
>>>>>>> -
>>>>>>> /* The final recursion case for this function is that you have
>>>>>>> exactly two operations left.
>>>>>>> If we had exactly one op in the entire list to start with, we
>>>>>>> @@ -2312,19 +2429,17 @@ rewrite_expr_tree (gimple stmt, unsigned int opind
>>>>>>> {
>>>>>>> if (!moved)
>>>>>>> {
>>>>>>> - gimple_stmt_iterator gsinow, gsirhs1;
>>>>>>> - gimple stmt1 = stmt, stmt2;
>>>>>>> - unsigned int count;
>>>>>>> + gimple stmt1 = stmt;
>>>>>>> + unsigned int count, i = 1;
>>>>>>>
>>>>>>> - gsinow = gsi_for_stmt (stmt);
>>>>>>> count = VEC_length (operand_entry_t, ops) - opindex - 2;
>>>>>>> - while (count-- != 0)
>>>>>>> + while (i <= count)
>>>>>>> {
>>>>>>> - stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt1));
>>>>>>> - gsirhs1 = gsi_for_stmt (stmt2);
>>>>>>> - gsi_move_before (&gsirhs1, &gsinow);
>>>>>>> - gsi_prev (&gsinow);
>>>>>>> - stmt1 = stmt2;
>>>>>>> + stmt1 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt1));
>>>>>>> + /* Ensure that STMT1 is moved to a place where all operands in
>>>>>>> + OPS[opindex + i...] are available. */
>>>>>>> + ensure_ops_are_available (stmt1, ops, opindex + i);
>>>>>>> + i++;
>>>>>>> }
>>>>>>> moved = true;
>>>>>>> }
>>>>>>> @@ -3542,8 +3657,18 @@ reassociate_bb (basic_block bb)
>>>>>>> && VEC_length (operand_entry_t, ops) > 3)
>>>>>>> rewrite_expr_tree_parallel (stmt, width, ops);
>>>>>>> else
>>>>>>> - rewrite_expr_tree (stmt, 0, ops, false);
>>>>>>> + {
>>>>>>> + /* When there are three operands left, we want
>>>>>>> + to make sure the ones that get the double
>>>>>>> + binary op are chosen wisely. */
>>>>>>> + int len = VEC_length (operand_entry_t, ops);
>>>>>>> + if (len >= 3)
>>>>>>> + swap_ops_for_binary_stmt (ops, len - 3, stmt);
>>>>>>> +
>>>>>>> + assign_uids_in_relevant_bbs (ops);
>>>>>>> + rewrite_expr_tree (stmt, 0, ops, false);
>>>>>>> + }
>>>>>>> +
>>>>>>> /* If we combined some repeated factors into a
>>>>>>> __builtin_powi call, multiply that result by the
>>>>>>> reassociated operands. */