On Tue, May 1, 2012 at 8:36 AM, Ramana Radhakrishnan
<[email protected]> wrote:
> Sorry about the delayed response, I've been away for some time.
>
>>
>> I don't exactly understand why the general transform is not advisable.
>> We already synthesize min/max operations.
>
>
>>
>> Can you elaborate on why you think that better code might be generated
>> when not doing this transform?
>
> The reason I wasn't happy was the code we ended up generating on ARM
> in this case - comparing the simple examples showed the difference
> below. While I'm pretty sure I can massage the backend with splitters
> to generate the right form here, I probably didn't realize that when
> I wrote the mail. Given this, I wonder if it is worth doing this
> transformation generally, in a fold-type operation, rather than
> restricting ourselves to invariant operands?
Yes, I think doing this generally would be beneficial. Possible places to
hook this up are tree-ssa-forwprop.c if you have
tem1 = i < x;
tem2 = i < y;
tem3 = tem1 && tem2;
if (tem3)
or tree-ssa-ifcombine.c if you instead see
if (i < x)
if (i < y)
...
Richard.
>
> The canonical example is as below:
>
>
> #define min(x, y) ((x) < (y)) ? (x) : (y)
> int foo (int i, int x ,int y)
> {
> // return ( i < x) && (i < y);
> return i < (min (x, y));
> }
>
>
> Case with min_expr:
>
> cmp r2, r1 @ 8 *arm_smin_insn/1 [length = 8]
> movge r2, r1
> cmp r2, r0 @ 23 *arm_cmpsi_insn/3 [length = 4]
> movle r0, #0 @ 24 *p *arm_movsi_insn/2 [length = 4]
> movgt r0, #1 @ 25 *p *arm_movsi_insn/2 [length = 4]
> bx lr @ 28 *arm_return [length = 12]
>
>
> This could instead be:
>
> cmp r2, r0
> cmpge r1, r0
> movle r0, #0
> movgt r0, #1
> bx lr
>
> Case without min_expr:
>
> cmp r0, r2 @ 28 *cmp_and/6 [length = 8]
> cmplt r0, r1
> movge r0, #0 @ 29 *mov_scc [length = 8]
> movlt r0, #1
> bx lr @ 32 *arm_return [length = 12]
>
>
>
>>
>>> #define min(x,y) ((x) <= (y) ? (x) : (y))
>>>
>>> void foo (int x, int y, int * a, int * b, int *c)
>>> {
>>> int i;
>>>
>>> for (i = 0;
>>> i < x && i < y;
>>> /* i < min (x, y); */
>>> i++)
>>> a[i] = b[i] * c[i];
>>>
>>> }
>>>
>>> The patch below deals with this case and I'm guessing that it could
>>> also handle more of the comparison cases and come up with more
>>> intelligent choices and should be made quite a lot more robust than
>>> what it is right now.
>>
>> Yes. At least if you have i < 5 && i < y we canonicalize it to
>> i <= 4 && i < y, so your pattern matching would fail.
>
> Of course, considering overflow semantics, you could transform this to
> i < min (x + 1, y) where the original condition was i <= x && i < y.
>
> Thinking about it, it's probably right to state that
>
> i op1 X && i op2 Y => i op min (X1, Y1)
>
> (with max rather than min for the greater-than directions) when op1
> and op2 are identical, or according to the table below:
>
> op1  op2  combined condition
> <    <=   i < min (X, Y + 1)
> <=   <    i < min (X + 1, Y)
> >    >=   i > max (X, Y - 1)
> >=   >    i > max (X - 1, Y)
>
>
> Other than being careful about overflow semantics (X + 1 or Y - 1 may
> wrap), the mixed-operator rows of the table are probably worth looking at -
>
>>
>> Btw, the canonical case this happens in is probably
>>
>> for (i = 0; i < n; ++i)
>> for (j = 0; j < m && j < i; ++j)
>> a[i][j] = ...
>>
>> thus iterating over the lower/upper triangular part of a non-square matrix
>> (including or not including the diagonal, thus also j < m && j <= i
>
> Ok thanks - fair enough.
>
>
> Ramana
>
>>
>> Richard.
>>
>>> regards,
>>> Ramana
>>>
>>>
>>>
>>> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
>>> index ce5eb20..a529536 100644
>>> --- a/gcc/tree-ssa-loop-im.c
>>> +++ b/gcc/tree-ssa-loop-im.c
>>> @@ -563,6 +563,7 @@ stmt_cost (gimple stmt)
>>>
>>> switch (gimple_assign_rhs_code (stmt))
>>> {
>>> + case MIN_EXPR:
>>> case MULT_EXPR:
>>> case WIDEN_MULT_EXPR:
>>> case WIDEN_MULT_PLUS_EXPR:
>>> @@ -971,6 +972,124 @@ rewrite_reciprocal (gimple_stmt_iterator *bsi)
>>> return stmt1;
>>> }
>>>
>>> +/* We look for a sequence of the form:
>>> + def_stmt1 : x = a < b
>>> + def_stmt2 : y = a < c
>>> + stmt : z = x & y
>>> + use_stmt_cond : if (z != 0)
>>> +
>>> + where b and c are loop invariant.
>>> +
>>> + In that case we might as well replace it with:
>>> +
>>> + t = min (b, c)
>>> + if (a < t)  */
>>> +
>>> +static gimple
>>> +rewrite_min_test (gimple_stmt_iterator *bsi)
>>> +{
>>> + gimple stmt, def_stmt_x, def_stmt_y, use_stmt_cond, stmt1;
>>> + tree x, y, z, a, b, c, var, t, name;
>>> + use_operand_p use;
>>> + bool is_lhs_of_comparison = false;
>>> +
>>> + stmt = gsi_stmt (*bsi);
>>> + z = gimple_assign_lhs (stmt);
>>> +
>>> + /* We start by looking at whether x is used in the
>>> + right set of conditions. */
>>> + if (TREE_CODE (z) != SSA_NAME
>>> + || !single_imm_use (z, &use, &use_stmt_cond)
>>> + || gimple_code (use_stmt_cond) != GIMPLE_COND)
>>> + return stmt;
>>> +
>>> + x = gimple_assign_rhs1 (stmt);
>>> + y = gimple_assign_rhs2 (stmt);
>>> +
>>> + if (TREE_CODE (x) != SSA_NAME
>>> + || TREE_CODE (y) != SSA_NAME)
>>> + return stmt;
>>> +
>>> + def_stmt_x = SSA_NAME_DEF_STMT (x);
>>> + def_stmt_y = SSA_NAME_DEF_STMT (y);
>>> +
>>> + /* def_stmt_x and def_stmt_y should be of the
>>> + form
>>> +
>>> + x = a cmp b
>>> + y = a cmp c
>>> +
>>> + or
>>> +
>>> + x = b cmp a
>>> + y = c cmp a
>>> + */
>>> + if (!is_gimple_assign (def_stmt_x)
>>> + || !is_gimple_assign (def_stmt_y)
>>> + || (gimple_assign_rhs_code (def_stmt_x)
>>> + != gimple_assign_rhs_code (def_stmt_y)))
>>> + return stmt;
>>> +
>>> + if (gimple_assign_rhs1 (def_stmt_x) == gimple_assign_rhs1 (def_stmt_y)
>>> + && (gimple_assign_rhs_code (def_stmt_x) == LT_EXPR
>>> + || gimple_assign_rhs_code (def_stmt_x) == LE_EXPR))
>>> + {
>>> + a = gimple_assign_rhs1 (def_stmt_x);
>>> + b = gimple_assign_rhs2 (def_stmt_x);
>>> + c = gimple_assign_rhs2 (def_stmt_y);
>>> + is_lhs_of_comparison = true;
>>> + }
>>> + else
>>> + {
>>> + if (gimple_assign_rhs2 (def_stmt_x) == gimple_assign_rhs2 (def_stmt_y)
>>> + && (gimple_assign_rhs_code (def_stmt_x) == GT_EXPR
>>> + || gimple_assign_rhs_code (def_stmt_x) == GE_EXPR))
>>> + {
>>> + a = gimple_assign_rhs2 (def_stmt_x);
>>> + b = gimple_assign_rhs1 (def_stmt_x);
>>> + c = gimple_assign_rhs1 (def_stmt_y);
>>> + }
>>> + else
>>> + return stmt;
>>> + }
>>> +
>>> + if (outermost_invariant_loop (b, loop_containing_stmt (def_stmt_x)) != NULL
>>> + && outermost_invariant_loop (c, loop_containing_stmt (def_stmt_y)) != NULL)
>>> +
>>> + {
>>> + if (dump_file)
>>> + fprintf (dump_file, "Found a potential transformation to min\n");
>>> +
>>> + /* mintmp = min (b , c). */
>>> +
>>> + var = create_tmp_var (TREE_TYPE (b), "mintmp");
>>> + add_referenced_var (var);
>>> + t = fold_build2 (MIN_EXPR, TREE_TYPE (b), b, c);
>>> + stmt1 = gimple_build_assign (var, t);
>>> + name = make_ssa_name (var, stmt1);
>>> + gimple_assign_set_lhs (stmt1, name);
>>> + gimple_cond_set_code (use_stmt_cond, gimple_assign_rhs_code (def_stmt_x));
>>> +
>>> +
>>> + if (is_lhs_of_comparison)
>>> + {
>>> + gimple_cond_set_lhs (use_stmt_cond, a);
>>> + gimple_cond_set_rhs (use_stmt_cond, name);
>>> + }
>>> + else
>>> + {
>>> + gimple_cond_set_lhs (use_stmt_cond, name);
>>> + gimple_cond_set_rhs (use_stmt_cond, a);
>>> + }
>>> + update_stmt (use_stmt_cond);
>>> + gsi_insert_before (bsi, stmt1, GSI_NEW_STMT);
>>> + return stmt1;
>>> + }
>>> +
>>> + return stmt;
>>> +}
>>> +
>>> /* Check if the pattern at *BSI is a bittest of the form
>>> (A >> B) & 1 != 0 and in this case rewrite it to A & (1 << B) != 0. */
>>>
>>> @@ -1171,6 +1290,11 @@ determine_invariantness_stmt (struct dom_walk_data *dw_data ATTRIBUTE_UNUSED,
>>> && TREE_CODE (op0) == SSA_NAME
>>> && has_single_use (op0))
>>> stmt = rewrite_bittest (&bsi);
>>> + else if (pos == MOVE_POSSIBLE
>>> + && gimple_assign_rhs_code (stmt) == BIT_AND_EXPR)
>>> + stmt = rewrite_min_test (&bsi);
>>> +
>>> }
>>>
>>> lim_data = init_lim_data (stmt);