On Mon, 29 Oct 2012, Jan Hubicka wrote:

> > > ICK ...
> > > 
> > > Why not sth as simple as
> > > 
> > >      return num_ssa_operands (stmt, SSA_OP_USE);
> > > 
> > > ?  a[1][2] and b[2] really have the same cost, variable length
> > > objects have extra SSA operands in ARRAY_REF/COMPONENT_REF for
> > > the size.  Thus, stmt cost somehow should reflect the number
> > > of dependent stmts.
> > > 
> > > So in estimate_num_insns I'd try
> > > 
> > > int
> > > estimate_num_insns (gimple stmt, eni_weights *weights)
> > > {
> > >   unsigned cost, i;
> > >   enum gimple_code code = gimple_code (stmt);
> > >   tree lhs;
> > >   tree rhs;
> > > 
> > >   switch (code)
> > >   {
> > >    case GIMPLE_ASSIGN:
> > >     /* Initialize with the number of SSA uses, one is free.  */
> > >     cost = num_ssa_operands (stmt, SSA_OP_USE);
> > >     if (cost > 1)
> > >       --cost;
> > 
> > Hi,
> > this is the updated patch I am testing after today's discussion.  I decided
> > to drop the ipa-inline-analysis changes and do that incrementally.  So the
> > patch now trashes tramp3d performance by increasing the need for
> > early-inlining-insns, but that is not unexpected.
> > 
> > The patch also fixes the accounting of ADDR_EXPR, which previously got
> > confused with a load.
> > 
> > Does this seem sane?
> > 
> >     * tree-inline.c (estimate_operator_cost): Return 1 for non-trivial
> >     ADDR_EXPR operations.
> >     (estimate_num_insns): Do not confuse general single rhs with
> >     loads; account cost of non-trivial addr_expr for ASSIGN_EXPR,
> >     GIMPLE_RETURN and GIMPLE_ASM.
> 
> Hi,
> this patch actually does not cause that much tramp3d fuzz, and there are no
> differences in the testsuite due to the unrolling changes.
> 
> The changes are in the constants used when accounting loads and stores.
> A typical store has two SSA uses (one for the address and one for the value
> to store).  Of course we lose the distinction between an array offset and a
> pointer dereference.
> 
> A typical load/address expression has one SSA use (for the address).
> 
> Bootstrapped/regtested x86_64-linux, OK?

ChangeLog?

> Honza
> 
> Index: tree-inline.c
> ===================================================================
> --- tree-inline.c     (revision 192945)
> +++ tree-inline.c     (working copy)
> @@ -3447,6 +3447,19 @@ estimate_operator_cost (enum tree_code c
>        if (TREE_CODE (op2) != INTEGER_CST)
>          return weights->div_mod_cost;
>        return 1;
> +    case ADDR_EXPR:
> +      { 
> +        tree addr_base;
> +        HOST_WIDE_INT addr_offset;
> +
> +     addr_base = get_addr_base_and_unit_offset (TREE_OPERAND (op1, 0),
> +                                                &addr_offset);
> +     /* If the offset is variable or with non-zero offset, return 2.  */
> +     if (!addr_base || addr_offset != 0 || TREE_CODE (addr_base) != MEM_REF
> +         || !integer_zerop (TREE_OPERAND (addr_base, 1)))
> +       return 1;

The comment doesn't match the code.  Why the TREE_CODE (addr_base) != 
MEM_REF check?  If it isn't a MEM_REF then it is a plain decl, thus
no dereference.  So it's not clear what you want here ...?

It looks like you want to pessimize variable addresses here,
like &a.a[i]?  Before this, all ADDR_EXPR cost was zero.

I'd say you want simply

        if (!addr_base || addr_offset != 0)
          return 1;

no?  Or even

        if (!addr_base
            || (addr_offset != 0 && TREE_CODE (addr_base) == MEM_REF))
          return 1;

that keeps &decl + CST as cost 0.  Or even

        if (!addr_base)
          return 1;

that even keeps ptr + CST as cost 0.  Both because they are likely
combined with some complex addressing mode later.

> +      }
> +      return 0;

Inside the } please.

>  
>      default:
>        /* We expect a copy assignment with no operator.  */
> @@ -3512,14 +3525,24 @@ estimate_num_insns (gimple stmt, eni_wei
>        lhs = gimple_assign_lhs (stmt);
>        rhs = gimple_assign_rhs1 (stmt);
>  
> -      if (is_gimple_reg (lhs))
> -     cost = 0;
> -      else
> +      /* Store.  */
> +      if (gimple_vdef (stmt))
>       cost = estimate_move_cost (TREE_TYPE (lhs));
> +      else
> +     cost = 0;

That change seems bogus.  A !is_gimple_reg lhs is always a store.

>  
> -      if (!is_gimple_reg (rhs) && !is_gimple_min_invariant (rhs))
> +      /* Load.  */
> +      if (gimple_vuse (stmt))
>       cost += estimate_move_cost (TREE_TYPE (rhs));

Likewise.  If rhs1 is not a register or a constant this is a load.
All stores also have a VUSE so you are always accounting a store
as aggregate copy this way.

>  
> +      /* Stores, loads and address expressions may have variable array
> +      references in them. Account these.  */
> +      if (gimple_vdef (stmt))
> +     cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 2);
> +      else if (gimple_vuse (stmt)
> +               || gimple_assign_rhs_code (stmt) == ADDR_EXPR)
> +     cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
> +

Rather than doing this here in this awkward and bogus way (see above)
why not do it in estimate_operator_cost?  You added ADDR_EXPR already,
so simply add DECLs and handled-components.  Then you can do
better than estimating from num_ssa_operands of the stmt by simply
walking the handled-components looking for SSA uses in the
relevant positions.

>        cost += estimate_operator_cost (gimple_assign_rhs_code (stmt), weights,
>                                             gimple_assign_rhs1 (stmt),
>                                     get_gimple_rhs_class 
> (gimple_assign_rhs_code (stmt))
> @@ -3596,6 +3619,7 @@ estimate_num_insns (gimple stmt, eni_wei
>  
>      case GIMPLE_RETURN:
>        return weights->return_cost;
> +          + MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);

Well ... why not estimate_operator_cost?

>      case GIMPLE_GOTO:
>      case GIMPLE_LABEL:
> @@ -3606,7 +3630,10 @@ estimate_num_insns (gimple stmt, eni_wei
>        return 0;
>  
>      case GIMPLE_ASM:
> -      return asm_str_count (gimple_asm_string (stmt));
> +      return (asm_str_count (gimple_asm_string (stmt))
> +           /* Account also possible non-trivial addr_exprs
> +              in the arguments.  */
> +           + num_ssa_operands (stmt, SSA_OP_USE));

The comment is misleading, you are counting all simple register
operands as well.  If you want to improve ASM cost then walk
over the inputs/outputs and use estimate_operator_cost again.

Richard.
