On Mon, 29 Oct 2012, Jan Hubicka wrote:

> > > ICK ...
> > >
> > > Why not sth as simple as
> > >
> > >   return num_ssa_operands (stmt, SSA_OP_USE);
> > >
> > > ?  a[1][2] and b[2] really have the same cost, variable-length
> > > objects have extra SSA operands in ARRAY_REF/COMPONENT_REF for
> > > the size.  Thus, stmt cost should somehow reflect the number
> > > of dependent stmts.
> > >
> > > So in estimate_num_insns I'd try
> > >
> > > int
> > > estimate_num_insns (gimple stmt, eni_weights *weights)
> > > {
> > >   unsigned cost, i;
> > >   enum gimple_code code = gimple_code (stmt);
> > >   tree lhs;
> > >   tree rhs;
> > >
> > >   switch (code)
> > >     {
> > >     case GIMPLE_ASSIGN:
> > >       /* Initialize with the number of SSA uses, one is free.  */
> > >       cost = num_ssa_operands (stmt, SSA_OP_USE);
> > >       if (cost > 1)
> > >         --cost;
> >
> > Hi,
> > this is the updated patch I am testing after today's discussion.  I decided
> > to drop the ipa-inline-analysis changes and do that incrementally.  So the
> > patch now trashes tramp3d performance by increasing the need for
> > early-inlining-insns, but that is not unexpected.
> >
> > The patch also fixes the accounting of ADDR_EXPR, which previously got
> > confused with a load.
> >
> > Does this seem sane?
> >
> > 	* tree-inline.c (estimate_operator_cost): Return 1 for non-trivial
> > 	ADDR_EXPR operations.
> > 	(estimate_num_insns): Do not confuse general single rhs with
> > 	loads; account cost of non-trivial addr_expr for ASSIGN_EXPR,
> > 	GIMPLE_RETURN and GIMPLE_ASM.
>
> Hi,
> this patch actually does not cause that much tramp3d fuzz, and there are no
> differences in the testsuite due to unrolling changes.
>
> What changed are the constants used when accounting loads and stores.
> A typical store has two SSA uses (one for the address and one for the value
> to store).  Of course we lose the difference between an array offset and a
> pointer dereference.
>
> A typical load/address expression has one SSA use (for the address).
>
> Bootstrapped/regtested x86_64-linux, OK?
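(If I read those numbers right, on a gimplified example with made-up
SSA names:

  a[i_1] = x_2;		/* store: two SSA uses, i_1 and x_2 */
  x_3 = a[i_1];		/* load: one SSA use, i_1 */
  p_4 = &a[i_1];	/* address: one SSA use, i_1 */

so the subtracted constants below only start charging once extra
variable indices appear, as in a[i_1][j_2] = x_3.)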
ChangeLog?

> Honza
>
> Index: tree-inline.c
> ===================================================================
> --- tree-inline.c	(revision 192945)
> +++ tree-inline.c	(working copy)
> @@ -3447,6 +3447,19 @@ estimate_operator_cost (enum tree_code c
>        if (TREE_CODE (op2) != INTEGER_CST)
>          return weights->div_mod_cost;
>        return 1;
> +    case ADDR_EXPR:
> +      {
> +	tree addr_base;
> +	HOST_WIDE_INT addr_offset;
> +
> +	addr_base = get_addr_base_and_unit_offset (TREE_OPERAND (op1, 0),
> +						   &addr_offset);
> +	/* If the offset is variable or with non-zero offset, return 2.  */
> +	if (!addr_base || addr_offset != 0 || TREE_CODE (addr_base) != MEM_REF
> +	    || !integer_zerop (TREE_OPERAND (addr_base, 1)))
> +	  return 1;

The comment doesn't match the code.  Why the
TREE_CODE (addr_base) != MEM_REF check?  If it isn't a MEM_REF then it
is a plain decl, thus no dereference.  So it's not clear what you want
here ...?  It looks like you want to pessimize variable addresses here,
like &a.a[i]?  Before this, all ADDR_EXPR cost was zero.  I'd say you
want simply

  if (!addr_base || addr_offset != 0)
    return 1;

no?  Or even

  if (!addr_base
      || (addr_offset != 0 && TREE_CODE (addr_base) == MEM_REF))
    return 1;

which keeps &decl + CST as cost 0.  Or even

  if (!addr_base)
    return 1;

which even keeps ptr + CST as cost 0.  Both because they are likely
combined with some complex addressing mode later.

> +      }
> +      return 0;

Inside the } please.

>
>      default:
>        /* We expect a copy assignment with no operator.  */
> @@ -3512,14 +3525,24 @@ estimate_num_insns (gimple stmt, eni_wei
>        lhs = gimple_assign_lhs (stmt);
>        rhs = gimple_assign_rhs1 (stmt);
>
> -      if (is_gimple_reg (lhs))
> -	cost = 0;
> -      else
> +      /* Store.  */
> +      if (gimple_vdef (stmt))
>  	cost = estimate_move_cost (TREE_TYPE (lhs));
> +      else
> +	cost = 0;

That change seems bogus.  A !is_gimple_reg lhs is always a store.

>
> -      if (!is_gimple_reg (rhs) && !is_gimple_min_invariant (rhs))
> +      /* Load.  */
> +      if (gimple_vuse (stmt))
>  	cost += estimate_move_cost (TREE_TYPE (rhs));

Likewise.  If rhs1 is not a register or a constant then this is a load.
All stores also have a VUSE, so you are always accounting a store as an
aggregate copy this way.

>
> +      /* Stores, loads and address expressions may have variable array
> +	 references in them.  Account these.  */
> +      if (gimple_vdef (stmt))
> +	cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 2);
> +      else if (gimple_vuse (stmt)
> +	       || gimple_assign_rhs_code (stmt) == ADDR_EXPR)
> +	cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
> +

Rather than doing this here in this awkward and bogus way (see above),
why not do it in estimate_operator_cost?  You added ADDR_EXPR already,
so simply add DECLs and handled components.  Then you can do better
than estimating from num_ssa_operands of the stmt by simply walking the
handled components looking for SSA uses in the relevant positions, as
in the sketch below.

>        cost += estimate_operator_cost (gimple_assign_rhs_code (stmt), weights,
>  				      gimple_assign_rhs1 (stmt),
>  				      get_gimple_rhs_class
>  				        (gimple_assign_rhs_code (stmt))
> @@ -3596,6 +3619,7 @@ estimate_num_insns (gimple stmt, eni_wei
>
>      case GIMPLE_RETURN:
>        return weights->return_cost;
> +	     + MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);

Well ... why not estimate_operator_cost?
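Something along these lines, completely untested, with
estimate_ref_cost a made-up helper name and the choice of which
positions to count (variable ARRAY_REF indices only) being mine:

  /* Count SSA uses in the variable positions of a memory reference,
     so a[1][2] stays free while a[i_1][j_2] costs 2.  */
  static int
  estimate_ref_cost (tree ref)
  {
    int cost = 0;
    while (handled_component_p (ref))
      {
	if ((TREE_CODE (ref) == ARRAY_REF
	     || TREE_CODE (ref) == ARRAY_RANGE_REF)
	    && TREE_CODE (TREE_OPERAND (ref, 1)) == SSA_NAME)
	  cost++;
	ref = TREE_OPERAND (ref, 0);
      }
    /* The base itself (a decl or *ptr) is left to the move cost.  */
    return cost;
  }

With that, the GIMPLE_ASSIGN case could sum estimate_ref_cost (lhs)
and estimate_ref_cost (rhs) next to estimate_operator_cost, and the
GIMPLE_RETURN case could apply it to the return value instead of the
num_ssa_operands games.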
>      case GIMPLE_GOTO:
>      case GIMPLE_LABEL:
> @@ -3606,7 +3630,10 @@ estimate_num_insns (gimple stmt, eni_wei
>        return 0;
>
>      case GIMPLE_ASM:
> -      return asm_str_count (gimple_asm_string (stmt));
> +      return (asm_str_count (gimple_asm_string (stmt))
> +	      /* Account also possible non-trivial addr_exprs
> +		 in the arguments.  */
> +	      + num_ssa_operands (stmt, SSA_OP_USE));

The comment is misleading: you are counting all simple register
operands as well.  If you want to improve the ASM cost then walk over
the inputs/outputs and use estimate_operator_cost again.
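For example (again untested, reusing the hypothetical
estimate_ref_cost sketch from above rather than estimate_operator_cost
directly, since the asm operands are references, not rhs codes):

      case GIMPLE_ASM:
	{
	  int count = asm_str_count (gimple_asm_string (stmt));
	  unsigned i;
	  /* Only memory operands can carry variable array indices;
	     simple register inputs/outputs add nothing here.  */
	  for (i = 0; i < gimple_asm_noutputs (stmt); i++)
	    count
	      += estimate_ref_cost (TREE_VALUE (gimple_asm_output_op (stmt, i)));
	  for (i = 0; i < gimple_asm_ninputs (stmt); i++)
	    count
	      += estimate_ref_cost (TREE_VALUE (gimple_asm_input_op (stmt, i)));
	  return count;
	}

Richard.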