On Fri, Jun 16, 2017 at 6:10 PM, Bill Schmidt
<[email protected]> wrote:
> Hi,
>
> PR71815 identifies a situation where SLSR misses opportunities for
> PHI candidates when code hoisting is enabled (which is now on by
> default). The basic problem is that SLSR currently uses an overly
> simple test for profitability of the transformation. The algorithm
> currently requires that the PHI basis (through which the non-local
> SLSR candidate is propagated) has only one use, which is the
> candidate statement. The true requirement for profitability is
> that, if the candidate statement will be dead after transformation,
> then so will the PHI candidate.
>
> This patch fixes the problem by looking at the transitive reachability
> of the PHI definitions. If all paths terminate in the candidate
> statement, then we know the PHI basis will go dead and we will not
> make the code worse with the planned replacement. To avoid compile
> time issues, path search is arbitrarily terminated at depth 10. The
> new test is used throughout the cost calculation, so appears multiple
> times in the code.
>
> Also, I've added a check to avoid replacing multiply candidates with
> a stride of 1. Such a candidate is really a copy or cast statement,
> and if we replace it, we will just generate a different copy or cast
> statement. I noticed this with one of the test cases from the PR
> while debugging the problem.
>
> I've updated the two test cases that were previously enabled only
> with -fno-code-hoisting, removing that restriction.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions. I've also tested this with SPEC cpu2006 and the
> patch is performance neutral on a POWER8 box (as expected). Is
> this ok for trunk?
>
> Thanks,
> Bill
>
>
> [gcc]
>
> 2016-06-16 Bill Schmidt <[email protected]>
>
> * gimple-ssa-strength-reduction.c (uses_consumed_by_stmt): New
> function.
> (find_basis_for_candidate): Call uses_consumed_by_stmt rather than
> has_single_use.
> (slsr_process_phi): Likewise.
> (replace_uncond_cands_and_profitable_phis): Don't replace a
> multiply candidate with a stride of 1 (copy or cast).
> (phi_incr_cost): Call uses_consumed_by_stmt rather than
> has_single_use.
> (lowest_cost_path): Likewise.
> (total_savings): Likewise.
>
> [gcc/testsuite]
>
> 2016-06-16 Bill Schmidt <[email protected]>
>
> * gcc.dg/tree-ssa/slsr-35.c: Remove -fno-code-hoisting workaround.
> * gcc.dg/tree-ssa/slsr-36.c: Likewise.
>
>
> Index: gcc/gimple-ssa-strength-reduction.c
> ===================================================================
> --- gcc/gimple-ssa-strength-reduction.c (revision 239241)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -475,6 +475,48 @@ find_phi_def (tree base)
> return c->cand_num;
> }
>
> +/* Determine whether all uses of NAME are directly or indirectly
> + used by STMT. That is, we want to know whether if STMT goes
> + dead, the definition of NAME also goes dead. */
> +static bool
> +uses_consumed_by_stmt (tree name, gimple *stmt, unsigned recurse)
use a default arg 'unsigned recurse = 0' to hide this implementation
detail at users.
> +{
> + gimple *use_stmt;
> + imm_use_iterator iter;
> + bool retval = true;
> +
> + FOR_EACH_IMM_USE_STMT (use_stmt, iter, name)
> + {
> + if (use_stmt == stmt || is_gimple_debug (use_stmt))
> + continue;
> +
> + if (!is_gimple_assign (use_stmt))
> + {
> + retval = false;
> + BREAK_FROM_IMM_USE_STMT (iter);
> + }
> +
> + /* Limit recursion. */
> + if (recurse >= 10)
> + {
> + retval = false;
> + BREAK_FROM_IMM_USE_STMT (iter);
> + }
Put this limit right before the recursion.
> + tree next_name = gimple_get_lhs (use_stmt);
> + if (!next_name || !is_gimple_reg (next_name))
> + {
> + retval = false;
> + BREAK_FROM_IMM_USE_STMT (iter);
> + }
> +
> + if (uses_consumed_by_stmt (next_name, stmt, recurse + 1))
> + continue;
So this doesn't change dependent on the result which means you likely meant
if (! uses....)
{
retval = false;
BREAK...
}
which possibly also invalidates your testing?
The whole thing is probably easier to optimize if you merge the ifs
that break into one.
Richard.
> + }
> +
> + return retval;
> +}
> +
> /* Helper routine for find_basis_for_candidate. May be called twice:
> once for the candidate's base expr, and optionally again either for
> the candidate's phi definition or for a CAND_REF's alternative base
> @@ -550,7 +592,8 @@ find_basis_for_candidate (slsr_cand_t c)
>
> /* If we found a hidden basis, estimate additional dead-code
> savings if the phi and its feeding statements can be removed. */
> - if (basis && has_single_use (gimple_phi_result
> (phi_cand->cand_stmt)))
> + tree feeding_var = gimple_phi_result (phi_cand->cand_stmt);
> + if (basis && uses_consumed_by_stmt (feeding_var, c->cand_stmt, 0))
> c->dead_savings += phi_cand->dead_savings;
> }
> }
> @@ -777,7 +820,7 @@ slsr_process_phi (gphi *phi, bool speed)
>
> /* Gather potential dead code savings if the phi statement
> can be removed later on. */
> - if (has_single_use (arg))
> + if (uses_consumed_by_stmt (arg, phi, 0))
> {
> if (gimple_code (arg_stmt) == GIMPLE_PHI)
> savings += arg_cand->dead_savings;
> @@ -2384,7 +2427,9 @@ replace_uncond_cands_and_profitable_phis (slsr_can
> {
> if (phi_dependent_cand_p (c))
> {
> - if (c->kind == CAND_MULT)
> + /* A multiply candidate with a stride of 1 is just an artifice
> + of a copy or cast; there is no value in replacing it. */
> + if (c->kind == CAND_MULT && wi::to_widest (c->stride) != 1)
> {
> /* A candidate dependent upon a phi will replace a multiply by
> a constant with an add, and will insert at most one add for
> @@ -2630,8 +2675,9 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in
> if (gimple_code (arg_def) == GIMPLE_PHI)
> {
> int feeding_savings = 0;
> + tree feeding_var = gimple_phi_result (arg_def);
> cost += phi_incr_cost (c, incr, arg_def, &feeding_savings);
> - if (has_single_use (gimple_phi_result (arg_def)))
> + if (uses_consumed_by_stmt (feeding_var, phi, 0))
> *savings += feeding_savings;
> }
> else
> @@ -2644,7 +2690,7 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in
> tree basis_lhs = gimple_assign_lhs (basis->cand_stmt);
> tree lhs = gimple_assign_lhs (arg_cand->cand_stmt);
> cost += add_cost (true, TYPE_MODE (TREE_TYPE (basis_lhs)));
> - if (has_single_use (lhs))
> + if (uses_consumed_by_stmt (lhs, phi, 0))
> *savings += stmt_cost (arg_cand->cand_stmt, true);
> }
> }
> @@ -2721,7 +2767,7 @@ lowest_cost_path (int cost_in, int repl_savings, s
> gimple *phi = lookup_cand (c->def_phi)->cand_stmt;
> local_cost += phi_incr_cost (c, incr, phi, &savings);
>
> - if (has_single_use (gimple_phi_result (phi)))
> + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0))
> local_cost -= savings;
> }
>
> @@ -2765,7 +2811,7 @@ total_savings (int repl_savings, slsr_cand_t c, co
> gimple *phi = lookup_cand (c->def_phi)->cand_stmt;
> savings -= phi_incr_cost (c, incr, phi, &phi_savings);
>
> - if (has_single_use (gimple_phi_result (phi)))
> + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0))
> savings += phi_savings;
> }
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (revision 239241)
> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (working copy)
> @@ -3,7 +3,7 @@
> phi has an argument which is a parameter. */
>
> /* { dg-do compile } */
> -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
>
> int
> f (int c, int i)
> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (revision 239241)
> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (working copy)
> @@ -3,7 +3,7 @@
> phi has an argument which is a parameter. */
>
> /* { dg-do compile } */
> -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
>
> int
> f (int s, int c, int i)
>