On Mon, 29 Jun 2015, Richard Biener wrote:

> On Fri, 26 Jun 2015, Jeff Law wrote:
> 
> > On 06/26/2015 03:24 AM, Richard Biener wrote:
> > > On Thu, 25 Jun 2015, Richard Biener wrote:
> > > 
> > > > 
> > > > This moves fold_sign_changed_comparison.  Shows up in gcc.dg/pr55833.c
> > > > 
> > > > I'll eventually massage it according to Jakubs suggestion to do a
> > > > 
> > > > #ifndef HAVE_canonicalize_funcptr_for_compare
> > > > #define HAVE_canonicalize_funcptr_for_compare 0
> > > > #endif
> > > > 
> > > > somewhere (defaults.h should work I guess).
> > > > 
> > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > > 
> > > This runs into
> > > 
> > > Running target unix//-m32
> > > FAIL: g++.dg/tree-ssa/calloc.C  -std=gnu++11  scan-tree-dump-not optimized
> > > "malloc"
> > > 
> > > where we now optimize
> > > 
> > >     n_5 = (size_t) n_4(D);
> > > ...
> > >     <bb 5>:
> > >     -  if (n_5 != 0)
> > >     +  if (n_4(D) != 0)
> > > 
> > > but both VRP and DOM fail to record equivalences for n_5 from
> > > the updated condition (I have a patch to fix VRP but not DOM).
> > So you want an equivalence recorded for n_5, even though it no longer 
> > appears
> > in the conditional (but presumably has other uses)?
> > 
> > DOM has some code for this already, but it's kind-of backwards from what
> > you're trying to do in terms of when it's used.  That code might be 
> > factorable
> > and usable elsewhere in DOM.
> 
> Not sure I came along sth like that.
> 
> In principle the following works for the testcase (even w/o fixing
> the VRP part).
> 
> Index: gcc/tree-ssa-dom.c
> ===================================================================
> --- gcc/tree-ssa-dom.c  (revision 225007)
> +++ gcc/tree-ssa-dom.c  (working copy)
> @@ -1409,6 +1409,14 @@ simplify_stmt_for_jump_threading (gimple
>    return lookup_avail_expr (stmt, false);
>  }
>  
> +static tree
> +dom_valueize (tree t)
> +{
> +  if (TREE_CODE (t) == SSA_NAME)
> +    return SSA_NAME_VALUE (t);
> +  return t;
> +}
> +
>  /* Record into the equivalence tables any equivalences implied by
>     traversing edge E (which are cached in E->aux).
>  
> @@ -1429,7 +1437,33 @@ record_temporary_equivalences (edge e)
>  
>        /* If we have a simple NAME = VALUE equivalence, record it.  */
>        if (lhs && TREE_CODE (lhs) == SSA_NAME)
> -       const_and_copies->record_const_or_copy (lhs, rhs);
> +       {
> +         gimple use_stmt;
> +         imm_use_iterator iter;
> +         const_and_copies->record_const_or_copy (lhs, rhs);
> +         FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
> +           {
> +             /* Only bother to record more equivalences for lhs that
> +                can be directly used by e->dest.
> +                ???  If the code gets re-organized to a worklist to
> +                catch more indirect opportunities and it is made to
> +                handle PHIs then this should only consider use_stmts
> +                in basic-blocks we have already visited.  */
> +             if (!dominated_by_p (CDI_DOMINATORS,
> +                                  e->dest, gimple_bb (use_stmt)))
> +               continue;
> +             tree lhs = gimple_get_lhs (use_stmt);
> +             if (lhs && TREE_CODE (lhs) == SSA_NAME)
> +               {
> +                 tree res = gimple_fold_stmt_to_constant_1 (use_stmt,
> +                                                            dom_valueize,
> +                                                            
> no_follow_ssa_edges);
> +                 if (TREE_CODE (res) == SSA_NAME
> +                     || is_gimple_min_invariant (res))
> +                   const_and_copies->record_const_or_copy (lhs, res);
> +               }
> +           }
> +       }
>  
>        /* If we have 0 = COND or 1 = COND equivalences, record them
>          into our expression hash tables.  */
> 
> 
> it's not using DOMs own stmt visiting machinery as that always modifies
> stmts in-place.  As stated in the comment it doesn't catch secondary
> opportunities.  That would be possible by using a work-list seeded
> by LHS we recorded new const/copies for and re-visiting their uses.
> You can get extra fancy here by properly handling PHIs and
> conditionals.  But it's a question of cost here, of course.
> 
> Note that I think this isn't really "backward propagation" but
> just context sensitive value-numbering.
> 
> Like VRP restricts what it inserts asserts for stmt re-visiting could
> also be constrained on availability of uses dominated by the edge
> providing the original equivalency.
> 
> So in some sense the above is a hack (similar to all the special-cases
> VRP has for similar situations).  Under what constraints do you
> think sth like the above is ok to put on trunk?

Ok, the above isn't the correct place (seems to be used from the
threading machinery only), but record_equivalences_from_incoming_edge is
and that is where the special-case you mention is which handles
widening converts but not sign-changes.  And yes, it's the wrong-way
around, handling

  i = (T) j;
  if (i == 0)
    ... use of j

instead of

  i = (T) j;
  if (j == 0)
    ... use of i

the calloc.C testcase is helped by handling that on the jump-threading
path but it looks backward to do that only there but not in
record_equivalences_from_incoming_edge.  Doin this in
record_equivalences_from_incoming_edge for example simplifies

int foo (int i)
{
  unsigned j = i;
  if (i == 0)
    return j + 1;
  return 0;
}

as expected.

Richard.

> Thanks,
> Richard.
> 
> > > I think we're simply missing a pass that can "properly" deal
> > > with this kind of stuff.  For example DOM could re-visit
> > > stmts which have uses of SSA names we added temporary
> > > equivalences for (ones that dominate the current block,
> > > others we'll (re-)visit anyway).  That would fix this testcase
> > > but to catch secondary effects you'd need to re-visit uses of
> > > the defs of the revisited stmts as well (if anything interesting
> > > was produced, of course).
> > This problem feels a bit like it's better handled by an optimizer 
> > independent
> > of DOM.  Probably a backwards IL walking context sensitive optimizer.  I 
> > want
> > to do something like that for threading, but haven't much pondered other use
> > cases for that kind of structure.
> > 
> > If you could create a BZ and include the patches you're playing with and a
> > reference to the failing test (calloc.C for -m32), it'd be appreciated.
> > 
> > Jeff
> > 
> > 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)

Reply via email to