On November 23, 2015 4:37:18 PM GMT+01:00, Tom de Vries <tom_devr...@mentor.com> wrote: >On 23/11/15 12:31, Richard Biener wrote: >>>> From the dump below I understand you want no memory references in >>>> > >the outer loop? >>>> > >So the issue seems to be that store motion fails >>>> > >to insert the preheader load / exit store to the outermost loop >>>> > >possible and thus another LIM pass is needed to "store motion" >those >>>> > >again? >>> > >>> >Yep. >>> > >>>> > > But a simple testcase >>>> > > >>>> > >int a; >>>> > >int *p = &a; >>>> > >int foo (int n) >>>> > >{ >>>> > > for (int i = 0; i < n; ++i) >>>> > > for (int j = 0; j < 100; ++j) >>>> > > *p += j + i; >>>> > > return a; >>>> > >} >>>> > > >>>> > >shows that LIM can do this in one step. >>> > >>> >I've filed a FTR PR68465 - "pass_lim doesn't detect identical loop >entry >>> >conditions" for a test-case where that doesn't happen (when using >>> >-fno-tree-dominator-opts). >>> > >>>> > >Which means it should >>>> > >be investigated why it doesn't do this properly for your >testcase >>>> > >(store motion of *_25). >>> > >>> >There seems to be two related problems: >>> >1. the store has tree_could_trap_p (ref->mem.ref) true, which >should be >>> > false. I'll work on a fix for this. >>> >2. Give that the store can trap, I was running into PR68465. I >managed >>> > to eliminate the 2nd pass_lim by moving the pass_dominator >instance >>> > before the pass_lim instance. >>> > >>> >Attached patch shows the pass group with only one pass_lim. I hope >to be able >>> >to eliminate the first pass_dominator instance before pass_lim once >I fix 1. >>> > >>>> > >Simply adding two LIM passes either papers over a wrong-code >>>> > >bug (in LIM or in DOM) or over a missed-optimization in LIM. >>> > >>> >AFAIU now, it's PR68465, a missed optimization in LIM. >> Ok, it's not really LIMs job to cleanup loop header copying that way. >> >> DOM performs jump-threading for this but FRE should also be able >> to handle this just fine. Ah, it doesn't because the outer loop >> header directly contains the condition >> >> Index: gcc/tree-ssa-sccvn.c >> =================================================================== >> --- gcc/tree-ssa-sccvn.c (revision 230737) >> +++ gcc/tree-ssa-sccvn.c (working copy) >> @@ -4357,20 +4402,32 @@ sccvn_dom_walker::before_dom_children (b >> >> /* If we have a single predecessor record the equivalence from a >> possible condition on the predecessor edge. */ >> - if (single_pred_p (bb)) >> + edge pred_e = NULL; >> + FOR_EACH_EDGE (e, ei, bb->preds) >> + { >> + if (e->flags & EDGE_DFS_BACK) >> + continue; >> + if (! pred_e) >> + pred_e = e; >> + else >> + { >> + pred_e = NULL; >> + break; >> + } >> + } >> + if (pred_e) >> { >> - edge e = single_pred_edge (bb); >> /* Check if there are multiple executable successor edges in >> the source block. Otherwise there is no additional info >> to be recorded. */ >> edge e2; >> - FOR_EACH_EDGE (e2, ei, e->src->succs) >> - if (e2 != e >> + FOR_EACH_EDGE (e2, ei, pred_e->src->succs) >> + if (e2 != pred_e >> && e2->flags & EDGE_EXECUTABLE) >> break; >> if (e2 && (e2->flags & EDGE_EXECUTABLE)) >> { >> - gimple *stmt = last_stmt (e->src); >> + gimple *stmt = last_stmt (pred_e->src); >> if (stmt >> && gimple_code (stmt) == GIMPLE_COND) >> { >> @@ -4378,11 +4435,11 @@ sccvn_dom_walker::before_dom_children (b >> tree lhs = gimple_cond_lhs (stmt); >> tree rhs = gimple_cond_rhs (stmt); >> record_conds (bb, code, lhs, rhs, >> - (e->flags & EDGE_TRUE_VALUE) != 0); >> + (pred_e->flags & EDGE_TRUE_VALUE) != 0); >> code = invert_tree_comparison (code, HONOR_NANS >(lhs)); >> if (code != ERROR_MARK) >> record_conds (bb, code, lhs, rhs, >> - (e->flags & EDGE_TRUE_VALUE) == 0); >> + (pred_e->flags & EDGE_TRUE_VALUE) == >0); >> } >> } >> } >> >> fixes this for me (for a small testcase). Does it help yours? >> > >Yes, it has the desired effect (of not needing pass_dominator before >pass_lim) . But, patch "Mark by_ref mem_ref in build_receiver_ref as >non-trapping" committed as r230738, also has that effect, so AFAIU I >don't require this tree-ssa-sccvn.c fix.
OK, I committed it anyway already. Richard. >Thanks, >- Tom > >> Otherwise untested of course (I hope EDGE_DFS_BACK is good enough, >> it's supposed to match edges that have the src dominated by the >dest). >> Testing the above now.