On Fri, Sep 18, 2015 at 5:37 PM, Jeff Law <l...@redhat.com> wrote: > On 09/18/2015 03:13 AM, Richard Biener wrote: >> >> On Thu, Sep 17, 2015 at 5:58 PM, Simon Dardis <simon.dar...@imgtec.com> >> wrote: >>> >>> I've since taken another look at this recently and I've tracked the issue >>> down to >>> tree-predcom.c, specifically ref_at_iteration almost always generating >>> MEM_REFs. >>> With MEM_REFs, GCC's RTL GCSE cannot compare them as equal and hence >>> remove them. A previous version of the code did generate ARRAY_REFs >>> (pre 204458), but that was changed to generate MEM_REFs for pr/58653. >> >> >> Ok, so the issue is the full redundancy in >> >> <bb 5>: >> # count_28 = PHI <0(4), count_23(7)> >> if (pretmp_42 > 1.0e+0) >> goto <bb 6>; >> else >> goto <bb 9>; >> >> <bb 6>: >> _8 = 1.0e+0 / pretmp_42; >> _12 = _8 * _8; >> poly[1] = _12; >> >> <bb 7>: >> # prephitmp_30 = PHI <_12(6), _36(9)> >> # T_lsm.8_22 = PHI <_8(6), pretmp_42(9)> >> poly_I_lsm0.10_38 = MEM[(double *)&poly + 8B]; >> ... >> if (count_23 != iterations_6(D)) >> goto <bb 5>; >> else >> goto <bb 8>; >> >> <bb 9>: >> _36 = pretmp_42 * pretmp_42; >> poly[1] = _36; >> goto <bb 7>; >> >> ? >> >> On x86_64 I have only one load from poly[1] left in cprop1, the one >> using the MEM_REF. It's >> >> (insn 31 30 32 6 (set (reg:DF 93 [ poly_I_lsm0.10 ]) >> (mem/c:DF (const:DI (plus:DI (symbol_ref:DI ("poly") [flags >> 0x2] <var_decl 0x7f6eab532e10 poly>) >> (const_int 8 [0x8]))) [1 MEM[(double *)&poly + >> 8B]+0 S8 A64])) 126 {*movdf_internal} >> (nil)) >> >> vs. >> >> (insn 67 65 5 8 (set (mem/c:DF (const:DI (plus:DI (symbol_ref:DI >> ("poly") [flags 0x2] <var_decl 0x7f6eab532e10 poly>) >> (const_int 8 [0x8]))) [1 poly+8 S8 A64]) >> (reg:DF 90)) t.c:22 126 {*movdf_internal} >> >> for the partial redundancy across the backedge. I don't see why GCSE >> should even care for the MEM_EXPR >> here! >> >>> Would something like: >>> --- a/gcc/tree-predcom.c >>> +++ b/gcc/tree-predcom.c >>> @@ -1409,7 +1409,21 @@ ref_at_iteration (data_reference_p dr, int iter, >>> gimple_seq *stmts) >>> addr, alias_ptr), >>> DECL_SIZE (field), bitsize_zero_node); >>> } >>> - else >>> + /* Generate an ARRAY_REF for array references when all details are >>> INTEGER_CST >>> + rather than a MEM_REF so that CSE passes can potientially optimize >>> them. */ >>> + else if (TREE_CODE (DR_REF (dr)) == ARRAY_REF >>> + && TREE_CODE (DR_STEP (dr)) == INTEGER_CST >>> + && TREE_CODE (DR_INIT (dr)) == INTEGER_CST >>> + && TREE_CODE (DR_OFFSET (dr)) == INTEGER_CST) >>> + { >>> + /* Reverse engineer the element from memory offset. */ >>> + tree offset = size_binop (MINUS_EXPR, coff, off); >>> + tree sizdiv = TYPE_SIZE (TREE_TYPE (TREE_TYPE (DR_BASE_OBJECT >>> (dr)))); >>> + sizdiv = div_if_zero_remainder (EXACT_DIV_EXPR, sizdiv, ssize_int >>> (BITS_PER_UNIT)); >>> + tree element = div_if_zero_remainder (EXACT_DIV_EXPR, offset, >>> sizdiv); >>> + if (element != NULL_TREE) >>> + return build4 (ARRAY_REF, TREE_TYPE (DR_REF (dr)), >>> DR_BASE_OBJECT (dr), >>> + element, NULL_TREE, NULL_TREE); >>> + } >>> return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)), addr, >>> alias_ptr); >>> >>> be an appropriate start to fixing this? That fix appears to work in in my >>> testing. >> >> >> No, that seems to be the wrong fix and in the light of wrong-code >> issues we had with the middle-end >> generating ARRAY_REFs even more so. >> >> What we can do and what would maybe help this case is in >> ref_at_iteration detect the DR_STEP == 0 case >> and simply return unshare_expr (DR_REF (dr)) in that case. Well, iff >> you don't want to explore the >> GCSE issue... >> >> Index: gcc/tree-predcom.c >> =================================================================== >> --- gcc/tree-predcom.c (revision 227899) >> +++ gcc/tree-predcom.c (working copy) >> @@ -1386,6 +1386,10 @@ replace_ref_with (gimple stmt, tree new_ >> static tree >> ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts) >> { >> + /* If the reference doesn't change just return it. */ >> + if (integer_zerop (DR_STEP (dr))) >> + return unshare_expr (DR_REF (dr)); >> + >> tree off = DR_OFFSET (dr); >> tree coff = DR_INIT (dr); >> if (iter == 0) >> >> hmm, no, then we end up with >> >> # prephitmp_30 = PHI <_12(6), _36(9)> >> # T_lsm.8_22 = PHI <_8(6), pretmp_42(9)> >> poly_I_lsm0.10_38 = poly[1]; >> _24 = MEM[(double *)&poly + 8B]; >> >> beause DOM has a similar issue preventing it from CSEing those two. > > Presumably you're referring to the last two statements? If so, yes, I > wouldn't expect DOM to handle that. I don't think its hasher is prepared to > try and hash those two RHS expressions to the same value.
Yeah. At some point I thought about re-using the SCCVN IL for references in DOM but I never got to do that... Richard. > jeff >