On Fri, Sep 18, 2015 at 5:37 PM, Jeff Law <l...@redhat.com> wrote:
> On 09/18/2015 03:13 AM, Richard Biener wrote:
>>
>> On Thu, Sep 17, 2015 at 5:58 PM, Simon Dardis <simon.dar...@imgtec.com>
>> wrote:
>>>
>>> I've since taken another look at this recently and I've tracked the issue
>>> down to
>>> tree-predcom.c, specifically ref_at_iteration almost always generating
>>> MEM_REFs.
>>> With MEM_REFs, GCC's RTL GCSE cannot compare them as equal and hence
>>> remove them. A previous version of the code did generate ARRAY_REFs
>>> (pre 204458), but that was changed to generate MEM_REFs for pr/58653.
>>
>>
>> Ok, so the issue is the full redundancy in
>>
>>    <bb 5>:
>>    # count_28 = PHI <0(4), count_23(7)>
>>    if (pretmp_42 > 1.0e+0)
>>      goto <bb 6>;
>>    else
>>      goto <bb 9>;
>>
>>    <bb 6>:
>>    _8 = 1.0e+0 / pretmp_42;
>>    _12 = _8 * _8;
>>    poly[1] = _12;
>>
>>    <bb 7>:
>>    # prephitmp_30 = PHI <_12(6), _36(9)>
>>    # T_lsm.8_22 = PHI <_8(6), pretmp_42(9)>
>>    poly_I_lsm0.10_38 = MEM[(double *)&poly + 8B];
>> ...
>>    if (count_23 != iterations_6(D))
>>      goto <bb 5>;
>>    else
>>      goto <bb 8>;
>>
>>    <bb 9>:
>>    _36 = pretmp_42 * pretmp_42;
>>    poly[1] = _36;
>>    goto <bb 7>;
>>
>> ?
>>
>> On x86_64 I have only one load from poly[1] left in cprop1, the one
>> using the MEM_REF.  It's
>>
>> (insn 31 30 32 6 (set (reg:DF 93 [ poly_I_lsm0.10 ])
>>          (mem/c:DF (const:DI (plus:DI (symbol_ref:DI ("poly") [flags
>> 0x2]  <var_decl 0x7f6eab532e10 poly>)
>>                      (const_int 8 [0x8]))) [1 MEM[(double *)&poly +
>> 8B]+0 S8 A64])) 126 {*movdf_internal}
>>       (nil))
>>
>> vs.
>>
>> (insn 67 65 5 8 (set (mem/c:DF (const:DI (plus:DI (symbol_ref:DI
>> ("poly") [flags 0x2]  <var_decl 0x7f6eab532e10 poly>)
>>                      (const_int 8 [0x8]))) [1 poly+8 S8 A64])
>>          (reg:DF 90)) t.c:22 126 {*movdf_internal}
>>
>> for the partial redundancy across the backedge.  I don't see why GCSE
>> should even care for the MEM_EXPR
>> here!
>>
>>> Would something like:
>>> --- a/gcc/tree-predcom.c
>>> +++ b/gcc/tree-predcom.c
>>> @@ -1409,7 +1409,21 @@ ref_at_iteration (data_reference_p dr, int iter,
>>> gimple_seq *stmts)
>>>                               addr, alias_ptr),
>>>                       DECL_SIZE (field), bitsize_zero_node);
>>>       }
>>> -  else
>>> +  /* Generate an ARRAY_REF for array references when all details are
>>> INTEGER_CST
>>> +     rather than a MEM_REF so that CSE passes can potientially optimize
>>> them.  */
>>> +  else if (TREE_CODE (DR_REF (dr)) == ARRAY_REF
>>> +          && TREE_CODE (DR_STEP (dr)) == INTEGER_CST
>>> +          && TREE_CODE (DR_INIT (dr)) == INTEGER_CST
>>> +          && TREE_CODE (DR_OFFSET (dr)) == INTEGER_CST)
>>> +    {
>>> +       /* Reverse engineer the element from memory offset.  */
>>> +       tree offset = size_binop (MINUS_EXPR, coff, off);
>>> +       tree sizdiv = TYPE_SIZE (TREE_TYPE (TREE_TYPE (DR_BASE_OBJECT
>>> (dr))));
>>> +       sizdiv = div_if_zero_remainder (EXACT_DIV_EXPR, sizdiv, ssize_int
>>> (BITS_PER_UNIT));
>>> +       tree element = div_if_zero_remainder (EXACT_DIV_EXPR, offset,
>>> sizdiv);
>>> +       if (element != NULL_TREE)
>>> +         return build4 (ARRAY_REF, TREE_TYPE (DR_REF (dr)),
>>> DR_BASE_OBJECT (dr),
>>> +                        element, NULL_TREE, NULL_TREE);
>>> +    }
>>>       return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)), addr,
>>> alias_ptr);
>>>
>>> be an appropriate start to fixing this? That fix appears to work in in my
>>> testing.
>>
>>
>> No, that seems to be the wrong fix and in the light of wrong-code
>> issues we had with the middle-end
>> generating ARRAY_REFs even more so.
>>
>> What we can do and what would maybe help this case is in
>> ref_at_iteration detect the DR_STEP == 0 case
>> and simply return unshare_expr (DR_REF (dr)) in that case.  Well, iff
>> you don't want to explore the
>> GCSE issue...
>>
>> Index: gcc/tree-predcom.c
>> ===================================================================
>> --- gcc/tree-predcom.c  (revision 227899)
>> +++ gcc/tree-predcom.c  (working copy)
>> @@ -1386,6 +1386,10 @@ replace_ref_with (gimple stmt, tree new_
>>   static tree
>>   ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
>>   {
>> +  /* If the reference doesn't change just return it.  */
>> +  if (integer_zerop (DR_STEP (dr)))
>> +    return unshare_expr (DR_REF (dr));
>> +
>>     tree off = DR_OFFSET (dr);
>>     tree coff = DR_INIT (dr);
>>     if (iter == 0)
>>
>> hmm, no, then we end up with
>>
>>    # prephitmp_30 = PHI <_12(6), _36(9)>
>>    # T_lsm.8_22 = PHI <_8(6), pretmp_42(9)>
>>    poly_I_lsm0.10_38 = poly[1];
>>    _24 = MEM[(double *)&poly + 8B];
>>
>> beause DOM has a similar issue preventing it from CSEing those two.
>
> Presumably you're referring to the last two statements?  If so, yes, I
> wouldn't expect DOM to handle that.  I don't think its hasher is prepared to
> try and hash those two RHS expressions to the same value.

Yeah.  At some point I thought about re-using the SCCVN IL for references in DOM
but I never got to do that...

Richard.

> jeff
>

Reply via email to