On Fri, Nov 24, 2017 at 12:53 PM, Martin Jambor <mjam...@suse.cz> wrote:
> Hi Richi,
>
> On Fri, Nov 24 2017, Richard Biener wrote:
>> On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener
>
> ..
>
>>>>>> And yes, I've been worried about SRA as well here...  it _does_
>>>>>> have some early outs when seeing VIEW_CONVERT_EXPR but
>>>>>> appearantly not for the above.  Testcase that aborts with SRA but
>>>>>> not without:
>>>>>>
>>>>>> struct A { short s; long i; long j; };
>>>>>> struct A a, b;
>>>>>> void foo ()
>>>>>> {
>>>>>>   struct A c;
>>>>>>   __builtin_memcpy (&c, &b, sizeof (struct A));
>>>>>>   __builtin_memcpy (&a, &c, sizeof (struct A));
>>>>>> }
>>>>>> int main()
>>>>>> {
>>>>>>   __builtin_memset (&b, 0, sizeof (struct A));
>>>>>>   b.s = 1;
>>>>>>   __builtin_memcpy ((char *)&b+2, &b, 2);
>>>>>>   foo ();
>>>>>>   __builtin_memcpy (&a, (char *)&a+2, 2);
>>>>>>   if (a.s != 1)
>>>>>>     __builtin_abort ();
>>>>>>   return 0;
>>>>>> }
>>>>>
>>>>> Thanks for the testcase, I agree that is a fairly big problem.  Do you
>>>>> think that the following (untested) patch is an appropriate way of
>>>>> fixing it and generally of extending gimple to capture that a statement
>>>>> is a bit-copy?
>>>>
>>>> I think the place to fix is the memcpy folding.  That is, we'd say that
>>>> aggregate assignments are not bit-copies but do element-wise assignments.
>>>> For memcpy folding we'd then need to use a type that doesn't contain
>>>> padding.  Which effectively means char[].
>>>>
>>>> Of course we need to stop SRA from decomposing that copy to
>>>> individual characters then ;)
>>>>
>>>> So iff we decide that all aggregate copies are element copies,
>>>> maybe only those where TYPE_MAIN_VARIANT of lhs and rhs match
>>>> (currently we allow TYPE_CANONICAL compatibility and thus there
>>>> might be some mismatches), then we have to fix nothign but
>>>> the memcpy folding.
>>>>
>>>>> If so, I'll add the testcase, bootstrap it and formally propose it.
>>>>> Subsequently I will of course make sure that any element-wise copying
>>>>> patch would test the predicate.
>>>>
>>>> I don't think the alias-set should determine whether a copy is
>>>> bit-wise or not.
>>>
>>> Like the attached.  At least FAILs
>>>
>>> FAIL: gcc.dg/tree-ssa/ssa-ccp-27.c scan-tree-dump-times ccp1
>>> "memcpy[^\n]*123456" 2 (found 0 times)
>>>
>>> not sure why we have this test.
>>
>> Hum.  And SRA still decomposes the copy to struct elements w/o padding
>> even though the access is done using char[].  So somehow it ignores
>> VIEW_CONVERT_EXPRs
>> (well, those implicitely present on MEM_REFs).
>
> Yes.  SRA is not even too afraid of top-level V_C_Es.  It really bails
> out only if they are buried under a handled_component.  And it does not
> remove aggregate assignments containing them.
>
>>
>> Looks like this is because total scalarization is done on the decls
>> and not at all
>> honoring how the variable is accessed?  The following seems to fix
>> that, otherwise untested.
>
>
>>
>> Index: gcc/tree-sra.c
>> ===================================================================
>> --- gcc/tree-sra.c      (revision 255137)
>> +++ gcc/tree-sra.c      (working copy)
>> @@ -1338,7 +1338,9 @@ build_accesses_from_assign (gimple *stmt
>>      {
>>        racc->grp_assignment_read = 1;
>>        if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
>> -         && !is_gimple_reg_type (racc->type))
>> +         && !is_gimple_reg_type (racc->type)
>> +         && (TYPE_MAIN_VARIANT (racc->type)
>> +             == TYPE_MAIN_VARIANT (TREE_TYPE (racc->base))))
>>         bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
>>        if (storage_order_barrier_p (lhs))
>>         racc->grp_unscalarizable_region = 1;
>
> I believe that the added condition is not what you want, this seems
> to trigger also for ordinary:
>
>   s1 = s2.field
>
> Where racc->type is type of the field but racc->base is s2 and its type
> is type of the structure.

Yes.  But do we want to totally scalarize s2 in this case?  We only
access parts of it.  We don't seem to have a testcase that fails
(well, full testing still in progress).

> I also think you want to be setting a bit in
> cannot_scalarize_away_bitmap in order to guarantee that total
> scalarization will not happen for the given candidate.  Otherwise some
> other regular assignment might trigger it ...

Yeah, figured that out myself.

> except if we then also
> checked the statement for bit-copying types in sra_modify_assign (in the
> condition after the big comment), which I suppose is actually the
> correct thing to do.

But modification is too late, no?

> Thanks a lot for the folding patch, I can take over the SRA bits if you
> want to.

For reference below is the full patch.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok for the SRA parts?

Thanks,
Richard.

2017-11-24  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/83141
        * gimple-fold.c (gimple_fold_builtin_memory_op): Simplify
        aggregate copy generation by always using a unsigned char[]
        type to perform the copying.
        * tree-sra.c (build_accesses_from_assign): Disqualify
        accesses in non-native type for total scalarization.

        * gcc.dg/torture/pr83141.c: New testcase.
        * gcc.dg/tree-ssa/ssa-ccp-27.c: Adjust.

Attachment: p3-2
Description: Binary data

Reply via email to