Re: Fix type compatibility for types with flexible array member [PR113688, PR114014, PR117724]

Richard Biener Mon, 25 Nov 2024 07:45:41 -0800

> Am 25.11.2024 um 16:34 schrieb Martin Uecker <uec...@tugraz.at>:
> 
> Am Montag, dem 25.11.2024 um 10:35 +0100 schrieb Richard Biener:
>>> On Mon, 25 Nov 2024, Richard Biener wrote:
>>> 
>>>> On Sat, 23 Nov 2024, Martin Uecker wrote:
>>> 
>>>> 
>>>> This patch tries fixes the errors we have because of
>>>> flexible array members.  I am bit unsure about the exception
>>>> for the mode.
>>>> 
>>>> Bootstrapped and regression tested on x86_64.
>>>> 
>>>> 
>>>> 
>>>>    Fix type compatibility for types with flexible array member 
>>>> [PR113688,PR114014,PR117724]
>>>> 
>>>>    verify_type checks the compatibility of TYPE_CANONICAL using
>>>>    gimple_canonical_types_compatible_p.   But it is stricter than what the
>>>>    C standard requires and therefor inconsistent with how TYPE_CANONICAL 
>>>> is set
>>>>    in the C FE.  Here, the logic is changed to ignore array size when one 
>>>> of the
>>>>    types is a flexible array member.  To not get errors because of 
>>>> inconsistent
>>>>    number of members, zero-sized arrays are not ignored anymore when 
>>>> checking
>>>>    fields of a struct (which is stricter than what was done before).
>>>>    Finally, a exception is added that allows the TYPE_MODE of a type with
>>>>    flexible array member to differ from another compatible type.
>>>> 
>>>>            PR c/113688
>>>>            PR c/114014
>>>>            PR c/117724
>>>> 
>>>>    gcc/ChangeLog:
>>>>            * tree.cc (gimple_canonical_types_compatible_p): Revise
>>>>            logic for types with FAM.
>>>>            (verify_type): Add exception for mode for types with FAM.
>>>> 
>>>>    gcc/testsuite/ChangeLog:
>>>>            * gcc.dg/pr113688.c: New test.
>>>>            * gcc.dg/pr114014.c: New test.
>>>>            * gcc.dg/pr117724.c: New test.
>>>> 
>>>> diff --git a/gcc/testsuite/gcc.dg/pr113688.c 
>>>> b/gcc/testsuite/gcc.dg/pr113688.c
>>>> new file mode 100644
>>>> index 00000000000..8dee8c86f1b
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr113688.c
>>>> @@ -0,0 +1,8 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-g" } */
>>>> +
>>>> +struct S{int x,y[1];}*a;
>>>> +int main(void){
>>>> +    struct S{int x,y[];};
>>>> +}
>>>> +
>>>> diff --git a/gcc/testsuite/gcc.dg/pr114014.c 
>>>> b/gcc/testsuite/gcc.dg/pr114014.c
>>>> new file mode 100644
>>>> index 00000000000..ab783f4f85d
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr114014.c
>>>> @@ -0,0 +1,14 @@
>>>> +/* PR c/114014
>>>> + * { dg-do compile }
>>>> + * { dg-options "-std=c23 -g" } */
>>>> +
>>>> +struct r {
>>>> +  int a;
>>>> +  char b[];
>>>> +};
>>>> +struct r {
>>>> +  int a;
>>>> +  char b[0];
>>>> +};    /* { dg-error "redefinition" } */
>>>> +
>>>> +
>>>> diff --git a/gcc/testsuite/gcc.dg/pr117724.c 
>>>> b/gcc/testsuite/gcc.dg/pr117724.c
>>>> new file mode 100644
>>>> index 00000000000..d631daeb644
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr117724.c
>>>> @@ -0,0 +1,16 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-g" } */
>>>> +
>>>> +struct {
>>>> +  unsigned long len;
>>>> +  unsigned long size;
>>>> +  char data[];
>>>> +};            /* { dg-warning "unnamed struct" } */
>>>> +struct {
>>>> +  struct {
>>>> +    unsigned long len;
>>>> +    unsigned long size;
>>>> +    char data[6];
>>>> +  };
>>>> +};            /* { dg-warning "unnamed struct" } */
>>>> +
>>>> diff --git a/gcc/tree.cc b/gcc/tree.cc
>>>> index 1da06c7d4e9..dbf6b180496 100644
>>>> --- a/gcc/tree.cc
>>>> +++ b/gcc/tree.cc
>>>> @@ -13900,8 +13900,11 @@ gimple_canonical_types_compatible_p (const_tree 
>>>> t1, const_tree t2,
>>>>       || TREE_CODE (t1) == NULLPTR_TYPE)
>>>>     return true;
>>>> 
>>>> -  /* Can't be the same type if they have different mode.  */
>>>> -  if (TYPE_MODE (t1) != TYPE_MODE (t2))
>>>> +  /* Can't be compatible types if they have different mode.  We allow
>>>> +     mismatching modes for types with flexible array member.  */
>>>> +  if (!flexible_array_type_p (t1)
>>>> +      && !flexible_array_type_p (t2)
>>> 
>>> These are quite expensive (recursive on UNION, walking TYPE_FIELDs for
>>> RECORD) and not handling ARRAY_TYPE itself.
>>> 
>>> I'll also note that same TYPE_CANONICAL implies same TYPE_SIZE given
>>> for aggregate assignment we assume we can take the copy size from
>>> either side.
> 
> This should be fine if there is nothing in the middle end is
> creating  such assignments.
> 
> The C FE should not create such assignments, because it checks
> compatibility of types using the stricter non-transitive rule from 
> ISO C and not the relaxed compatibility that is used for assigning
> TYPE_CANONICAL (which is relaxed to be transitive to be able to
> form equivalence classes).
> 
>>> 
>>> So I wonder if instead
>>> 
>>>        COMPLETE_TYPE_P (t1)
>>>        && COMPLETE_TYPE_P (t2)
>>> 
>>> should be used here.  
>>> 
> 
> I don't think this works, because the structs with FAM are also
> complete types (yes, it would make more sense if they were not).

Uh, OK … nevertheless the function is expensive to use.  I don’t understand why 
the flex array containing type is complete though…

> 
>>> I assume struct { int len; char data[1]; } and
>>> struct { int len; char data[2]; } are not compatible but both can
>>> have TYPE_CANONICAL of struct {int len; char data[]; }?
> 
> Yes. At the moment, any of the three types could end up the 
> TYPE_CANONICAL depending on which is encountered first.
> 
>>> 
>>>> +      && (TYPE_MODE (t1) != TYPE_MODE (t2)))
>>>>     return false;
>>>> 
>>>>   /* Non-aggregate types can be handled cheaply.  */
>>>> @@ -13952,7 +13955,7 @@ gimple_canonical_types_compatible_p (const_tree 
>>>> t1, const_tree t2,
>>>>     {
>>>>     case ARRAY_TYPE:
>>>>       /* Array types are the same if the element types are the same and
>>>> -     the number of elements are the same.  */
>>>> +     minimum and maximum index are the same.  */
>>>>       if (!gimple_canonical_types_compatible_p (TREE_TYPE (t1), TREE_TYPE 
>>>> (t2),
>>>>                        trust_type_canonical)
>>>>      || TYPE_STRING_FLAG (t1) != TYPE_STRING_FLAG (t2)
>>>> @@ -14046,23 +14049,35 @@ gimple_canonical_types_compatible_p (const_tree 
>>>> t1, const_tree t2,
>>>>         f1 || f2;
>>>>         f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2))
>>>>      {
>>>> -        /* Skip non-fields and zero-sized fields.  */
>>>> -        while (f1 && (TREE_CODE (f1) != FIELD_DECL
>>>> -              || (DECL_SIZE (f1)
>>>> -                  && integer_zerop (DECL_SIZE (f1)))))
>>>> +        /* Skip non-fields.  */
>>>> +        while (f1 && (TREE_CODE (f1) != FIELD_DECL))
>>>>          f1 = TREE_CHAIN (f1);
>>>> -        while (f2 && (TREE_CODE (f2) != FIELD_DECL
>>>> -              || (DECL_SIZE (f2)
>>>> -                  && integer_zerop (DECL_SIZE (f2)))))
>>>> +        while (f2 && (TREE_CODE (f2) != FIELD_DECL))
>>>>          f2 = TREE_CHAIN (f2);
>>> 
>>> The above assumes zero-sized fields are last (I guess an OK assumption),
>>> can you put a comment after this indicating this fact?
> 
> Ok.
> 
>>> 
>>>>        if (!f1 || !f2)
>>>>          break;
>>>> +
>>>> +        tree t1 = TREE_TYPE (f1);
>>>> +        tree t2 = TREE_TYPE (f2);
>>>> +
>>>> +        /* Special case for flexible array members.  */
>>>> +        if (TREE_CHAIN (f1) == NULL_TREE
>>>> +        && TREE_CHAIN (f2) == NULL_TREE
>>>> +        && TREE_CODE (t1) == ARRAY_TYPE
>>>> +        && TREE_CODE (t2) == ARRAY_TYPE
>>>> +        && (!DECL_NOT_FLEXARRAY (f1)
>>>> +            || !DECL_NOT_FLEXARRAY (f2))
>>>> +        && TYPE_REVERSE_STORAGE_ORDER (t1) == TYPE_REVERSE_STORAGE_ORDER 
>>>> (t2)
>>>> +        && TYPE_NONALIASED_COMPONENT (t1) == TYPE_NONALIASED_COMPONENT 
>>>> (t2)
>>>> +        && gimple_canonical_types_compatible_p
>>>> +            (TREE_TYPE (t1), TREE_TYPE (t2),
>>>> +             trust_type_canonical))
>>>> +          ;
>>>>        /* The fields must have the same name, offset and type.  */
>>>> -        if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
>>>> +        else if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
>>>>        || !gimple_compare_field_offset (f1, f2)
>>>>        || !gimple_canonical_types_compatible_p
>>>> -              (TREE_TYPE (f1), TREE_TYPE (f2),
>>>> -               trust_type_canonical))
>>>> +              (t1, t2, trust_type_canonical))
>>> 
>>> So what was actually mismatching with the old code?  Why can you spare
>>> comparing the start of the array member (gimple_compare_field_offset)?

Why can you spare comparing the offset?

>>> Should the special-casing be only to not compare the array type itself
>>> but its element type?  What about int[][] (not sure if there's such
>>> thing in C, an array type of array type with flexible size)?
> 
> int[] would be an incomplete array which is not allowed as an
> element type for an array.
> 
>>> 
>>>>          return false;
>>>>      }
>>>> 
>>>> @@ -14206,6 +14221,9 @@ verify_type (const_tree t)
>>>>     }
>>>> 
>>>>   if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t)
>>>> +      /* We allow a mismatch for flexible array members.  */
>>>> +      && !flexible_array_type_p (t)
>>>> +      && !flexible_array_type_p (TYPE_CANONICAL (t))
>>> 
>>> Same as above, COMPLETE_TYPE_P (TYPE_CANONICAL (t))?
>> 
>> One more thing - for LTO we throw away TYPE_CANONICAL and re-compute
>> it in a supposedly more conservative way than frontends.  This happens
>> together with this function and lto/lto-common.cc:
>> iterative_hash_canonical_type and gimple_register_canonical_type
>> which hashes TYPE_MODE - it would need to stop doing that and it
>> would have an issue with non-transitiveness of canonicality since
>> it takes the "first" struct as the canonical one, not necessarily
>> the !COMPLETE_TYPE_P one (which could be from another TU).
> 
> And if we just always ignore the size for an array that comes
> last? (and also remove TYPE_MODE from hashing). Then it could 
> continue to use this function to build equivalence classes. It
> might not pick the one with the flex member as canonical (just 
> like the C FE), but this should not matter anywhere, or?
> 
> If this is not completely wrong I would try this first.

It feels a bit odd, but if non-completeness doesn’t work …

>> 
>> So - this might not fly with LTO in the end, unless the C fronted
>> would build a flexarray canonical type for each record type with
>> an array type at the end and make the flexarray variant canonical
>> (in the expectation of another TU containing such).
> 
> This would also be possible but is it necessary that the TYPE_CANONICAL
> is the one with the flexarray?  This would usually require creating
> a new type specifically for this.

I think it would be the only way to make the non-completeness check work.

I wonder if we should spare a middle-end bit to indicate whether a struct needs 
flexarray handling?

> Martin
> 
>> 
>> Richard.
>> 
>>>>       && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t)))
>>>>     {
>>>>       error ("%<TYPE_MODE%> of %<TYPE_CANONICAL%> is not compatible");
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>
Re: Fix type compatibility for types with flexible array member [PR113688, PR114014, PR117724]

Reply via email to