> Am 25.11.2024 um 16:34 schrieb Martin Uecker <uec...@tugraz.at>:
>
> Am Montag, dem 25.11.2024 um 10:35 +0100 schrieb Richard Biener:
>>> On Mon, 25 Nov 2024, Richard Biener wrote:
>>>
>>>> On Sat, 23 Nov 2024, Martin Uecker wrote:
>>>
>>>>
>>>> This patch tries fixes the errors we have because of
>>>> flexible array members. I am bit unsure about the exception
>>>> for the mode.
>>>>
>>>> Bootstrapped and regression tested on x86_64.
>>>>
>>>>
>>>>
>>>> Fix type compatibility for types with flexible array member
>>>> [PR113688,PR114014,PR117724]
>>>>
>>>> verify_type checks the compatibility of TYPE_CANONICAL using
>>>> gimple_canonical_types_compatible_p. But it is stricter than what the
>>>> C standard requires and therefor inconsistent with how TYPE_CANONICAL
>>>> is set
>>>> in the C FE. Here, the logic is changed to ignore array size when one
>>>> of the
>>>> types is a flexible array member. To not get errors because of
>>>> inconsistent
>>>> number of members, zero-sized arrays are not ignored anymore when
>>>> checking
>>>> fields of a struct (which is stricter than what was done before).
>>>> Finally, a exception is added that allows the TYPE_MODE of a type with
>>>> flexible array member to differ from another compatible type.
>>>>
>>>> PR c/113688
>>>> PR c/114014
>>>> PR c/117724
>>>>
>>>> gcc/ChangeLog:
>>>> * tree.cc (gimple_canonical_types_compatible_p): Revise
>>>> logic for types with FAM.
>>>> (verify_type): Add exception for mode for types with FAM.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>> * gcc.dg/pr113688.c: New test.
>>>> * gcc.dg/pr114014.c: New test.
>>>> * gcc.dg/pr117724.c: New test.
>>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/pr113688.c
>>>> b/gcc/testsuite/gcc.dg/pr113688.c
>>>> new file mode 100644
>>>> index 00000000000..8dee8c86f1b
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr113688.c
>>>> @@ -0,0 +1,8 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-g" } */
>>>> +
>>>> +struct S{int x,y[1];}*a;
>>>> +int main(void){
>>>> + struct S{int x,y[];};
>>>> +}
>>>> +
>>>> diff --git a/gcc/testsuite/gcc.dg/pr114014.c
>>>> b/gcc/testsuite/gcc.dg/pr114014.c
>>>> new file mode 100644
>>>> index 00000000000..ab783f4f85d
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr114014.c
>>>> @@ -0,0 +1,14 @@
>>>> +/* PR c/114014
>>>> + * { dg-do compile }
>>>> + * { dg-options "-std=c23 -g" } */
>>>> +
>>>> +struct r {
>>>> + int a;
>>>> + char b[];
>>>> +};
>>>> +struct r {
>>>> + int a;
>>>> + char b[0];
>>>> +}; /* { dg-error "redefinition" } */
>>>> +
>>>> +
>>>> diff --git a/gcc/testsuite/gcc.dg/pr117724.c
>>>> b/gcc/testsuite/gcc.dg/pr117724.c
>>>> new file mode 100644
>>>> index 00000000000..d631daeb644
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr117724.c
>>>> @@ -0,0 +1,16 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-g" } */
>>>> +
>>>> +struct {
>>>> + unsigned long len;
>>>> + unsigned long size;
>>>> + char data[];
>>>> +}; /* { dg-warning "unnamed struct" } */
>>>> +struct {
>>>> + struct {
>>>> + unsigned long len;
>>>> + unsigned long size;
>>>> + char data[6];
>>>> + };
>>>> +}; /* { dg-warning "unnamed struct" } */
>>>> +
>>>> diff --git a/gcc/tree.cc b/gcc/tree.cc
>>>> index 1da06c7d4e9..dbf6b180496 100644
>>>> --- a/gcc/tree.cc
>>>> +++ b/gcc/tree.cc
>>>> @@ -13900,8 +13900,11 @@ gimple_canonical_types_compatible_p (const_tree
>>>> t1, const_tree t2,
>>>> || TREE_CODE (t1) == NULLPTR_TYPE)
>>>> return true;
>>>>
>>>> - /* Can't be the same type if they have different mode. */
>>>> - if (TYPE_MODE (t1) != TYPE_MODE (t2))
>>>> + /* Can't be compatible types if they have different mode. We allow
>>>> + mismatching modes for types with flexible array member. */
>>>> + if (!flexible_array_type_p (t1)
>>>> + && !flexible_array_type_p (t2)
>>>
>>> These are quite expensive (recursive on UNION, walking TYPE_FIELDs for
>>> RECORD) and not handling ARRAY_TYPE itself.
>>>
>>> I'll also note that same TYPE_CANONICAL implies same TYPE_SIZE given
>>> for aggregate assignment we assume we can take the copy size from
>>> either side.
>
> This should be fine if there is nothing in the middle end is
> creating such assignments.
>
> The C FE should not create such assignments, because it checks
> compatibility of types using the stricter non-transitive rule from
> ISO C and not the relaxed compatibility that is used for assigning
> TYPE_CANONICAL (which is relaxed to be transitive to be able to
> form equivalence classes).
>
>>>
>>> So I wonder if instead
>>>
>>> COMPLETE_TYPE_P (t1)
>>> && COMPLETE_TYPE_P (t2)
>>>
>>> should be used here.
>>>
>
> I don't think this works, because the structs with FAM are also
> complete types (yes, it would make more sense if they were not).
Uh, OK … nevertheless the function is expensive to use. I don’t understand why
the flex array containing type is complete though…
>
>>> I assume struct { int len; char data[1]; } and
>>> struct { int len; char data[2]; } are not compatible but both can
>>> have TYPE_CANONICAL of struct {int len; char data[]; }?
>
> Yes. At the moment, any of the three types could end up the
> TYPE_CANONICAL depending on which is encountered first.
>
>>>
>>>> + && (TYPE_MODE (t1) != TYPE_MODE (t2)))
>>>> return false;
>>>>
>>>> /* Non-aggregate types can be handled cheaply. */
>>>> @@ -13952,7 +13955,7 @@ gimple_canonical_types_compatible_p (const_tree
>>>> t1, const_tree t2,
>>>> {
>>>> case ARRAY_TYPE:
>>>> /* Array types are the same if the element types are the same and
>>>> - the number of elements are the same. */
>>>> + minimum and maximum index are the same. */
>>>> if (!gimple_canonical_types_compatible_p (TREE_TYPE (t1), TREE_TYPE
>>>> (t2),
>>>> trust_type_canonical)
>>>> || TYPE_STRING_FLAG (t1) != TYPE_STRING_FLAG (t2)
>>>> @@ -14046,23 +14049,35 @@ gimple_canonical_types_compatible_p (const_tree
>>>> t1, const_tree t2,
>>>> f1 || f2;
>>>> f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2))
>>>> {
>>>> - /* Skip non-fields and zero-sized fields. */
>>>> - while (f1 && (TREE_CODE (f1) != FIELD_DECL
>>>> - || (DECL_SIZE (f1)
>>>> - && integer_zerop (DECL_SIZE (f1)))))
>>>> + /* Skip non-fields. */
>>>> + while (f1 && (TREE_CODE (f1) != FIELD_DECL))
>>>> f1 = TREE_CHAIN (f1);
>>>> - while (f2 && (TREE_CODE (f2) != FIELD_DECL
>>>> - || (DECL_SIZE (f2)
>>>> - && integer_zerop (DECL_SIZE (f2)))))
>>>> + while (f2 && (TREE_CODE (f2) != FIELD_DECL))
>>>> f2 = TREE_CHAIN (f2);
>>>
>>> The above assumes zero-sized fields are last (I guess an OK assumption),
>>> can you put a comment after this indicating this fact?
>
> Ok.
>
>>>
>>>> if (!f1 || !f2)
>>>> break;
>>>> +
>>>> + tree t1 = TREE_TYPE (f1);
>>>> + tree t2 = TREE_TYPE (f2);
>>>> +
>>>> + /* Special case for flexible array members. */
>>>> + if (TREE_CHAIN (f1) == NULL_TREE
>>>> + && TREE_CHAIN (f2) == NULL_TREE
>>>> + && TREE_CODE (t1) == ARRAY_TYPE
>>>> + && TREE_CODE (t2) == ARRAY_TYPE
>>>> + && (!DECL_NOT_FLEXARRAY (f1)
>>>> + || !DECL_NOT_FLEXARRAY (f2))
>>>> + && TYPE_REVERSE_STORAGE_ORDER (t1) == TYPE_REVERSE_STORAGE_ORDER
>>>> (t2)
>>>> + && TYPE_NONALIASED_COMPONENT (t1) == TYPE_NONALIASED_COMPONENT
>>>> (t2)
>>>> + && gimple_canonical_types_compatible_p
>>>> + (TREE_TYPE (t1), TREE_TYPE (t2),
>>>> + trust_type_canonical))
>>>> + ;
>>>> /* The fields must have the same name, offset and type. */
>>>> - if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
>>>> + else if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
>>>> || !gimple_compare_field_offset (f1, f2)
>>>> || !gimple_canonical_types_compatible_p
>>>> - (TREE_TYPE (f1), TREE_TYPE (f2),
>>>> - trust_type_canonical))
>>>> + (t1, t2, trust_type_canonical))
>>>
>>> So what was actually mismatching with the old code? Why can you spare
>>> comparing the start of the array member (gimple_compare_field_offset)?
Why can you spare comparing the offset?
>>> Should the special-casing be only to not compare the array type itself
>>> but its element type? What about int[][] (not sure if there's such
>>> thing in C, an array type of array type with flexible size)?
>
> int[] would be an incomplete array which is not allowed as an
> element type for an array.
>
>>>
>>>> return false;
>>>> }
>>>>
>>>> @@ -14206,6 +14221,9 @@ verify_type (const_tree t)
>>>> }
>>>>
>>>> if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t)
>>>> + /* We allow a mismatch for flexible array members. */
>>>> + && !flexible_array_type_p (t)
>>>> + && !flexible_array_type_p (TYPE_CANONICAL (t))
>>>
>>> Same as above, COMPLETE_TYPE_P (TYPE_CANONICAL (t))?
>>
>> One more thing - for LTO we throw away TYPE_CANONICAL and re-compute
>> it in a supposedly more conservative way than frontends. This happens
>> together with this function and lto/lto-common.cc:
>> iterative_hash_canonical_type and gimple_register_canonical_type
>> which hashes TYPE_MODE - it would need to stop doing that and it
>> would have an issue with non-transitiveness of canonicality since
>> it takes the "first" struct as the canonical one, not necessarily
>> the !COMPLETE_TYPE_P one (which could be from another TU).
>
> And if we just always ignore the size for an array that comes
> last? (and also remove TYPE_MODE from hashing). Then it could
> continue to use this function to build equivalence classes. It
> might not pick the one with the flex member as canonical (just
> like the C FE), but this should not matter anywhere, or?
>
> If this is not completely wrong I would try this first.
It feels a bit odd, but if non-completeness doesn’t work …
>>
>> So - this might not fly with LTO in the end, unless the C fronted
>> would build a flexarray canonical type for each record type with
>> an array type at the end and make the flexarray variant canonical
>> (in the expectation of another TU containing such).
>
> This would also be possible but is it necessary that the TYPE_CANONICAL
> is the one with the flexarray? This would usually require creating
> a new type specifically for this.
I think it would be the only way to make the non-completeness check work.
I wonder if we should spare a middle-end bit to indicate whether a struct needs
flexarray handling?
> Martin
>
>>
>> Richard.
>>
>>>> && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t)))
>>>> {
>>>> error ("%<TYPE_MODE%> of %<TYPE_CANONICAL%> is not compatible");
>>>>
>>>>
>>>>
>>>
>>>
>>
>