7 regression])

Jason Merrill Mon, 01 Aug 2016 07:22:34 -0700

On Sun, Jul 31, 2016 at 4:27 PM, Martin Sebor <mse...@gmail.com> wrote:
> On 07/31/2016 10:28 AM, Jason Merrill wrote:
>> On Fri, Jul 29, 2016 at 7:22 PM, Martin Sebor <mse...@gmail.com> wrote:
>>> On 07/26/2016 12:53 PM, Jason Merrill wrote:
>>>> On 07/23/2016 01:18 PM, Martin Sebor wrote:
>>>>>
>>>>> +  /* A pair of the first non-static non-empty data members following
>>>>> +     either the flexible array member, if found, or the zero-length
>>>>> +     array member otherwise.  AFTER[1] refers to the first such data
>>>>> +     member of a union that the struct containing the flexible array
>>>>> +     member or zero-length array is a member, or NULL when no such
>>>>> +     union exists.  AFTER[0] refers to the first such data member
>>>>> +     that is not a member of such a union.  */
>>>>
>>>> This is pretty hard to follow, could you add an example?  Why do we want
>>>> to track these separately rather than look at DECL_CONTEXT at diagnostic
>>>> time?
>>>
>>> Sure, I've added an example.
>>>
>>> Whether or not a given flexible array member is valid depends not
>>> only on the context in which it's defined (its enclosing class) but
>>> also on the members of other classes whose objects may be defined
>>> in the same or other contexts (e.g., enclosing structs).  I don't
>>> know of a way to reach those other members from the context of
>>> the ARRAY.
>>
>> Not from the context of the array, but when we see a field following the
>> flexible array (or aggregate ending in a flexible array), can't we look
>> at the DECL_CONTEXT of that field and see whether it's a union or not?
>
> I don't think that would work in cases like this:
>
>   struct S1 {
>     struct S2 { int i, a[]; } s2;
>     union U { int x; } u;
>   };
>
> that need to be treated differently from this one:
>
>   union U1 {
>     struct S { int i, a[]; } s;
>     union U2 { int x; } u2;
>   };


Ah, I'm thinking of the following field as u/u2 rather than x.  Why
does it improve clarity to look inside U/U2 for a following field?

>>> +     For example, in the following, the flexible array member
>>> +     S::U::X::a overlaps S::U::Y::i and so AFTER[1] is set to refer to
>>> +     the latter.  This potential problem is independent of union U's
>>> +     membership in struct S.  In addition, in the definition of struct
>>> +     S, S::U::x::a is followed by S::z, and so AFTER[0] is set to refer
>>> +     to the latter.  The two problems result in two diagnostics, the
>>> +     first one being a pedantic warning and the second a hard error.
>>> +
>>> +       struct S {
>>> +         union U {
>>> +           struct X { int i, a[]; } x;
>>> +           struct Y { long i, a[]; } y;
>>> +         } u;
>>> +        int z;
>>> +       };
>>
>>
>> Hmm, I'm not convinced that the first problem is really a problem.  Only
>> one of the union members is active at any time, so overlapping with
>> another union member seems is irrelevant.
>>
>> I also can't find the language in the C standard that prohibits nesting
>> a struct ending with a flexible array in another struct or union, do you
>> have a citation?
>
> 6.7.2.1, p3:
>
>   A structure or union shall not contain a member with incomplete
>   or function type [...], except that the last member of a structure
>   with more than one named member may have incomplete array type; such
>   a structure (and any union containing, possibly recursively, a member
>   that is such a structure) shall not be a member of a structure or
>   an element of an array.

Ah, thanks.  I note that this says "structure or union" at the
beginning of the paragraph but not at the end, which suggests strongly
to me that such a structure can be a member of a union.

>>>>> +  /* The type in which an anonymous struct or union containing ARRAY
>>>>> +     is defined or null if no such anonymous struct or union exists.  */
>>>>> +  tree anonctx;
>>>>
>>>> It seems clearer to find this at diagnostic time by following
>>>> TYPE_CONTEXT.
>>>
>>> I tried this approach and while it's doable (with recursion) I'm
>>> not happy with the results.  The diagnostics point to places that
>>> I think are unclear.  For example, given:
>>>
>>>   struct A { int i, a[]; };
>>>   struct B { long j, b[]; };
>>>   struct D: A, B { };
>>
>> I don't see any anonymous aggregates in this example, so how does
>> anonctx come into it?
>
> If there is no anonctx then diagnose_flexrrays will have to figure
> out whether the array is a member of an anonymous struct and if so,
> find the struct of which it's effectively a member, and otherwise
> use the array's immediate context.  The above was an example of
> the result of a simple implementation of your suggestion.

And I don't understand why it would change the output on this testcase.

> As I said, the code could probably be tweaked to produce the same result
> as it does now in both cases (anonymous and not), but at the cost
> of additional complexity and, IMO, to the detriment of clarity.
> What aspect of clarity do you find lacking in the current approach?

It just seems unnecessarily complicated.  anonctx is set to the
innermost non-anonymous-aggregate class containing the flexible array,
yes?  That is trivially derived from the array decl.

Looking at this more closely, it seems that the problem is that
"innermost non-anon class" isn't what you want for the example above;
for a that is A, and a *is* at the end of A.  So it seems that if a
were inside an anonymous struct inside A, you would get the same wrong
diagnostic output you were seeing with your anon_context function.

>>>>>   if (fmem == &flexmems
>>>>>       && !TYPE_ANONYMOUS_P (t) && !anon_aggrname_p (TYPE_IDENTIFIER (t)))
>>>>
>>>> I think you want ANON_AGGR_TYPE_P here, too.
>>>
>>> Here, ANON_AGGR_TYPE_P(t) returns false for anonymous structs such
>>> as the one below and using it would cause false positives:
>>>
>>>   struct S { int i; struct { int a[]; }; };
>>
>> Ah, it hasn't been set yet here because we don't know yet whether we're
>> declaring a named member or an anonymous aggregate.  So yes,
>> TYPE_ANONYMOUS_P is the right test here.  But why do you also check
>> anon_aggrname_p directly, when it's used to implement TYPE_ANONYMOUS_P?
>
> Because TYPE_ANONYMOUS_P is set for unnamed structs that are not
> anonymous, as in
>
>   struct S { int i; struct { int a[]; } s; };

Yes, but so is anon_aggrname_p (TYPE_IDENTIFIER).  And isn't that what
you want here?  When we call check_flexarrays for the nested struct
itself, we've only seen as far as the closing brace, so we don't know
if the struct will declare a named member or not.

> It looks like what I actually need here is anon_aggrname_p alone.

Why is that better than the anon_aggrname_p call inside
TYPE_ANONYMOUS_P?  The difference between TYPE_ANONYMOUS_P and
anon_aggrname_p (TYPE_IDENTIFIER (t)) is in its handling of

typedef struct { ... } name;

where TYPE_ANONYMOUS_P is false (because it has a name for linkage
purposes) but anon_aggrname_p (TYPE_IDENTIFIER (t)) is true.  I don't
see why you would care about this distinction.  The typedef doesn't
declare a member, so it seems uninteresting to check_flexarrays.

Speaking of which, why does your latest patch look at typedefs?  What
are you missing by looking just at FIELD_DECLs?  Even anonymous unions
have a FIELD_DECL, though you may need to exempt them from the
DECL_ARTIFICIAL check.

> Let me change that before committing the patch or posting a new
> version if one is still needed.
>
> FWIW, I spent lots of time last week learning how these different
> macros map on to language features and under what circumstances.
> I find some of them quite hard to work with because they add
> a layer of complexity between concepts in the language and their
> implementation in GCC.  In cases like this one when their names
> are deceptively close to the language concepts but they don't
> quite map 1-to-1, they can be downright misleading.

Do you have ideas about how to improve the naming?  Perhaps change
TYPE_ANONYMOUS_P to TYPE_NO_LINKAGE_NAME?

Jason

Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

Reply via email to