On Sun, Jul 31, 2016 at 4:27 PM, Martin Sebor <mse...@gmail.com> wrote: > On 07/31/2016 10:28 AM, Jason Merrill wrote: >> On Fri, Jul 29, 2016 at 7:22 PM, Martin Sebor <mse...@gmail.com> wrote: >>> On 07/26/2016 12:53 PM, Jason Merrill wrote: >>>> On 07/23/2016 01:18 PM, Martin Sebor wrote: >>>>> >>>>> + /* A pair of the first non-static non-empty data members following >>>>> + either the flexible array member, if found, or the zero-length >>>>> + array member otherwise. AFTER[1] refers to the first such data >>>>> + member of a union that the struct containing the flexible array >>>>> + member or zero-length array is a member, or NULL when no such >>>>> + union exists. AFTER[0] refers to the first such data member >>>>> + that is not a member of such a union. */ >>>> >>>> This is pretty hard to follow, could you add an example? Why do we want >>>> to track these separately rather than look at DECL_CONTEXT at diagnostic >>>> time? >>> >>> Sure, I've added an example. >>> >>> Whether or not a given flexible array member is valid depends not >>> only on the context in which it's defined (its enclosing class) but >>> also on the members of other classes whose objects may be defined >>> in the same or other contexts (e.g., enclosing structs). I don't >>> know of a way to reach those other members from the context of >>> the ARRAY. >> >> Not from the context of the array, but when we see a field following the >> flexible array (or aggregate ending in a flexible array), can't we look >> at the DECL_CONTEXT of that field and see whether it's a union or not? > > I don't think that would work in cases like this: > > struct S1 { > struct S2 { int i, a[]; } s2; > union U { int x; } u; > }; > > that need to be treated differently from this one: > > union U1 { > struct S { int i, a[]; } s; > union U2 { int x; } u2; > };
Ah, I'm thinking of the following field as u/u2 rather than x. Why does it improve clarity to look inside U/U2 for a following field? >>> + For example, in the following, the flexible array member >>> + S::U::X::a overlaps S::U::Y::i and so AFTER[1] is set to refer to >>> + the latter. This potential problem is independent of union U's >>> + membership in struct S. In addition, in the definition of struct >>> + S, S::U::x::a is followed by S::z, and so AFTER[0] is set to refer >>> + to the latter. The two problems result in two diagnostics, the >>> + first one being a pedantic warning and the second a hard error. >>> + >>> + struct S { >>> + union U { >>> + struct X { int i, a[]; } x; >>> + struct Y { long i, a[]; } y; >>> + } u; >>> + int z; >>> + }; >> >> >> Hmm, I'm not convinced that the first problem is really a problem. Only >> one of the union members is active at any time, so overlapping with >> another union member seems is irrelevant. >> >> I also can't find the language in the C standard that prohibits nesting >> a struct ending with a flexible array in another struct or union, do you >> have a citation? > > 6.7.2.1, p3: > > A structure or union shall not contain a member with incomplete > or function type [...], except that the last member of a structure > with more than one named member may have incomplete array type; such > a structure (and any union containing, possibly recursively, a member > that is such a structure) shall not be a member of a structure or > an element of an array. Ah, thanks. I note that this says "structure or union" at the beginning of the paragraph but not at the end, which suggests strongly to me that such a structure can be a member of a union. >>>>> + /* The type in which an anonymous struct or union containing ARRAY >>>>> + is defined or null if no such anonymous struct or union exists. */ >>>>> + tree anonctx; >>>> >>>> It seems clearer to find this at diagnostic time by following >>>> TYPE_CONTEXT. >>> >>> I tried this approach and while it's doable (with recursion) I'm >>> not happy with the results. The diagnostics point to places that >>> I think are unclear. For example, given: >>> >>> struct A { int i, a[]; }; >>> struct B { long j, b[]; }; >>> struct D: A, B { }; >> >> I don't see any anonymous aggregates in this example, so how does >> anonctx come into it? > > If there is no anonctx then diagnose_flexrrays will have to figure > out whether the array is a member of an anonymous struct and if so, > find the struct of which it's effectively a member, and otherwise > use the array's immediate context. The above was an example of > the result of a simple implementation of your suggestion. And I don't understand why it would change the output on this testcase. > As I said, the code could probably be tweaked to produce the same result > as it does now in both cases (anonymous and not), but at the cost > of additional complexity and, IMO, to the detriment of clarity. > What aspect of clarity do you find lacking in the current approach? It just seems unnecessarily complicated. anonctx is set to the innermost non-anonymous-aggregate class containing the flexible array, yes? That is trivially derived from the array decl. Looking at this more closely, it seems that the problem is that "innermost non-anon class" isn't what you want for the example above; for a that is A, and a *is* at the end of A. So it seems that if a were inside an anonymous struct inside A, you would get the same wrong diagnostic output you were seeing with your anon_context function. >>>>> if (fmem == &flexmems >>>>> && !TYPE_ANONYMOUS_P (t) && !anon_aggrname_p (TYPE_IDENTIFIER (t))) >>>> >>>> I think you want ANON_AGGR_TYPE_P here, too. >>> >>> Here, ANON_AGGR_TYPE_P(t) returns false for anonymous structs such >>> as the one below and using it would cause false positives: >>> >>> struct S { int i; struct { int a[]; }; }; >> >> Ah, it hasn't been set yet here because we don't know yet whether we're >> declaring a named member or an anonymous aggregate. So yes, >> TYPE_ANONYMOUS_P is the right test here. But why do you also check >> anon_aggrname_p directly, when it's used to implement TYPE_ANONYMOUS_P? > > Because TYPE_ANONYMOUS_P is set for unnamed structs that are not > anonymous, as in > > struct S { int i; struct { int a[]; } s; }; Yes, but so is anon_aggrname_p (TYPE_IDENTIFIER). And isn't that what you want here? When we call check_flexarrays for the nested struct itself, we've only seen as far as the closing brace, so we don't know if the struct will declare a named member or not. > It looks like what I actually need here is anon_aggrname_p alone. Why is that better than the anon_aggrname_p call inside TYPE_ANONYMOUS_P? The difference between TYPE_ANONYMOUS_P and anon_aggrname_p (TYPE_IDENTIFIER (t)) is in its handling of typedef struct { ... } name; where TYPE_ANONYMOUS_P is false (because it has a name for linkage purposes) but anon_aggrname_p (TYPE_IDENTIFIER (t)) is true. I don't see why you would care about this distinction. The typedef doesn't declare a member, so it seems uninteresting to check_flexarrays. Speaking of which, why does your latest patch look at typedefs? What are you missing by looking just at FIELD_DECLs? Even anonymous unions have a FIELD_DECL, though you may need to exempt them from the DECL_ARTIFICIAL check. > Let me change that before committing the patch or posting a new > version if one is still needed. > > FWIW, I spent lots of time last week learning how these different > macros map on to language features and under what circumstances. > I find some of them quite hard to work with because they add > a layer of complexity between concepts in the language and their > implementation in GCC. In cases like this one when their names > are deceptively close to the language concepts but they don't > quite map 1-to-1, they can be downright misleading. Do you have ideas about how to improve the naming? Perhaps change TYPE_ANONYMOUS_P to TYPE_NO_LINKAGE_NAME? Jason