Hi, Michael,
Thanks a lot for raising these questions about the parser implementation of
the new syntax.
I have started thinking about how to implement this new syntax inside the
counted_by attribute in the GCC C FE. Since I have very little experience with
any parser, I would like to know about any potential implementation issues in
the GCC C FE with the new syntax.
Based on your examples below, one slightly tricky example comes to mind:
A:
constexpr int len = 20;
struct s {
  int len;
  int *buf __attribute__ ((counted_by (len))); // this continues to be the member 'len', not the global 'len'
};
B:
constexpr int len = 20;
struct s {
  int len;
  int *buf __attribute__ ((counted_by (len+0))); // this is an expression; 'len' refers to the global
};
When the parser sees the first identifier "len" inside the counted_by
attribute, it cannot yet decide which syntax to use; it must look ahead at
least one more token to decide. Is this okay for the current C parser?
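To make the question concrete, here is a minimal sketch of the one-token
look-ahead I have in mind. It is only a toy that works on the raw argument
text, not on the C FE's token stream, and the helper names (skip_ws,
is_lone_identifier) are hypothetical, not existing GCC functions. It assumes
that a parenthesized "(len)" does not count as a lone identifier.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a GCC function): skip leading whitespace.  */
static const char *
skip_ws (const char *p)
{
  while (isspace ((unsigned char) *p))
    p++;
  return p;
}

/* Return true if ARG, the text between the parentheses of counted_by,
   consists of exactly one identifier.  After consuming the identifier we
   peek at what follows: end of argument means the lone-identifier syntax,
   anything else (e.g. '+') means the argument must be parsed as an
   expression.  Assumption: "(len)" is NOT treated as a lone identifier.  */
static bool
is_lone_identifier (const char *arg)
{
  assert (arg != NULL);
  const char *p = skip_ws (arg);

  /* An identifier starts with a letter or underscore.  */
  if (!(isalpha ((unsigned char) *p) || *p == '_'))
    return false;
  while (isalnum ((unsigned char) *p) || *p == '_')
    p++;

  /* The look-ahead step: is there anything after the identifier?  */
  p = skip_ws (p);
  return *p == '\0';
}
```

With this sketch, "len" in example A would be classified as the lone
identifier (the member), while "len+0" in example B would fall through to
expression parsing (the global).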
Thanks.
Qing
> On Apr 4, 2025, at 09:21, Michael Matz <[email protected]> wrote:
>
> Hello,
>
> On Fri, 4 Apr 2025, Bill Wendling wrote:
>
>>>> I don’t have a strong preference here. As mentioned by Jakub in a
>>>> previous email, these two syntaxes can be distinguished by the number
>>>> of arguments of the attribute.
>>>>
>>>> So for GCC, there should be no issue with either the old name or a new name.
>>>> However, I am not sure about Clang. (Considering the complication with
>>>> Apple’s implementation)
>>
>> I also don't have a strong opinion on whether we should add new
>> '*_expr' attributes. It's pretty easy to determine a single identifier
>> from a more complex expression, so it's probably okay to use the
>> current name. (I think that's what Apple has been doing internally.)
>
> Differentiating 'identifier' from 'decl' is easy (is first token a type?
> -> decl), but 'lone-ident' from 'assignment-expression' requires some
> look-ahead. It also always introduces the fun with useless
> parentheses: is "(a)" a lone identifier or not? (Normally it's not).
>
> So, your current proposal (lone-ident or declaration) is good, from a
> parsing perspective. But anything that somewhat merges lone-ident and
> anything in the general assignment-expression syntax tree requires head
> scratching, depending on parser implementation.
>
>> My initial thought is that you'd have something like this:
>>
>> struct Y {
>> int n;
>> };
>>
>> struct Z {
>> int *ints __attribute__((counted_by(struct Y y; y.n)));
>> struct Y y;
>> };
>>
>> And it should "just work." I'm not sure if there's an issue with this
>> though.
>
> I think it would just work in your proposal, yes. What about the typical
> expr-vs-decl woes:
>
> typedef int TY;
> struct Z {
> int TY;
> int *ints __attribute__((counted_by(TY))); // type or lone-ident?
> };
>
> when the parser sees the 'TY' token in counted_by (without consuming it):
> does it go your first (lone-ident) or your second (decl) branch? (Of
> course the second would lead to a syntax error, but we don't know yet,
> we've seen the TY token and know that it could refer to a type).
>
> The normal thing a parser would do is to go the second route (and lead to
> syntax error). It shouldn't go the first route (lone-ident), as otherwise
> you again have a confusion with:
>
> typedef int TY;
> struct Z1 {
> int *ints __attribute__((counted_by(TY))); // can only be type
> int TY;
> };
>
> which clearly is a syntax error. (Trying to avoid going the decl route
> would also need another look-ahead)
>
> Anyway, I think your current proposal as-is (lone-ident | decl+expression)
> is workable.
>
>
> Ciao,
> Michael.