On Tue, Jul 23, 2024 at 12:53:07AM -0400, Jason Merrill wrote: > On 7/20/24 2:31 PM, Marek Polacek wrote: > > [ Entering the contest to fix the oldest PR in this cycle. ] > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > -- >8 -- > > This 18-year-old PR reports that we parse certain comma expressions > > as a declaration rather than statement when the statement begins with > > a functional-style cast expression. Consider > > > > int(x), 0; > > > > which does not declare x--it only casts x to int--, whereas > > > > int(x), (y); > > > > declares x and y. We need some kind of look-ahead to decide how we > > should disambiguate the construct, because cp_parser_init_declarator > > commits eagerly once it has seen "int(x)", and then it's too late to > > recover. > > > > This patch makes us try to parse the code as a sequence of declarators; > > if that fails, we are likely looking at a statement. That's a simple > > idea, but it's complicated by code like > > > > void (*p)(void *)(fun); > > > > which initializes a pointer-to-function, or > > > > int(x), (x) + 1; > > > > which is an expression statement, but the second (x) is parsed as > > a valid declarator, only the + after reveals that the whole thing > > is an expression. You can have things like > > > > int(**p) > > > > which by itself doesn't tell you much. You can have > > > > int(*q)(void*) > > > > which looks like it starts with a functional-style cast, but it is not > > a cast. The simple > > > > int(x) = 42; > > > > has an initializer so it declares x; it is not an assignment. But then, > > > > int(d) __attribute__(()); > > > > does not have an initializer, but the attribute makes it a declaration. > > > > PR c++/29834 > > PR c++/54905 > > > > gcc/cp/ChangeLog: > > > > * parser.cc (cp_parser_lambda_introducer): Use > > cp_parser_next_token_starts_initializer_p. > > (cp_parser_simple_declaration): Add look-ahead to decide if we're > > looking at a declaration or statement. > > (cp_parser_next_token_starts_initializer_p): New. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/parse/ambig15.C: New test. > > * g++.dg/parse/ambig16.C: New test. > > --- > > gcc/cp/parser.cc | 73 ++++++++++++++++++++++-- > > gcc/testsuite/g++.dg/parse/ambig15.C | 83 ++++++++++++++++++++++++++++ > > gcc/testsuite/g++.dg/parse/ambig16.C | 18 ++++++ > > 3 files changed, 168 insertions(+), 6 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/parse/ambig15.C > > create mode 100644 gcc/testsuite/g++.dg/parse/ambig16.C > > > > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc > > index 1fa0780944b..797cfc3204e 100644 > > --- a/gcc/cp/parser.cc > > +++ b/gcc/cp/parser.cc > > @@ -2947,6 +2947,8 @@ static bool > > cp_parser_next_token_ends_template_argument_p > > (cp_parser *); > > static bool cp_parser_nth_token_starts_template_argument_list_p > > (cp_parser *, size_t); > > +static bool cp_parser_next_token_starts_initializer_p > > + (cp_parser *); > > static enum tag_types cp_parser_token_is_class_key > > (cp_token *); > > static enum tag_types cp_parser_token_is_type_parameter_key > > @@ -11663,9 +11665,7 @@ cp_parser_lambda_introducer (cp_parser* parser, > > tree lambda_expr) > > } > > /* Find the initializer for this capture. */ > > - if (cp_lexer_next_token_is (parser->lexer, CPP_EQ) > > - || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN) > > - || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE)) > > + if (cp_parser_next_token_starts_initializer_p (parser)) > > { > > /* An explicit initializer exists. */ > > if (cxx_dialect < cxx14) > > @@ -11747,9 +11747,7 @@ cp_parser_lambda_introducer (cp_parser* parser, > > tree lambda_expr) > > /* If what follows is an initializer, the second '...' is > > invalid. But for cases like [...xs...], the first one > > is invalid. */ > > - if (cp_lexer_next_token_is (parser->lexer, CPP_EQ) > > - || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN) > > - || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE)) > > + if (cp_parser_next_token_starts_initializer_p (parser)) > > ellipsis_loc = loc; > > error_at (ellipsis_loc, "too many %<...%> in lambda capture"); > > continue; > > @@ -16047,6 +16045,58 @@ cp_parser_simple_declaration (cp_parser* parser, > > else > > break; > > + /* If we are still uncommitted, we're probably looking at something like > > + T(x), which can be a declaration but does not have to be, depending > > + on what comes after. Consider > > + int(x), 0; > > + which is _not_ a declaration of x, it's a functional cast, and > > + int(x), (y); > > + which declares x and y. We need some kind of look-ahead to decide, > > + cp_parser_init_declarator below will commit eagerly once it has seen > > + "int(x)". So we try to parse this as a sequence of declarators; if > > + that fails, we are likely looking at a statement. (We could avoid > > + all of this if there is no non-nested comma.) */ > > Unfortunately, this seems to break > > int main() > { > int(x), y[sizeof(x)]; > }
Oy. That's a problem, thanks for catching that. So: void f(int v, int *w) { int(x), y[sizeof(x)]; // decl int(v), w[sizeof(v)], true; // expr } The problem is that the cp_parser_declarator call I added was meant to be purely syntactic checking -- does it _look_ like it could be a decl? -- but here it emits an error. Is it worth experimenting with a new CP_PARSER_SYNTAX_ONLY flag to prevent emitting any errors? Marek