On Tue, Jul 23, 2024 at 05:12:14PM -0400, Jason Merrill wrote: > On 7/23/24 4:18 PM, Marek Polacek wrote: > > On Tue, Jul 23, 2024 at 12:53:07AM -0400, Jason Merrill wrote: > > > On 7/20/24 2:31 PM, Marek Polacek wrote: > > > > [ Entering the contest to fix the oldest PR in this cycle. ] > > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > > > > > -- >8 -- > > > > This 18-year-old PR reports that we parse certain comma expressions > > > > as a declaration rather than statement when the statement begins with > > > > a functional-style cast expression. Consider > > > > > > > > int(x), 0; > > > > > > > > which does not declare x--it only casts x to int--, whereas > > > > > > > > int(x), (y); > > > > > > > > declares x and y. We need some kind of look-ahead to decide how we > > > > should disambiguate the construct, because cp_parser_init_declarator > > > > commits eagerly once it has seen "int(x)", and then it's too late to > > > > recover. > > > > > > > > This patch makes us try to parse the code as a sequence of declarators; > > > > if that fails, we are likely looking at a statement. That's a simple > > > > idea, but it's complicated by code like > > > > > > > > void (*p)(void *)(fun); > > > > > > > > which initializes a pointer-to-function, or > > > > > > > > int(x), (x) + 1; > > > > > > > > which is an expression statement, but the second (x) is parsed as > > > > a valid declarator, only the + after reveals that the whole thing > > > > is an expression. You can have things like > > > > > > > > int(**p) > > > > > > > > which by itself doesn't tell you much. You can have > > > > > > > > int(*q)(void*) > > > > > > > > which looks like it starts with a functional-style cast, but it is not > > > > a cast. The simple > > > > > > > > int(x) = 42; > > > > > > > > has an initializer so it declares x; it is not an assignment. But then, > > > > > > > > int(d) __attribute__(()); > > > > > > > > does not have an initializer, but the attribute makes it a declaration. > > > > > > > > PR c++/29834 > > > > PR c++/54905 > > > > > > > > gcc/cp/ChangeLog: > > > > > > > > * parser.cc (cp_parser_lambda_introducer): Use > > > > cp_parser_next_token_starts_initializer_p. > > > > (cp_parser_simple_declaration): Add look-ahead to decide if > > > > we're > > > > looking at a declaration or statement. > > > > (cp_parser_next_token_starts_initializer_p): New. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * g++.dg/parse/ambig15.C: New test. > > > > * g++.dg/parse/ambig16.C: New test. > > > > --- > > > > gcc/cp/parser.cc | 73 ++++++++++++++++++++++-- > > > > gcc/testsuite/g++.dg/parse/ambig15.C | 83 > > > > ++++++++++++++++++++++++++++ > > > > gcc/testsuite/g++.dg/parse/ambig16.C | 18 ++++++ > > > > 3 files changed, 168 insertions(+), 6 deletions(-) > > > > create mode 100644 gcc/testsuite/g++.dg/parse/ambig15.C > > > > create mode 100644 gcc/testsuite/g++.dg/parse/ambig16.C > > > > > > > > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc > > > > index 1fa0780944b..797cfc3204e 100644 > > > > --- a/gcc/cp/parser.cc > > > > +++ b/gcc/cp/parser.cc > > > > @@ -2947,6 +2947,8 @@ static bool > > > > cp_parser_next_token_ends_template_argument_p > > > > (cp_parser *); > > > > static bool cp_parser_nth_token_starts_template_argument_list_p > > > > (cp_parser *, size_t); > > > > +static bool cp_parser_next_token_starts_initializer_p > > > > + (cp_parser *); > > > > static enum tag_types cp_parser_token_is_class_key > > > > (cp_token *); > > > > static enum tag_types cp_parser_token_is_type_parameter_key > > > > @@ -11663,9 +11665,7 @@ cp_parser_lambda_introducer (cp_parser* parser, > > > > tree lambda_expr) > > > > } > > > > /* Find the initializer for this capture. */ > > > > - if (cp_lexer_next_token_is (parser->lexer, CPP_EQ) > > > > - || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN) > > > > - || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE)) > > > > + if (cp_parser_next_token_starts_initializer_p (parser)) > > > > { > > > > /* An explicit initializer exists. */ > > > > if (cxx_dialect < cxx14) > > > > @@ -11747,9 +11747,7 @@ cp_parser_lambda_introducer (cp_parser* parser, > > > > tree lambda_expr) > > > > /* If what follows is an initializer, the second > > > > '...' is > > > > invalid. But for cases like [...xs...], the first > > > > one > > > > is invalid. */ > > > > - if (cp_lexer_next_token_is (parser->lexer, CPP_EQ) > > > > - || cp_lexer_next_token_is (parser->lexer, > > > > CPP_OPEN_PAREN) > > > > - || cp_lexer_next_token_is (parser->lexer, > > > > CPP_OPEN_BRACE)) > > > > + if (cp_parser_next_token_starts_initializer_p > > > > (parser)) > > > > ellipsis_loc = loc; > > > > error_at (ellipsis_loc, "too many %<...%> in lambda > > > > capture"); > > > > continue; > > > > @@ -16047,6 +16045,58 @@ cp_parser_simple_declaration (cp_parser* > > > > parser, > > > > else > > > > break; > > > > + /* If we are still uncommitted, we're probably looking at something > > > > like > > > > + T(x), which can be a declaration but does not have to be, > > > > depending > > > > + on what comes after. Consider > > > > + int(x), 0; > > > > + which is _not_ a declaration of x, it's a functional cast, and > > > > + int(x), (y); > > > > + which declares x and y. We need some kind of look-ahead to > > > > decide, > > > > + cp_parser_init_declarator below will commit eagerly once it has > > > > seen > > > > + "int(x)". So we try to parse this as a sequence of declarators; > > > > if > > > > + that fails, we are likely looking at a statement. (We could avoid > > > > + all of this if there is no non-nested comma.) */ > > > > > > Unfortunately, this seems to break > > > > > > int main() > > > { > > > int(x), y[sizeof(x)]; > > > } > > > > Oy. That's a problem, thanks for catching that. So: > > > > void f(int v, int *w) > > { > > int(x), y[sizeof(x)]; // decl > > int(v), w[sizeof(v)], true; // expr > > } > > > > The problem is that the cp_parser_declarator call I added was meant to > > be purely syntactic checking -- does it _look_ like it could be a decl? -- > > but here it emits an error. > > > > Is it worth experimenting with a new CP_PARSER_SYNTAX_ONLY flag to > > prevent emitting any errors? > > I've wished for a while that we would store diagnostics during tentative > parsing and then emit them all once we commit to the tentative parse.
Yea, it's come up a few times, hasn't it. It would fix https://gcc.gnu.org/PR61259 https://gcc.gnu.org/PR109775 and I wonder what else. > With that functionality, we could parse the decls normally (but tentatively) > and then go back and remove them from the current scope if they turn out not > to have actually been declarations? > > Extremely vexing. It's an interesting project, but these PRs are pretty obscure and have fairly simple workarounds. I think I better address other projects first... Well, it was worth a shot. Maybe I can at least add a testcase with your testcase (we don't test it currently), and extract whatever already works from my new test. Marek