On Tue, Jul 23, 2024 at 12:53:07AM -0400, Jason Merrill wrote:
> On 7/20/24 2:31 PM, Marek Polacek wrote:
> > [ Entering the contest to fix the oldest PR in this cycle. ]
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > This 18-year-old PR reports that we parse certain comma expressions
> > as a declaration rather than statement when the statement begins with
> > a functional-style cast expression.  Consider
> > 
> >    int(x), 0;
> > 
> > which does not declare x--it only casts x to int--, whereas
> > 
> >    int(x), (y);
> > 
> > declares x and y.  We need some kind of look-ahead to decide how we
> > should disambiguate the construct, because cp_parser_init_declarator
> > commits eagerly once it has seen "int(x)", and then it's too late to
> > recover.
> > 
> > This patch makes us try to parse the code as a sequence of declarators;
> > if that fails, we are likely looking at a statement.  That's a simple
> > idea, but it's complicated by code like
> > 
> >    void (*p)(void *)(fun);
> > 
> > which initializes a pointer-to-function, or
> > 
> >    int(x), (x) + 1;
> > 
> > which is an expression statement, but the second (x) is parsed as
> > a valid declarator, only the + after reveals that the whole thing
> > is an expression.  You can have things like
> > 
> >    int(**p)
> > 
> > which by itself doesn't tell you much.  You can have
> > 
> >    int(*q)(void*)
> > 
> > which looks like it starts with a functional-style cast, but it is not
> > a cast.  The simple
> > 
> >    int(x) = 42;
> > 
> > has an initializer so it declares x; it is not an assignment.  But then,
> > 
> >      int(d) __attribute__(());
> > 
> > does not have an initializer, but the attribute makes it a declaration.
> > 
> >     PR c++/29834
> >     PR c++/54905
> > 
> > gcc/cp/ChangeLog:
> > 
> >     * parser.cc (cp_parser_lambda_introducer): Use
> >     cp_parser_next_token_starts_initializer_p.
> >     (cp_parser_simple_declaration): Add look-ahead to decide if we're
> >     looking at a declaration or statement.
> >     (cp_parser_next_token_starts_initializer_p): New.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >     * g++.dg/parse/ambig15.C: New test.
> >     * g++.dg/parse/ambig16.C: New test.
> > ---
> >   gcc/cp/parser.cc                     | 73 ++++++++++++++++++++++--
> >   gcc/testsuite/g++.dg/parse/ambig15.C | 83 ++++++++++++++++++++++++++++
> >   gcc/testsuite/g++.dg/parse/ambig16.C | 18 ++++++
> >   3 files changed, 168 insertions(+), 6 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/parse/ambig15.C
> >   create mode 100644 gcc/testsuite/g++.dg/parse/ambig16.C
> > 
> > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> > index 1fa0780944b..797cfc3204e 100644
> > --- a/gcc/cp/parser.cc
> > +++ b/gcc/cp/parser.cc
> > @@ -2947,6 +2947,8 @@ static bool 
> > cp_parser_next_token_ends_template_argument_p
> >     (cp_parser *);
> >   static bool cp_parser_nth_token_starts_template_argument_list_p
> >     (cp_parser *, size_t);
> > +static bool cp_parser_next_token_starts_initializer_p
> > +  (cp_parser *);
> >   static enum tag_types cp_parser_token_is_class_key
> >     (cp_token *);
> >   static enum tag_types cp_parser_token_is_type_parameter_key
> > @@ -11663,9 +11665,7 @@ cp_parser_lambda_introducer (cp_parser* parser, 
> > tree lambda_expr)
> >     }
> >         /* Find the initializer for this capture.  */
> > -      if (cp_lexer_next_token_is (parser->lexer, CPP_EQ)
> > -     || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
> > -     || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
> > +      if (cp_parser_next_token_starts_initializer_p (parser))
> >     {
> >       /* An explicit initializer exists.  */
> >       if (cxx_dialect < cxx14)
> > @@ -11747,9 +11747,7 @@ cp_parser_lambda_introducer (cp_parser* parser, 
> > tree lambda_expr)
> >               /* If what follows is an initializer, the second '...' is
> >                  invalid.  But for cases like [...xs...], the first one
> >                  is invalid.  */
> > -             if (cp_lexer_next_token_is (parser->lexer, CPP_EQ)
> > -                 || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
> > -                 || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
> > +             if (cp_parser_next_token_starts_initializer_p (parser))
> >                 ellipsis_loc = loc;
> >               error_at (ellipsis_loc, "too many %<...%> in lambda capture");
> >               continue;
> > @@ -16047,6 +16045,58 @@ cp_parser_simple_declaration (cp_parser* parser,
> >       else
> >         break;
> > +  /* If we are still uncommitted, we're probably looking at something like
> > +     T(x), which can be a declaration but does not have to be, depending
> > +     on what comes after.  Consider
> > +       int(x), 0;
> > +     which is _not_ a declaration of x, it's a functional cast, and
> > +       int(x), (y);
> > +     which declares x and y.  We need some kind of look-ahead to decide,
> > +     cp_parser_init_declarator below will commit eagerly once it has seen
> > +     "int(x)".  So we try to parse this as a sequence of declarators; if
> > +     that fails, we are likely looking at a statement.  (We could avoid
> > +     all of this if there is no non-nested comma.)  */
> 
> Unfortunately, this seems to break
> 
> int main()
> {
>   int(x), y[sizeof(x)];
> }

Oy.  That's a problem, thanks for catching that.  So:

  void f(int v, int *w)
  {
    int(x), y[sizeof(x)]; // decl
    int(v), w[sizeof(v)], true; // expr
  }

The problem is that the cp_parser_declarator call I added was meant to
be purely syntactic checking -- does it _look_ like it could be a decl? --
but here it emits an error.

Is it worth experimenting with a new CP_PARSER_SYNTAX_ONLY flag to
prevent emitting any errors?

Marek

Reply via email to