On 1/3/19 8:40 AM, David Malcolm wrote:
On Thu, 2019-01-03 at 15:59 +0100, Daniel Marjamäki wrote:
Hello!

I have used GCC for decades and would like to contribute a little. :-
)

Hi, and welcome!

I would like to see if I can improve the syntax errors.

Here is one example code:

     int x = 3) + 0;

I have created this ugly experimental patch:

--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -2228,7 +2228,10 @@ c_parser_declaration_or_fndef (c_parser
*parser, bool fndef_ok,
             }
           else
             {
-             c_parser_error (parser, "expected %<,%> or %<;%>");
+             if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
+               c_parser_error (parser, "extraneous )");
+             else
+               c_parser_error (parser, "expected %<,%> or %<;%>");
               c_parser_skip_to_end_of_block_or_statement (parser);
               return;
             }

Before my patch:
test1.c:3:12: error: expected ‘,’ or ‘;’ before ‘)’ token

After my patch:
test1.c:3:12: error: extraneous ) before ‘)’ token

That is not perfect neither.

Thanks for trying it out.

A minor nit: the ")" is a source code construct and thus should be
quoted in the message, by wrapping it in %< and %> like in the existing
code:

   c_parser_error (parser, "extraneous %<)%>");

or by using %qs to quote a const const *:

   c_parser_error (parser, "extraneous %qs", ")");

I would suggest the quoted form since it means less work for message
translantors.  (Both forms are used in GCC messages but I think it
would be worth adopting the quoted form.)

(FWIW, the word "extraneous" seems a bit "jargony" to me, how about
"stray %qs token"?  (giving: "error: stray ')' token")   I'm not sure
though)

A helpful rule of thumb I like to use is checking which form is
prevalent in gcc/po/gcc.pot.  In this case, there's just one
message with the word "extraneous" in all of GCC and 7 that
use "stray."  There are also three messages that use the word
"unmatched", and for variety, five that use "unterminated" (not
all of them are interchangeable).

Martin

I have 2 questions..
  * can somebody give me a hint how I improve the error message so it
does not say "before ) token"?

The before ')' token is being supplied due to the use of
c_parser_error, which calls c_parse_error, which adds a
   "before SOMETHING"
suffix to the message before calling into the diagnostic subsystem.
You could use:

   error_at (token->location, "some message");

to go directly to the underlying diagnostic API to avoid getting the
"before SOMETHING" suffix.

  * how do I run the tests?

You might want to look at this guide I put together:
   https://dmalcolm.fedorapeople.org/gcc/newbies-guide/index.html
which among other things has some notes on running tests (and writing
new ones), and on stepping through gcc in a debugger.

Basically I want that when there is a unmatched extra ) or } or ]
then
it should just say "extraneous .." instead of "expected ',' or ';'.
Adding a ',' or ';' in the example code will not fix the syntax
error.

I wonder how much we want to special-case this.  Are you thinking about
the case where there's a stray symbol in the code (perhaps due to a
stray keypress, or unfinished manual edits)?   Perhaps we could have a
predicate for determining if a token can never make sense in the given
context, and have something like:

if (c_parser_next_token_is_meaningless_p (parser))
   complain_about_stray_token (parser);
else
   ...

or somesuch (we use a "_p" suffix for predicates).

It might make sense to emit a fix-it hint suggesting the removal of the
stray token.

Much of these ideas could probably apply to the C++ frontend as well
(annoyingly, not much of this code is shared between C and C++).

Best regards,
Daniel Marjamäki

Hope this is helpful, and welcome again.

Dave


Reply via email to