Hi, On Fri, May 22, 2020 at 09:42:28AM -0400, David Malcolm wrote: > On Fri, 2020-05-22 at 01:30 +0200, Mark Wielaard wrote: > > This is on top of the stdbool.h and stdint.h patches. > > Sorry, I didn't see those patches; I've replied to them now.
No worries, there was no hurry and I didn't CC you on those. I pushed them both now. Thanks. > > This adds a flag to c_parser so we know when we were trying to > > constract a string literal. If there is a parse error and we were > > constructing a string literal, and the next token is an unknown > > identifier name, and we know there is a standard header that defines > > that name as a string literal, then add a missing header hint to > > the error messsage. > > By comparison, what's the existing behavior for the example? > Is it just what you posted below, but without the "note" diagnostics? Yes. > > I haven't checked yet if we can do something similar for the C++ > > parser. And the testcase needs to be extended a bit. But I hope the > > direction is OK. > > I think it's a worthy goal; as noted below I'd want a C frontend > maintainer to approve those parts of it. OK. > > $ /opt/local/gcc/bin/gcc -c t.c > > t.c:4:26: error: ‘PRIx64’ undeclared here (not in a function) > > 4 | const char *hex64_fmt = PRIx64; > > | ^~~~~~ > > t.c:3:1: note: ‘PRIx64’ is defined in header ‘<inttypes.h>’; did you forget > > to ‘#include <inttypes.h>’? > > 2 | #include <stdint.h> > > +++ |+#include <inttypes.h> > > 3 | > > t.c:5:35: error: expected ‘,’ or ‘;’ before ‘PRIx64’ > > 5 | const char *msg_fmt = "Provide %" PRIx64 "\n"; > > | ^~~~~~ > > t.c:5:35: note: ‘PRIx64’ is defined in header ‘<inttypes.h>’; did you > > forget to ‘#include <inttypes.h>’? > > It's a big improvement, but it's still a little clunky. The > expected ‘,’ or ‘;’ before ‘PRIx64’ > is the sort of thing we've gotten used to with years of GCC messages, > but might make little sense to a neophyte. > I wonder if it's possible to improve this further, but I fear that that > might overcomplicate things (if I understand things correctly, the > string concatenation fails in the tokenizer, and then the PRIx64 fails > in the parser, so we have a bad interaction involving two different > levels of abstraction within the frontend - I think). Yes, part of the string concatenation is done by the tokenizer, the other part by the parser. With this patch we now track when we were trying to concatenate a string. Then if any error occurs and we notice an unknown token is next that could be a string literal then we add the hint to the error message with an alternative suggestion/hint for a fix. I didn't want to override the existing error message, because it might be right or better, and we don't actually know whether with the suggestion the rest will actully parse correctly. > > +/* These can be used as string macros or as identifiers. Must all be > > + string-literals. Used in get_stdlib_header_for_name and > > + get_c_stdlib_header_for_string_macro_name. */ > > +static const stdlib_hint c99_cxx11_macro_hints[] = { > > + /* <inttypes.h> and <cinttypes>. */ > > + {"PRId8", {"<inttypes.h>", "<cinttypes>"} }, > > [...] > > + {"SCNxPTR", {"<inttypes.h>", "<cinttypes>"} } > > +}; > > I found myself squinting at the array trying to decide if every entry > had > {"<inttypes.h>", "<cinttypes>"} > as its second element. I *think* that's the case, right? > > I know there's a lot of pre-existing duplication in this file, but > maybe this would be better as simply an array of const char *? > It could probably be reformatted to take up far fewer lines by grouping > them logically. > > Maybe have a > > static const char * > get_c99_cxx11_macro_hint (const char *, enum stdlib lib); > > to do the predicate, and return one of "<inttypes.h>", "<cinttypes>", or NULL? Yes, that is actually better. And much easier to read. And the code can still be shared between get_c_stdlib_header_for_string_macro_name and get_stdlib_header_for_name. Changed in the attached patch. I also extended the testcase so it covers both identifiers and string concatenation cases. The identifier hints also work for C++, but I haven't yet added the string token concatenation detection to the cp_parser. It looks like it can be done similar as done for the c_parser, but the parsers are different enough that it seems better/simpler to do that in a followup patch once we have this patch in the the c_parser. Thanks, Mark
>From 98be2e83b6d0e69d769e8eff0f4b815240e51e1c Mon Sep 17 00:00:00 2001 From: Mark Wielaard <m...@klomp.org> Date: Fri, 22 May 2020 01:10:50 +0200 Subject: [PATCH] Provide diagnostic hints for missing inttypes.h string constants. This adds a flag to c_parser so we know when we were trying to construct a string literal. If there is a parse error and we were constructing a string literal, and the next token is an unknown identifier name, and we know there is a standard header that defines that name as a string literal, then add a missing header hint to the error messsage. gcc/c-family/ChangeLog: * known-headers.cc (get_string_macro_hint): New function. (get_stdlib_header_for_name): Use get_string_macro_hint. (get_c_stdlib_header_for_string_macro_name): New function. * known-headers.h (get_c_stdlib_header_for_string_macro_name): New function definition. gcc/c/ChangeLog: * c-parser.c (struct c_parser): Add seen_string_literal bitfield. (c_parser_consume_token): Reset seen_string_literal. (c_parser_error_richloc): Add name_hint if seen_string_literal and next token is a CPP_NAME and we have a missing header suggestion for the name. (c_parser_string_literal): Set seen_string_literal. gcc/testsuite/ChangeLog: * gcc.dg/spellcheck-inttypes.c: New test. * g++.dg/spellcheck-inttypes.C: Likewise. --- gcc/c-family/known-headers.cc | 53 ++++++++++++++- gcc/c-family/known-headers.h | 2 + gcc/c/c-parser.c | 29 ++++++++ gcc/testsuite/g++.dg/spellcheck-inttypes.C | 41 ++++++++++++ gcc/testsuite/gcc.dg/spellcheck-inttypes.c | 78 ++++++++++++++++++++++ 5 files changed, 202 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/spellcheck-inttypes.C create mode 100644 gcc/testsuite/gcc.dg/spellcheck-inttypes.c diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc index 1e2bf49c439a..c07cfd1db815 100644 --- a/gcc/c-family/known-headers.cc +++ b/gcc/c-family/known-headers.cc @@ -46,6 +46,49 @@ struct stdlib_hint const char *header[NUM_STDLIBS]; }; +/* Given non-NULL NAME, return the header name defining it (as literal + string) within either the standard library (with '<' and '>'), or + NULL. + + Only handle string macros, so that this can be used for + get_stdlib_header_for_name and + get_c_stdlib_header_for_string_macro_name. */ + +static const char * +get_string_macro_hint (const char *name, enum stdlib lib) +{ + /* <inttypes.h> and <cinttypes>. */ + static const char *c99_cxx11_macros[] = + { "PRId8", "PRId16", "PRId32", "PRId64", + "PRIi8", "PRIi16", "PRIi32", "PRIi64", + "PRIo8", "PRIo16", "PRIo32", "PRIo64", + "PRIu8", "PRIu16", "PRIu32", "PRIu64", + "PRIx8", "PRIx16", "PRIx32", "PRIx64", + "PRIX8", "PRIX16", "PRIX32", "PRIX64", + + "PRIdPTR", "PRIiPTR", "PRIoPTR", "PRIuPTR", "PRIxPTR", "PRIXPTR", + + "SCNd8", "SCNd16", "SCNd32", "SCNd64", + "SCNi8", "SCNi16", "SCNi32", "SCNi64", + "SCNo8", "SCNo16", "SCNo32", "SCNo64", + "SCNu8", "SCNu16", "SCNu32", "SCNu64", + "SCNx8", "SCNx16", "SCNx32", "SCNx64", + + "SCNdPTR", "SCNiPTR", "SCNoPTR", "SCNuPTR", "SCNxPTR" }; + + if ((lib == STDLIB_C && flag_isoc99) + || (lib == STDLIB_CPLUSPLUS && cxx_dialect >= cxx11 )) + { + const size_t num_c99_cxx11_macros + = sizeof (c99_cxx11_macros) / sizeof (c99_cxx11_macros[0]); + for (size_t i = 0; i < num_c99_cxx11_macros; i++) + if (strcmp (name, c99_cxx11_macros[i]) == 0) + return lib == STDLIB_C ? "<inttypes.h>" : "<cinttypes>"; + } + + return NULL; +} + /* Given non-NULL NAME, return the header name defining it within either the standard library (with '<' and '>'), or NULL. Only handles a subset of the most common names within the stdlibs. */ @@ -196,7 +239,7 @@ get_stdlib_header_for_name (const char *name, enum stdlib lib) if (strcmp (name, c99_cxx11_hints[i].name) == 0) return c99_cxx11_hints[i].header[lib]; - return NULL; + return get_string_macro_hint (name, lib); } /* Given non-NULL NAME, return the header name defining it within the C @@ -217,6 +260,14 @@ get_cp_stdlib_header_for_name (const char *name) return get_stdlib_header_for_name (name, STDLIB_CPLUSPLUS); } +/* Given non-NULL NAME, return the header name defining a string macro + within the C standard library (with '<' and '>'), or NULL. */ +const char * +get_c_stdlib_header_for_string_macro_name (const char *name) +{ + return get_string_macro_hint (name, STDLIB_C); +} + /* Implementation of class suggest_missing_header. */ /* suggest_missing_header's ctor. */ diff --git a/gcc/c-family/known-headers.h b/gcc/c-family/known-headers.h index 4ec4e23fc88d..a69bbbf28e76 100644 --- a/gcc/c-family/known-headers.h +++ b/gcc/c-family/known-headers.h @@ -23,6 +23,8 @@ along with GCC; see the file COPYING3. If not see extern const char *get_c_stdlib_header_for_name (const char *name); extern const char *get_cp_stdlib_header_for_name (const char *name); +extern const char *get_c_stdlib_header_for_string_macro_name (const char *n); + /* Subclass of deferred_diagnostic for suggesting to the user that they have missed a #include. */ diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 5d11e7e73c16..6b2ae5688a72 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see #include "c-family/name-hint.h" #include "tree-iterator.h" #include "memmodel.h" +#include "c-family/known-headers.h" /* We need to walk over decls with incomplete struct/union/enum types after parsing the whole translation unit. @@ -223,6 +224,13 @@ struct GTY(()) c_parser { keywords are valid. */ BOOL_BITFIELD objc_property_attr_context : 1; + /* Whether we have just seen/constructed a string-literal. Set when + returning a string-literal from c_parser_string_literal. Reset + in consume_token. Useful when we get a parse error and see an + unknown token, which could have been a string-literal constant + macro. */ + BOOL_BITFIELD seen_string_literal : 1; + /* Location of the last consumed token. */ location_t last_token_location; }; @@ -853,6 +861,7 @@ c_parser_consume_token (c_parser *parser) } } parser->tokens_avail--; + parser->seen_string_literal = false; } /* Expect the current token to be a #pragma. Consume it and remember @@ -966,6 +975,25 @@ c_parser_error_richloc (c_parser *parser, const char *gmsgid, } } + /* If we were parsing a string-literal and there is an unknown name + token right after, then check to see if that could also have been + a literal string by checking the name against a list of known + standard string literal constants defined in header files. If + there is one, then add that as an hint to the error message. */ + auto_diagnostic_group d; + name_hint h; + if (parser->seen_string_literal && token->type == CPP_NAME) + { + tree name = token->value; + const char *header_hint + = get_c_stdlib_header_for_name (IDENTIFIER_POINTER (name)); + if (header_hint != NULL) + h = name_hint (NULL, + new suggest_missing_header (token->location, + IDENTIFIER_POINTER (name), + header_hint)); + } + c_parse_error (gmsgid, /* Because c_parse_error does not understand CPP_KEYWORD, keywords are treated like @@ -7539,6 +7567,7 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) ret.original_code = STRING_CST; ret.original_type = NULL_TREE; set_c_expr_source_range (&ret, get_range_from_loc (line_table, loc)); + parser->seen_string_literal = true; return ret; } diff --git a/gcc/testsuite/g++.dg/spellcheck-inttypes.C b/gcc/testsuite/g++.dg/spellcheck-inttypes.C new file mode 100644 index 000000000000..c5861127ca6d --- /dev/null +++ b/gcc/testsuite/g++.dg/spellcheck-inttypes.C @@ -0,0 +1,41 @@ +/* { dg-options "-std=c++11" } */ +#include <cstdio> +#include <cstdint> +/* Missing <cinttypes>. */ + +int8_t i8; +int16_t i16; +int32_t i32; +int64_t i64; + +intptr_t ip; +uintptr_t up; + +/* As an identifier. */ +const char *hex8_fmt = PRIx8; /* { dg-error "'PRIx8' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx8' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hex16_fmt = PRIx16; /* { dg-error "'PRIx16' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx16' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hex32_fmt = PRIx32; /* { dg-error "'PRIx32' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx32' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hex64_fmt = PRIx64; /* { dg-error "'PRIx64' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx64' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hexptr_fmt = PRIxPTR; /* { dg-error "'PRIxPTR' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIxPTR' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ + +void test_printf (void) +{ + printf ("some format strings %s, %s, %s, %s, %s, %s\n", + PRId8, /* { dg-error "'PRId8' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRId8' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ + PRIi16, /* { dg-error "'PRIi16' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIi16' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ + PRIo32, /* { dg-error "'PRIo32' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIo32' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ + PRIu64, /* { dg-error "'PRIu64' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIu64' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ + PRIx32, /* { dg-error "'PRIx32' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx32' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ + PRIoPTR); /* { dg-error "'PRIoPTR' was not declared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIoPTR' is defined in header '<cinttypes>'; did you forget to '#include <cinttypes>'?" "replacement note" { target *-*-* } .-1 } */ +} diff --git a/gcc/testsuite/gcc.dg/spellcheck-inttypes.c b/gcc/testsuite/gcc.dg/spellcheck-inttypes.c new file mode 100644 index 000000000000..56dc2f3dcfd9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/spellcheck-inttypes.c @@ -0,0 +1,78 @@ +/* { dg-options "-std=c99" } */ +#include <stdio.h> +#include <stdint.h> +/* Missing <inttypes.h>. */ + +int8_t i8; +int16_t i16; +int32_t i32; +int64_t i64; + +intptr_t ip; +uintptr_t up; + +/* As an identifier. */ +const char *hex8_fmt = PRIx8; /* { dg-error "'PRIx8' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx8' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hex16_fmt = PRIx16; /* { dg-error "'PRIx16' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx16' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hex32_fmt = PRIx32; /* { dg-error "'PRIx32' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx32' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hex64_fmt = PRIx64; /* { dg-error "'PRIx64' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIx64' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *hexptr_fmt = PRIxPTR; /* { dg-error "'PRIxPTR' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIxPTR' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + +/* As a part of a string-literal. */ +const char *dec8msg_fmt = "Provide %" PRId8 "\n"; /* { dg-error "expected" "expected string-literal" { target *-*-* } } */ +/* { dg-message "'PRId8' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *dec16msg_fmt = "Provide %" PRId16 "\n"; /* { dg-error "expected" "expected string-literal" { target *-*-* } } */ +/* { dg-message "'PRId16' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *dec32msg_fmt = "Provide %" PRId32 "\n"; /* { dg-error "expected" "expected string-literal" { target *-*-* } } */ +/* { dg-message "'PRId32' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *dec64msg_fmt = "Provide %" PRId64 "\n"; /* { dg-error "expected" "expected string-literal" { target *-*-* } } */ +/* { dg-message "'PRId64' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +const char *decptrmsg_fmt = "Provide %" PRIdPTR "\n"; /* { dg-error "expected" "expected string-literal" { target *-*-* } } */ +/* { dg-message "'PRIdPTR' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + +void test_printf (void) +{ + printf ("some format strings %s, %s, %s, %s, %s, %s\n", + PRId8, /* { dg-error "'PRId8' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRId8' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + PRIi16, /* { dg-error "'PRIi16' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIi16' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + PRIo32, /* { dg-error "'PRIo32' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIo32' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + PRIu64, /* { dg-error "'PRIu64' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIu64' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + PRIoPTR); /* { dg-error "'PRIoPTR' undeclared" "undeclared identifier" { target *-*-* } } */ +/* { dg-message "'PRIoPTR' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + + printf ("%" PRIo8 "\n", i8); /* { dg-error "expected" } */ +/* { dg-message "'PRIo8' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + printf ("%" PRIo16 "\n", i16); /* { dg-error "expected" } */ +/* { dg-message "'PRIo16' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + printf ("%" PRIo32 "\n", i32); /* { dg-error "expected" } */ +/* { dg-message "'PRIo32' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + printf ("%" PRIo64 "\n", i64); /* { dg-error "expected" } */ +/* { dg-message "'PRIo64' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + printf ("%" PRIoPTR "\n", ip); /* { dg-error "expected" } */ +/* { dg-message "'PRIoPTR' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +} + +void test_scanf (void) +{ + scanf ("%" SCNu8 "\n", &i8); /* { dg-error "expected" } */ +/* { dg-message "'SCNu8' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + scanf ("%" SCNu16 "\n", &i16); /* { dg-error "expected" } */ +/* { dg-message "'SCNu16' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + scanf ("%" SCNu32 "\n", &i32); /* { dg-error "expected" } */ +/* { dg-message "'SCNu32' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + scanf ("%" SCNu64 "\n", &i64); /* { dg-error "expected" } */ +/* { dg-message "'SCNu64' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + scanf ("%" SCNuPTR "\n", &ip); /* { dg-error "expected" } */ +/* { dg-message "'SCNuPTR' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ + scanf ("%" SCNxPTR "\n", &up); /* { dg-error "expected" } */ +/* { dg-message "'SCNxPTR' is defined in header '<inttypes.h>'; did you forget to '#include <inttypes.h>'?" "replacement note" { target *-*-* } .-1 } */ +} -- 2.20.1