On Wed, 2016-08-03 at 09:59 -0600, Jeff Law wrote: > On 07/29/2016 03:42 PM, Joseph Myers wrote: > > On Tue, 26 Jul 2016, David Malcolm wrote: > > > > > This patch implements precise tracking of source locations for > > > the > > > individual chars within string literals, so that we can e.g. > > > underline > > > specific ranges in -Wformat diagnostics. It handles macros, > > > concatenated tokens, escaped characters etc. > > > > What if the string literal results from stringizing other tokens > > (which > > might have arisen in turn from macro expansion, including expansion > > of > > built-in macros not just those defined in source files, etc.)? > > "You don't > > get precise locations" would be a fine answer for such cases - > > provided > > there is good testsuite coverage of them to show they don't crash > > the > > compiler or underline nonsensical characters. > I think losing precise locations in some circumstances would be fine > as > well -- as long as we understand the limitations.
In v3 of the patch, this fails gracefully. > And, yes, crashing or underlining nonsensical characters would be > bad, The API in input.c is get_source_range_for_substring, which returns an error message (intended for us, rather than end-users); it is wrapped by this method in c-common.c: /* Attempt to determine the source range of the substring. If successful, return NULL and write the source range to *OUT_RANGE. Otherwise return an error message. Error messages are intended for GCC developers (to help debugging) rather than for end-users. */ const char * substring_loc::get_range (source_range *out_range) const > so it'd be obviously good to test some of that to ensure the > fallbacks > work as expected. As for test coverage, v2 and v3 of the kit add over a thousand lines of selftest code that heavily exercise string lexing, using the line_table_case machinery to run the tests with various interesting boundary conditions with line_table (e.g. near LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES). In terms of test coverage of the fallbacks, patch 2 of v3 of the kit directly exercises the substr_loc.get_range in gcc.dg/plugin/diagnostic_plugin_test_string_literals.c via gcc.dg/plugin/diagnostic-test-string-literals-1.c, and some of the tests there cover the failures, via: error_at (strloc, "unable to read substring range: %s", err); which we wouldn't do in a normal diagnostic (but which is appropriate for testing the machinery itself). Patch 3 of the v3 kit adds a format_warning_va function to c-format.c which is responsible for dealing with failures: https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00204.html Sadly the comment got a bit mangled by git in that patch due to the proximity to the deleted function location_column_from_byte_offset; here's an inline copy (after patch 4, which adds param CORRECTED_SUBSTRING for doing fix-it hints for bad format strings): /* Emit a warning governed by option OPT, using GMSGID as the format string and AP as its arguments. Attempt to obtain precise location information within a string literal from FMT_LOC. Case 1: if substring location is available, and is within the range of the format string itself, the primary location of the diagnostic is the substring range obtained from FMT_LOC, with the caret at the *end* of the substring range. For example: test.c:90:10: warning: problem with '%i' here [-Wformat=] printf ("hello %i", msg); ~^ Case 2: if the substring location is available, but is not within the range of the format string, the primary location is that of the format string, and an note is emitted showing the substring location. For example: test.c:90:10: warning: problem with '%i' here [-Wformat=] printf("hello " INT_FMT " world", msg); ^~~~~~~~~~~~~~~~~~~~~~~~~ test.c:19: note: format string is defined here #define INT_FMT "%i" ~^ Case 3: if precise substring information is unavailable, the primary location is that of the whole string passed to FMT_LOC's constructor. For example: test.c:90:10: warning: problem with '%i' here [-Wformat=] printf(fmt, msg); ^~~ For each of cases 1-3, if param_range is non-NULL, then it is used as a secondary range within the warning. For example, here it is used with case 1: test.c:90:16: warning: '%s' here but arg 2 has 'long' type [-Wformat=] printf ("foo %s bar", long_i + long_j); ~^ ~~~~~~~~~~~~~~~ and here with case 2: test.c:90:16: warning: '%s' here but arg 2 has 'long' type [-Wformat=] printf ("foo " STR_FMT " bar", long_i + long_j); ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~ test.c:89:16: note: format string is defined here #define STR_FMT "%s" ~^ and with case 3: test.c:90:10: warning: '%i' here, but arg 2 is "const char *' [-Wformat=] printf(fmt, msg); ^~~ ~~~ If CORRECTED_SUBSTRING is non-NULL, use it for cases 1 and 2 to provide a fix-it hint, suggesting that it should replace the text within the substring range. For example: test.c:90:10: warning: problem with '%i' here [-Wformat=] printf ("hello %i", msg); ~^ %s Return true if a warning was emitted, false otherwise. */ ATTRIBUTE_GCC_DIAG (5,0) static bool format_warning_va (const substring_loc &fmt_loc, source_range *param_range, const char *corrected_substring, int opt, const char *gmsgid, va_list *ap) etc Looking at patch 3, there's a fair amount of end-to-end testing in gcc.dg/format/diagnostic-ranges.c but it looks like I forgot to add an end-to-end test there of failure due to stringification; I can add one. Is the rest of the v3 patch kit reviewable? Thanks Dave