On Wed, 2016-08-03 at 09:59 -0600, Jeff Law wrote:
> On 07/29/2016 03:42 PM, Joseph Myers wrote:
> > On Tue, 26 Jul 2016, David Malcolm wrote:
> > 
> > > This patch implements precise tracking of source locations for
> > > the
> > > individual chars within string literals, so that we can e.g.
> > > underline
> > > specific ranges in -Wformat diagnostics.  It handles macros,
> > > concatenated tokens, escaped characters etc.
> > 
> > What if the string literal results from stringizing other tokens
> > (which
> > might have arisen in turn from macro expansion, including expansion
> > of
> > built-in macros not just those defined in source files, etc.)? 
> >  "You don't
> > get precise locations" would be a fine answer for such cases -
> > provided
> > there is good testsuite coverage of them to show they don't crash
> > the
> > compiler or underline nonsensical characters.
> I think losing precise locations in some circumstances would be fine
> as 
> well -- as long as we understand the limitations.

In v3 of the patch, this fails gracefully.

> And, yes, crashing or underlining nonsensical characters would be
> bad, 

The API in input.c is get_source_range_for_substring, which returns an
error message (intended for us, rather than end-users); it is wrapped
by this method in c-common.c:

/* Attempt to determine the source range of the substring.
   If successful, return NULL and write the source range to *OUT_RANGE.
   Otherwise return an error message.  Error messages are intended
   for GCC developers (to help debugging) rather than for end-users.  */

const char *
substring_loc::get_range (source_range *out_range) const

> so it'd be obviously good to test some of that to ensure the
> fallbacks 
> work as expected.

As for test coverage, v2 and v3 of the kit add over a thousand lines of
selftest code that heavily exercise string lexing, using the
 line_table_case machinery to run the tests with various interesting
boundary conditions with line_table (e.g. near
 LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES).

In terms of test coverage of the fallbacks, patch 2 of v3 of the kit
directly exercises the substr_loc.get_range in 
gcc.dg/plugin/diagnostic_plugin_test_string_literals.c via
gcc.dg/plugin/diagnostic-test-string-literals-1.c, and some of the
tests there cover the failures, via:

  error_at (strloc, "unable to read substring range: %s", err);

which we wouldn't do in a normal diagnostic (but which is appropriate
for testing the machinery itself).

Patch 3 of the v3 kit adds a format_warning_va function to c-format.c
which is responsible for dealing with failures:
https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00204.html

Sadly the comment got a bit mangled by git in that patch due to the
proximity to the deleted function location_column_from_byte_offset;
here's an inline copy (after patch 4, which adds param
CORRECTED_SUBSTRING for doing fix-it hints for bad format strings):

/* Emit a warning governed by option OPT, using GMSGID as the format
   string and AP as its arguments.

   Attempt to obtain precise location information within a string
   literal from FMT_LOC.

   Case 1: if substring location is available, and is within the range of
   the format string itself, the primary location of the
   diagnostic is the substring range obtained from FMT_LOC, with the
   caret at the *end* of the substring range.

   For example:

     test.c:90:10: warning: problem with '%i' here [-Wformat=]
     printf ("hello %i", msg);
                    ~^

   Case 2: if the substring location is available, but is not within
   the range of the format string, the primary location is that of the
   format string, and an note is emitted showing the substring location.

   For example:
     test.c:90:10: warning: problem with '%i' here [-Wformat=]
     printf("hello " INT_FMT " world", msg);
            ^~~~~~~~~~~~~~~~~~~~~~~~~
     test.c:19: note: format string is defined here
     #define INT_FMT "%i"
                      ~^

   Case 3: if precise substring information is unavailable, the primary
   location is that of the whole string passed to FMT_LOC's constructor.
   For example:

     test.c:90:10: warning: problem with '%i' here [-Wformat=]
     printf(fmt, msg);
            ^~~

   For each of cases 1-3, if param_range is non-NULL, then it is used
   as a secondary range within the warning.  For example, here it
   is used with case 1:

     test.c:90:16: warning: '%s' here but arg 2 has 'long' type [-Wformat=]
     printf ("foo %s bar", long_i + long_j);
                  ~^       ~~~~~~~~~~~~~~~

   and here with case 2:

     test.c:90:16: warning: '%s' here but arg 2 has 'long' type [-Wformat=]
     printf ("foo " STR_FMT " bar", long_i + long_j);
             ^~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~
     test.c:89:16: note: format string is defined here
     #define STR_FMT "%s"
                      ~^

   and with case 3:

     test.c:90:10: warning: '%i' here, but arg 2 is "const char *' [-Wformat=]
     printf(fmt, msg);
            ^~~  ~~~

   If CORRECTED_SUBSTRING is non-NULL, use it for cases 1 and 2 to provide
   a fix-it hint, suggesting that it should replace the text within the
   substring range.  For example:

     test.c:90:10: warning: problem with '%i' here [-Wformat=]
     printf ("hello %i", msg);
                    ~^
                    %s

   Return true if a warning was emitted, false otherwise.  */

ATTRIBUTE_GCC_DIAG (5,0)
static bool
format_warning_va (const substring_loc &fmt_loc, source_range *param_range,
                   const char *corrected_substring,
                   int opt, const char *gmsgid, va_list *ap)

etc


Looking at patch 3, there's a fair amount of end-to-end testing in
 gcc.dg/format/diagnostic-ranges.c but it looks like I forgot to add an
end-to-end test there of failure due to stringification; I can add one.
 Is the rest of the v3 patch kit reviewable?

Thanks
Dave

Reply via email to