Re: [PATCH] diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]

David Malcolm Fri, 18 Oct 2024 15:22:00 -0700

On Fri, 2024-10-18 at 13:58 -0400, Lewis Hyatt wrote:
> On Fri, Oct 18, 2024 at 11:25 AM David Malcolm <dmalc...@redhat.com>
> wrote:
> > >    if (!pfile->cb.diagnostic)
> > >      abort ();
> > > -  ret = pfile->cb.diagnostic (pfile, level, reason, richloc,
> > > _(msgid), ap);
> > > -
> > > -  return ret;
> > > +  if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE)
> > > +    {
> > > +      rich_location rc2{pfile->line_table, pfile-
> > > > diagnostic_override_loc};
> > > +      return pfile->cb.diagnostic (pfile, level, reason, &rc2,
> > > _(msgid), ap);
> > > +    }
> > 
> > This will effectively override the primary location in the
> > rich_location, but by using a second rich_location instance it will
> > also ignore any secondary locations and fix-it hints.
> > 
> > This might will be what we want here, but did you consider
> >   richloc.set_range (0, pfile->diagnostic_override_loc,
> >                      SHOW_RANGE_WITH_CARET);
> > to reset the primary location?
> > 
> > Otherwise, looks good to me.
> > 
> > [...snip...]
> > 
> > Thanks
> > Dave
> > 
> 
> Thanks for taking a look! My thinking was, when libcpp produces
> tokens
> from a _Pragma string, basically every location_t that it generates
> is
> wrong and shouldn't be used. Because it doesn't actually set up the
> line_map, it gets something random that's just sorta close to
> reasonable. So I think it makes sense to discard fixits and secondary
> locations too.


Fair enough.  I think the patch is OK, then, assuming usual testing.

> 
> libcpp does use rich_location pretty sparingly, but I suppose the
> goal
> is to use it more over time. We use one fixit hint for invalid
> directive names currently, that one can't show up in a _Pragma
> though.
> Right now we do define an encoding_rich_location subclass for
> escaping
> unprintable bytes, which inherits rich_location and just adds a new
> constructor to set the escape flag when it is created. You are
> definitely right that this patch as of now loses that information.
> 
> Here's a source that uses an improperly normalized character:
> 
> _Pragma("ோ")
> 
> Currently we output:
> 
> t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-
> Wnormalized=]
>     1 | _Pragma("<U+0BC7><U+0BBE>")
>       | ^~~~~~
> 
> With the patch we output:
> t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-
> Wnormalized=]
>     1 | _Pragma("ோ")
>       | ^~~~~~~
> 
> So the main location range is improved (it underlines all of _Pragma
> instead of most of it), but we have indeed lost the intended feature
> that the incorrect bytes are escaped on the source line.
> 
> For this particular case I could improve it with a one line addition
> like
> 
> rc2.set_escape_on_output (richloc->escape_on_output_p ());
> 
> and that would actually handle all currently needed cases since we
> don't use a lot of rich_locations in libcpp. It would just become
> stale if some other option gets added to rich_location in the future
> that we also should preserve. 
> I think to fix it in a fully general
> way, it would be necessary to add a new interface to class
> rich_location. It already has a method to delete all the fixit hints,
> it would also need a method to delete all the ranges. Then we could
> make a copy of the richloc and just delete everything we don't want
> to
> preserve. Do you have a preference one way or the other?

Do we know which diagnostics are likely to be emitted when we override
the location?  I suspect that anywhere that's overriding the location
is an awkward case where we're unlikely to have fix-it hints and
secondary ranges.

I don't have a strong preference here.

> 
> By the way, your suggestion was to directly modify richloc... these
> functions do take the richloc by non-const pointer, but is it OK to
> modify it or is a copy needed? The functions like cpp_warning_at()
> are
> exposed in the public header file, although at the moment, all call
> sites are within libcpp and don't look like they would notice if the
> argument was modified. I wasn't sure what is the expected interface
> here.

The diagnostic-reporting functions take non-const rich_location because
the Fortran frontend doesn't set the locations on its diagnostics, but
instead uses formatting codes %C and %L which write back locations into
the rich_location when pp_format is called.  So there's some precedent
for non-const rich_location, FWIW

Thanks again for the patch.
Dave

Re: [PATCH] diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]

Reply via email to