On Fri, Oct 18, 2024 at 11:25 AM David Malcolm <dmalc...@redhat.com> wrote:
> >    if (!pfile->cb.diagnostic)
> >      abort ();
> > -  ret = pfile->cb.diagnostic (pfile, level, reason, richloc,
> > _(msgid), ap);
> > -
> > -  return ret;
> > +  if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE)
> > +    {
> > +      rich_location rc2{pfile->line_table, pfile-
> > >diagnostic_override_loc};
> > +      return pfile->cb.diagnostic (pfile, level, reason, &rc2,
> > _(msgid), ap);
> > +    }
>
> This will effectively override the primary location in the
> rich_location, but by using a second rich_location instance it will
> also ignore any secondary locations and fix-it hints.
>
> This might will be what we want here, but did you consider
>   richloc.set_range (0, pfile->diagnostic_override_loc,
>                      SHOW_RANGE_WITH_CARET);
> to reset the primary location?
>
> Otherwise, looks good to me.
>
> [...snip...]
>
> Thanks
> Dave
>

Thanks for taking a look! My thinking was, when libcpp produces tokens
from a _Pragma string, basically every location_t that it generates is
wrong and shouldn't be used. Because it doesn't actually set up the
line_map, it gets something random that's just sorta close to
reasonable. So I think it makes sense to discard fixits and secondary
locations too.

libcpp does use rich_location pretty sparingly, but I suppose the goal
is to use it more over time. We use one fixit hint for invalid
directive names currently, that one can't show up in a _Pragma though.
Right now we do define an encoding_rich_location subclass for escaping
unprintable bytes, which inherits rich_location and just adds a new
constructor to set the escape flag when it is created. You are
definitely right that this patch as of now loses that information.

Here's a source that uses an improperly normalized character:

_Pragma("ோ")

Currently we output:

t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-Wnormalized=]
    1 | _Pragma("<U+0BC7><U+0BBE>")
      | ^~~~~~

With the patch we output:
t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-Wnormalized=]
    1 | _Pragma("ோ")
      | ^~~~~~~

So the main location range is improved (it underlines all of _Pragma
instead of most of it), but we have indeed lost the intended feature
that the incorrect bytes are escaped on the source line.

For this particular case I could improve it with a one line addition like

rc2.set_escape_on_output (richloc->escape_on_output_p ());

and that would actually handle all currently needed cases since we
don't use a lot of rich_locations in libcpp. It would just become
stale if some other option gets added to rich_location in the future
that we also should preserve. I think to fix it in a fully general
way, it would be necessary to add a new interface to class
rich_location. It already has a method to delete all the fixit hints,
it would also need a method to delete all the ranges. Then we could
make a copy of the richloc and just delete everything we don't want to
preserve. Do you have a preference one way or the other?

By the way, your suggestion was to directly modify richloc... these
functions do take the richloc by non-const pointer, but is it OK to
modify it or is a copy needed? The functions like cpp_warning_at() are
exposed in the public header file, although at the moment, all call
sites are within libcpp and don't look like they would notice if the
argument was modified. I wasn't sure what is the expected interface
here.

Thanks again...

-Lewis

Reply via email to