On Fri, 2024-10-18 at 13:58 -0400, Lewis Hyatt wrote: > On Fri, Oct 18, 2024 at 11:25 AM David Malcolm <dmalc...@redhat.com> > wrote: > > > if (!pfile->cb.diagnostic) > > > abort (); > > > - ret = pfile->cb.diagnostic (pfile, level, reason, richloc, > > > _(msgid), ap); > > > - > > > - return ret; > > > + if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE) > > > + { > > > + rich_location rc2{pfile->line_table, pfile- > > > > diagnostic_override_loc}; > > > + return pfile->cb.diagnostic (pfile, level, reason, &rc2, > > > _(msgid), ap); > > > + } > > > > This will effectively override the primary location in the > > rich_location, but by using a second rich_location instance it will > > also ignore any secondary locations and fix-it hints. > > > > This might will be what we want here, but did you consider > > richloc.set_range (0, pfile->diagnostic_override_loc, > > SHOW_RANGE_WITH_CARET); > > to reset the primary location? > > > > Otherwise, looks good to me. > > > > [...snip...] > > > > Thanks > > Dave > > > > Thanks for taking a look! My thinking was, when libcpp produces > tokens > from a _Pragma string, basically every location_t that it generates > is > wrong and shouldn't be used. Because it doesn't actually set up the > line_map, it gets something random that's just sorta close to > reasonable. So I think it makes sense to discard fixits and secondary > locations too.
Fair enough. I think the patch is OK, then, assuming usual testing. > > libcpp does use rich_location pretty sparingly, but I suppose the > goal > is to use it more over time. We use one fixit hint for invalid > directive names currently, that one can't show up in a _Pragma > though. > Right now we do define an encoding_rich_location subclass for > escaping > unprintable bytes, which inherits rich_location and just adds a new > constructor to set the escape flag when it is created. You are > definitely right that this patch as of now loses that information. > > Here's a source that uses an improperly normalized character: > > _Pragma("ோ") > > Currently we output: > > t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [- > Wnormalized=] > 1 | _Pragma("<U+0BC7><U+0BBE>") > | ^~~~~~ > > With the patch we output: > t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [- > Wnormalized=] > 1 | _Pragma("ோ") > | ^~~~~~~ > > So the main location range is improved (it underlines all of _Pragma > instead of most of it), but we have indeed lost the intended feature > that the incorrect bytes are escaped on the source line. > > For this particular case I could improve it with a one line addition > like > > rc2.set_escape_on_output (richloc->escape_on_output_p ()); > > and that would actually handle all currently needed cases since we > don't use a lot of rich_locations in libcpp. It would just become > stale if some other option gets added to rich_location in the future > that we also should preserve. > I think to fix it in a fully general > way, it would be necessary to add a new interface to class > rich_location. It already has a method to delete all the fixit hints, > it would also need a method to delete all the ranges. Then we could > make a copy of the richloc and just delete everything we don't want > to > preserve. Do you have a preference one way or the other? Do we know which diagnostics are likely to be emitted when we override the location? I suspect that anywhere that's overriding the location is an awkward case where we're unlikely to have fix-it hints and secondary ranges. I don't have a strong preference here. > > By the way, your suggestion was to directly modify richloc... these > functions do take the richloc by non-const pointer, but is it OK to > modify it or is a copy needed? The functions like cpp_warning_at() > are > exposed in the public header file, although at the moment, all call > sites are within libcpp and don't look like they would notice if the > argument was modified. I wasn't sure what is the expected interface > here. The diagnostic-reporting functions take non-const rich_location because the Fortran frontend doesn't set the locations on its diagnostics, but instead uses formatting codes %C and %L which write back locations into the rich_location when pp_format is called. So there's some precedent for non-const rich_location, FWIW Thanks again for the patch. Dave