On Fri, Nov 13, 2020 at 5:27 PM Jeff Law <l...@redhat.com> wrote: > > > On 1/14/20 5:05 PM, Lewis Hyatt wrote: > > Hello- > > > > I thought I might ping this short patch please, just in case it may > > make sense to include in GCC 10 along with the other UTF-8-related > > fixes to diagnostics. Thanks! > > > > https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html > > This is fine for the trunk. Note that due to the changes to handle > tabs/control bytes will require this patch to be updated. It may be as > simple as moving the c = dw.next_byte() statement up. > > > Go ahead and do the necessary update and retest & repost the patch for > archival purposes. If you have commit privs, go ahead and commit the > updated patch, else indicate in the patch repost that someone needs to > apply it for you. > > > Thanks for your patience, > > Jeff > > > >> #1. diagnostic_show_locus() should be sure it will not corrupt output in > >> this way, regardless of what ranges it is given to work with. > > Yes. > > > >> > >> #2. libcpp should probably generate a range that includes the whole UTF-8 > >> character. Actually in other ways the range seems not ideal, for example > >> if an invalid character appears in the middle of the identifier, the > >> diagnostic still points to the first byte of the identifier. > > Probably. We haven't traditionally worried a lot about multitbyte > sequences, so I'm not surprised we're not handling them particularly well. > > > >> > >> The attached patch fixes #1. It's essentially a one-line change, plus a > >> new selftest. Would you please have a look at it sometime? bootstrap > >> and testsuite were done on linux x86-64. > >> > >> Other questions that I have: > >> > >> - I am not quite clear when a selftest is preferred vs a dejagnu test. In > >> this case I stuck with the selftest because color diagnostics don't seem > >> to work well with dg-error etc, and it didn't seem worth creating a new > >> plugin-based test like g++.dg/plugin just for this. (I also considered > >> using the existing g++.dg plugin, but it seems this test should run for > >> gcc as well.) > > It varies and there's cases that are fine in either and I suspect there > are many tests in the dejagnu suite that would be better as selftests -- > selftests are a fairly new concept. > > > The guidance I would give is the more a particular test is tied to the > internals of the code, the more likely a selftest is the right > approach. THe more the test needs an end-to-end run through passes of > the compiler, the more it belongs in the dejagnu suite. > > > > >> > >> - I wasn't sure if I should create a PR for an issue such as this, if > >> there is already a patch readily available. And if I did create a PR, > >> not sure if it's preferred to post the patch to gcc-patches, or as an > >> attachment to the PR. > > We still prefer patches to go to gcc-patches -- I personally don't troll > BZ looking for attached patches. > > > >> > >> - Does it seem worth me looking into #2? I think the patch to address #1 is > >> appropriate in any case, because it handles generically all potential > >> cases where this may arise, but still perhaps the ranges coming out of > >> libcpp could be improved? > > I don't think it can hurt to look into the difficulty in addressing #2. > > > jeff >
Thanks very much for the detailed comments, that's all very useful to me. This particular patch was subsumed by r11-2092, which added the support for tab expansion, since this whole function was redone and now handles multibyte correctly. Sorry I probably should have updated the thread for this old patch in addition to mentioning in the new one, to save you some time. I will try to take a look sometime at the ranges that libcpp outputs too. Thanks again! -Lewis