On 11/14/20 1:33 PM, Lewis Hyatt wrote:
> On Fri, Nov 13, 2020 at 5:27 PM Jeff Law <l...@redhat.com> wrote:
>>
>> On 1/14/20 5:05 PM, Lewis Hyatt wrote:
>>> Hello-
>>>
>>> I thought I might ping this short patch please, just in case it may
>>> make sense to include in GCC 10 along with the other UTF-8-related
>>> fixes to diagnostics. Thanks!
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html
>> This is fine for the trunk. Note that due to the changes to handle
>> tabs/control bytes will require this patch to be updated. It may be as
>> simple as moving the c = dw.next_byte() statement up.
>>
>>
>> Go ahead and do the necessary update and retest & repost the patch for
>> archival purposes. If you have commit privs, go ahead and commit the
>> updated patch, else indicate in the patch repost that someone needs to
>> apply it for you.
>>
>>
>> Thanks for your patience,
>>
>> Jeff
>>
>>
>>>> #1. diagnostic_show_locus() should be sure it will not corrupt output in
>>>> this way, regardless of what ranges it is given to work with.
>> Yes.
>>
>>
>>>> #2. libcpp should probably generate a range that includes the whole UTF-8
>>>> character. Actually in other ways the range seems not ideal, for example
>>>> if an invalid character appears in the middle of the identifier, the
>>>> diagnostic still points to the first byte of the identifier.
>> Probably. We haven't traditionally worried a lot about multitbyte
>> sequences, so I'm not surprised we're not handling them particularly well.
>>
>>
>>>> The attached patch fixes #1. It's essentially a one-line change, plus a
>>>> new selftest. Would you please have a look at it sometime? bootstrap
>>>> and testsuite were done on linux x86-64.
>>>>
>>>> Other questions that I have:
>>>>
>>>> - I am not quite clear when a selftest is preferred vs a dejagnu test. In
>>>> this case I stuck with the selftest because color diagnostics don't seem
>>>> to work well with dg-error etc, and it didn't seem worth creating a new
>>>> plugin-based test like g++.dg/plugin just for this. (I also considered
>>>> using the existing g++.dg plugin, but it seems this test should run for
>>>> gcc as well.)
>> It varies and there's cases that are fine in either and I suspect there
>> are many tests in the dejagnu suite that would be better as selftests --
>> selftests are a fairly new concept.
>>
>>
>> The guidance I would give is the more a particular test is tied to the
>> internals of the code, the more likely a selftest is the right
>> approach. THe more the test needs an end-to-end run through passes of
>> the compiler, the more it belongs in the dejagnu suite.
>>
>>
>>
>>>> - I wasn't sure if I should create a PR for an issue such as this, if
>>>> there is already a patch readily available. And if I did create a PR,
>>>> not sure if it's preferred to post the patch to gcc-patches, or as an
>>>> attachment to the PR.
>> We still prefer patches to go to gcc-patches -- I personally don't troll
>> BZ looking for attached patches.
>>
>>
>>>> - Does it seem worth me looking into #2? I think the patch to address #1 is
>>>> appropriate in any case, because it handles generically all potential
>>>> cases where this may arise, but still perhaps the ranges coming out of
>>>> libcpp could be improved?
>> I don't think it can hurt to look into the difficulty in addressing #2.
>>
>>
>> jeff
>>
> Thanks very much for the detailed comments, that's all very useful to
> me. This particular patch was subsumed by r11-2092, which added the
> support for tab expansion, since this whole function was redone and
> now handles multibyte correctly. Sorry I probably should have updated
> the thread for this old patch in addition to mentioning in the new
> one, to save you some time. I will try to take a look sometime at the
> ranges that libcpp outputs too. Thanks again!
No problem. I'm slogging my way through lots of old stuff. I can often
determine if something has been subsumed, but wasn't able to in this case.
Thanks,
Jeff