------- Additional Comments From joseph at codesourcery dot com 2005-02-22 02:22 -------
Subject: Re: UCNs not recognized in identifiers (c++/c99)

On Mon, 21 Feb 2005, neil at daikokuya dot co dot uk wrote:
> jsm28 at gcc dot gnu dot org wrote:-
>
> > * The greedy algorithm applies for lexing UCNs: for example,
> > a\U0000000z is three preprocessing tokens {a}{\}{U0000000z} (and
> > shouldn't get a diagnostic on lexing, presuming macros are defined
> > such that the eventual token sequence is valid).
>
> I'm not sure I agree with this: it would seem to be unnecessary
> extra work; further I suspect the user would benefit from it being
> pointed out he entered an ill-formed UCN rather than something random
> from the front end complaining about an unexpected backslash.
>
> The only case where you wouldn't get a syntax error from the
> front end, or an invalid escape in a literal, is with -E. I'm
> not sure lexing to the letter of the standard is worthwhile in
> this case, as the standard doesn't discuss -E.
>
> If you have an example where a compiled program is acceptable
> with multiple lexing tokens then I would agree with you.
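
#define a b(
#define b(x) q
int a\U0000000z );

With greedy lexing the last line is the tokens {int}{a}{\}{U0000000z}{)}{;}.
Then a expands to b(, the invocation b(\U0000000z) expands to q, and the
result is the valid declaration "int q;".
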
Greedy lexing is what the standard specifies for other token types, and I
don't think it makes sense for UCNs to differ. _cpp_valid_ucn would need
changing so that, for an incomplete UCN in an identifier, it returns
quietly instead of giving an error.
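
As a rough illustration of the behaviour described above, here is a minimal
sketch of a lexer that ends an identifier quietly at an incomplete UCN. It is
not cpplib code: ucn_length and next_token_length are hypothetical names,
whitespace and all other token kinds are reduced to single characters, and the
check of which code points are permitted in identifiers is left out.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's _cpp_valid_ucn.  If s points at a
   complete UCN (\uXXXX or \UXXXXXXXX) return its length in characters;
   otherwise return 0 quietly, without issuing any diagnostic.  */
static size_t ucn_length(const char *s)
{
    size_t digits, i;

    if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
        return 0;
    digits = (s[1] == 'u') ? 4 : 8;
    for (i = 0; i < digits; i++)
        if (!isxdigit((unsigned char)s[2 + i]))
            return 0;   /* incomplete UCN: let the caller end the token */
    return 2 + digits;
}

/* Return the length of the preprocessing token starting at s, or 0 at
   the end of the string.  Identifiers are lexed greedily and may
   contain complete UCNs; in this sketch every other character is a
   one-character token.  */
static size_t next_token_length(const char *s)
{
    size_t n = 0, u;

    if (*s == '\0')
        return 0;
    if (isalpha((unsigned char)*s) || *s == '_' || ucn_length(s) > 0) {
        for (;;) {
            if (isalnum((unsigned char)s[n]) || s[n] == '_')
                n++;
            else if ((u = ucn_length(s + n)) > 0)
                n += u;
            else
                break;  /* identifier ends here, no error reported */
        }
        return n;
    }
    return 1;
}
```

Calling next_token_length repeatedly on the text a\U0000000z gives token
lengths 1, 1 and 9, that is {a}{\}{U0000000z}, while a\U000000C0b is
consumed as a single 12-character identifier.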
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449