Hello,

In PR53690, a UCN is interpreted incorrectly in C++11 mode: the value of U'\U00000000' should be 0, but libcpp converts it to 1.

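For reference, the expected behaviour can be written down as a small compile-time check. This is only a sketch and not part of the patch; the variable names are made up for illustration. It assumes a C++11 compiler and can be checked with something like g++ -std=c++11 -fsyntax-only. With the bug present, the UCN assertions are the ones expected to fail.

```cpp
// Standalone illustration, not part of the patch.
// All spellings below denote the null character, so each value must be 0.
constexpr char32_t wide_long_ucn  = U'\U00000000';  // the case reported in PR53690
constexpr char32_t wide_short_ucn = U'\u0000';
constexpr char32_t wide_hex       = U'\x00';
constexpr char32_t wide_octal     = U'\0';

constexpr char16_t narrow_long_ucn  = u'\U00000000';
constexpr char16_t narrow_short_ucn = u'\u0000';

// The hex and octal escapes do not go through the UCN path; they serve
// as a baseline for what the UCN spellings should evaluate to.
static_assert(wide_long_ucn == wide_octal, "\\U00000000 must equal \\0");
static_assert(wide_short_ucn == wide_hex, "\\u0000 must equal \\x00");
static_assert(narrow_long_ucn == 0, "u'\\U00000000' must be 0");
static_assert(narrow_short_ucn == 0, "u'\\u0000' must be 0");
```
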
The reason is that _cpp_valid_ucn unconditionally converts every 0 result to 1. I am not 100% sure why that is (there is no comment, and the code has been like that since the initial checkin). In C99, UCNs for characters below 0xa0 are not allowed (apart from $, @ and `), so perhaps _cpp_valid_ucn returned 1 for a 0 UCN because it is invalid in C, and it was deemed better to return a non-NUL character after the error than '\0'. In any case, it is valid in C++11.

Jason modified charset.c to implement the C++11 change and to handle the differences between C99 and C++11 (*), but I think he overlooked the two lines at the bottom that convert a 0 result to 1.

The attached patch was bootstrapped and tested on powerpc64-unknown-linux-gnu. Does it make sense enough for an OK? :-)

Ciao!
Steven

(*) see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2170.html
    and http://gcc.gnu.org/viewcvs?view=revision&revision=152614


libcpp/
	PR preprocessor/53690
	* charset.c (_cpp_valid_ucn): If result == 0, return 0; the C99
	path for values < 0xa0 is handled earlier since r152614.

testsuite/
	PR preprocessor/53690
	* g++.dg/pr53690.C: New test.

Index: libcpp/charset.c
===================================================================
--- libcpp/charset.c	(revision 189358)
+++ libcpp/charset.c	(working copy)
@@ -1071,9 +1071,6 @@ _cpp_valid_ucn (cpp_reader *pfile, const uchar **p
 		 (int) (str - base), base);
     }
 
-  if (result == 0)
-    result = 1;
-
   return result;
 }
 
Index: gcc/testsuite/g++.dg/pr53690.C
===================================================================
--- gcc/testsuite/g++.dg/pr53690.C	(revision 0)
+++ gcc/testsuite/g++.dg/pr53690.C	(revision 0)
@@ -0,0 +1,25 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+extern "C" int printf (__const char *__restrict __format, ...);
+
+typedef unsigned short uint16_t;
+typedef unsigned int uint32_t;
+
+int main() {
+  uint32_t a = U'\U00000000';
+  uint32_t b = U'\u0000';
+  uint32_t c = U'\x00';
+  uint32_t d = U'\0';
+
+  uint16_t e = u'\U00000000';
+  uint16_t f = u'\u0000';
+  uint16_t g = u'\x00';
+  uint16_t h = u'\0';
+
+  printf("%x %x %x %x %x %x %x %x\n", a, b, c, d, e, f, g, h);
+
+  return 0;
+}
+
+// { dg-final { scan-tree-dump-not "= 1" "original" } }
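
To show the effect of the charset.c hunk in isolation, here is a small model of the last step of _cpp_valid_ucn before and after the patch. It is a sketch only, not libcpp code: cppchar_model_t, ucn_result_before and ucn_result_after are hypothetical names, and the real function performs its validity checks and diagnostics before reaching this point.

```cpp
// Hypothetical model of the tail of _cpp_valid_ucn; not libcpp code.
typedef unsigned int cppchar_model_t;  // stand-in for libcpp's cppchar_t

// Before the patch: a zero result was replaced by 1 unconditionally.
constexpr cppchar_model_t ucn_result_before(cppchar_model_t result) {
  return result == 0 ? 1 : result;
}

// After the patch: the decoded value is returned unchanged.
constexpr cppchar_model_t ucn_result_after(cppchar_model_t result) {
  return result;
}

// Only the value 0 is affected by the change.
static_assert(ucn_result_before(0) == 1, "old code turned 0 into 1");
static_assert(ucn_result_after(0) == 0, "patched code keeps 0");
static_assert(ucn_result_before(0x41) == ucn_result_after(0x41),
              "non-zero values are unaffected");
```

Every non-zero value passes through both versions unchanged, so the patch should only alter the result for a UCN that decodes to 0.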