Hello,

In PR53690, a UCN is incorrectly interpreted in C++11 mode: libcpp
converts U'\U00000000' to 1, but its value should be 0.
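
For illustration, here is a minimal sketch of the expected behaviour, written as compile-time checks rather than as the dejagnu test in the patch below. The helper name nul_ucns_are_zero is hypothetical and only exists for this example. It assumes a compiler in C++11 mode or later; with the bug present, the first and third static_assert would be the ones expected to fire.

```cpp
#include <cassert>
#include <cstdint>

// Every spelling of the NUL character must evaluate to 0,
// for both char32_t (U prefix) and char16_t (u prefix) literals.
static_assert(U'\U00000000' == 0, "8-digit UCN in a char32_t literal must be 0");
static_assert(U'\u0000' == 0, "4-digit UCN in a char32_t literal must be 0");
static_assert(u'\U00000000' == 0, "8-digit UCN in a char16_t literal must be 0");
static_assert(u'\u0000' == 0, "4-digit UCN in a char16_t literal must be 0");

// Run-time variant of the same check (hypothetical helper name).
inline bool nul_ucns_are_zero() {
    const std::uint32_t wide[] = { U'\U00000000', U'\u0000', U'\x00', U'\0' };
    const std::uint16_t narrow[] = { u'\U00000000', u'\u0000', u'\x00', u'\0' };
    for (std::uint32_t v : wide)
        if (v != 0) return false;
    for (std::uint16_t v : narrow)
        if (v != 0) return false;
    return true;
}
```

Unlike a tree-dump scan, this fails at compile time and names the offending literal.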

The reason is that _cpp_valid_ucn converts every 0 result to 1. I am
not 100% sure why (there is no comment and the code has been like that
since the initial checkin). In C99, UCNs for characters below 0xa0
(other than $, @ and `) are not allowed, so perhaps _cpp_valid_ucn
returned 1 for a 0 UCN because it is invalid in C and it was deemed
better to return a non-NUL character as an error than '\0'.

In any case, it's valid for C++11. Jason modified charset.c to
implement the C++11 change and handle the differences between C99 and
C++11 (*) but I think he overlooked the two lines at the bottom that
convert a 0 result to 1.
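
To make the argument concrete, here is a simplified model of the logic as I understand it. This is not the libcpp source: model_valid_ucn and its parameters are hypothetical names, identifier handling is left out entirely, and it assumes that the error path already substitutes 1 for a rejected UCN, which is what makes the trailing conversion redundant.

```cpp
#include <cstdint>

// Simplified, hypothetical model of the range check; NOT the libcpp code.
// Returns the code point, or 1 as a non-NUL placeholder when the UCN
// is rejected.
inline std::uint32_t model_valid_ucn(std::uint32_t value, bool cplusplus) {
    // C99 forbids UCNs below 0xA0 except for $, @ and `.
    const bool c99_exempt = value == 0x24 || value == 0x40 || value == 0x60;
    const bool below_a0_rejected = value < 0xA0 && !cplusplus && !c99_exempt;
    // Surrogate code points are never valid UCNs.
    const bool surrogate = value >= 0xD800 && value <= 0xDFFF;

    if (below_a0_rejected || surrogate)
        return 1;   // assumed error path: substitute a non-NUL value

    // Patched behaviour: no "if (result == 0) result = 1;" here,
    // so a valid 0 in C++11 mode stays 0.
    return value;
}
```

In this model a 0 UCN still yields 1 in C mode, through the error path, and yields 0 in C++ mode.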

The attached patch was bootstrapped and tested on
powerpc64-unknown-linux-gnu. Does it make sense enough for an OK? :-)

Ciao!
Steven



(*) see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2170.html
and http://gcc.gnu.org/viewcvs?view=revision&revision=152614

libcpp/
        PR preprocessor/53690
        * charset.c (_cpp_valid_ucn): Do not convert a 0 result to 1;
        the C99 path for values < 0xa0 is handled earlier since r152614.

testsuite/
        PR preprocessor/53690
        * g++.dg/pr53690.C: New test.

Index: libcpp/charset.c
===================================================================
--- libcpp/charset.c    (revision 189358)
+++ libcpp/charset.c    (working copy)
@@ -1071,9 +1071,6 @@ _cpp_valid_ucn (cpp_reader *pfile, const uchar **p
                   (int) (str - base), base);
     }

-  if (result == 0)
-    result = 1;
-
   return result;
 }

Index: gcc/testsuite/g++.dg/pr53690.C
===================================================================
--- gcc/testsuite/g++.dg/pr53690.C      (revision 0)
+++ gcc/testsuite/g++.dg/pr53690.C      (revision 0)
@@ -0,0 +1,25 @@
+// { dg-do compile }
+// { dg-options "-std=c++11 -fdump-tree-original" }
+
+extern "C" int printf (__const char *__restrict __format, ...);
+
+typedef unsigned short uint16_t;
+typedef unsigned int uint32_t;
+
+int main() {
+    uint32_t a = U'\U00000000';
+    uint32_t b = U'\u0000';
+    uint32_t c = U'\x00';
+    uint32_t d = U'\0';
+
+    uint16_t e = u'\U00000000';
+    uint16_t f = u'\u0000';
+    uint16_t g = u'\x00';
+    uint16_t h = u'\0';
+
+    printf("%x %x %x %x %x %x %x %x\n", a, b, c, d, e, f, g, h);
+
+    return 0;
+}
+
+// { dg-final { scan-tree-dump-not "= 1" "original" } }
