On Thursday, 14 March 2019 13:54:29 PDT NIkolai Marchenko wrote:
> I've posted about this issue (I think) on slack a bit earlier, see
> https://cpplang.slack.com/archives/C29936TQC/p1549899016010000

For those who can't read it, the suggestion was to use the /utf-8 option to
the compiler (with qmake, CONFIG += utf8_source). But a quick round of testing
does not show correct results.

For
    char16_t text1[] = u"" "\u0102";

It produces, without /utf-8 (see https://msvc.godbolt.org/z/EvtKzq):
    ?text1@@3PA_SA DB '?', 00H, 00H, 00H    ; text1

And with /utf-8:
    ?text1@@3PA_SA DB 0c4H, 00H, 01aH, ' ', 00H, 00H    ; text1

Those two values make no sense. U+0102 is neither 0x003f (question mark) nor
0x00c4 0x201a ("Ä‚"). The second pair is what you get when the UTF-8 bytes of
U+0102 (0xc4 0x82) are read back as Windows-1252 characters.

This is a clear compiler bug. An interpretation of the C++11 standard could
say that the translation is correct for the no-/utf-8 build, but with /utf-8
or /execution-charset:utf-8 it should have produced the correct result.

C++11 2.14.5 [lex.string]/13 (now 5.13.5/12 [1]) says:
"If one string-literal has no encoding-prefix, it is treated as a
string-literal of the same encoding-prefix as the other operand."

In table 9:
    u"a" "b" is the same as u"ab"

[1] http://eel.is/c++draft/lex.string#12

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products

_______________________________________________
Development mailing list
https://lists.qt-project.org/listinfo/development
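
The expectation drawn from [lex.string] above can be written down as a small check. This is only a sketch: text1 is the literal from the report, while text2, unit_count() and concatenation_is_correct() are hypothetical names introduced for illustration. It assumes a C++11 or later compiler. Going by the assembly listings quoted above, the check would be expected to fail on the affected MSVC builds, both with and without /utf-8.

```cpp
#include <cassert>
#include <cstddef>

// Literal from the report: an empty u"" literal followed by an unprefixed
// literal. Per [lex.string], the pair is treated as u"\u0102".
const char16_t text1[] = u"" "\u0102";

// Reference spelling, with the prefix on the literal that holds the character.
const char16_t text2[] = u"\u0102";

// Number of char16_t code units in an array, including the terminator.
template <std::size_t N>
constexpr std::size_t unit_count(const char16_t (&)[N])
{
    return N;
}

// Hypothetical helper: true when the concatenated literal holds exactly
// U+0102 followed by the null terminator.
bool concatenation_is_correct()
{
    return unit_count(text1) == 2
        && text1[0] == 0x0102
        && text1[1] == 0;
}
```

Calling concatenation_is_correct() from a test, or comparing text1 against text2 unit by unit, separates a concatenation problem from a general source-charset problem: if text2 is right and text1 is not, only the mixed-prefix concatenation is affected.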
