Hi Marc,

Marc Nieper-Wißkirchen wrote:
> On a system that supports at least C11, I can create an UTF8-encoded
> literal string through:
>
>   (uint8_t const *) u8"..."
>
> Could Gnulib abstract this into a macro so that substitutes for
> systems that do not have u8 string literals can be provided.
>
> On a C11 system, we would have
>
>   #define UTF8(s) ((uint8_t const *) u8 ## s)
>
> and similar definitions for UTF16 and UTF32.

Unfortunately, we cannot provide such macros. The reason is that the
translation from the source file's encoding to UTF-8/UTF-16/UTF-32 must
be done by the compiler, if you want to be able to write

  static uint8_t my_string[] = u8"Wißkirchen";

Your best bet is
  1) For UTF-8 encoded strings, ensure that your source code is UTF-8
     encoded, or use escapes, like in
     gnulib/tests/uniwidth/test-u8-width.c.
  2) For UTF-16 encoded strings, which you'll need only on Windows,
     write L"Wißkirchen". Or use hex codes, like in
     gnulib/tests/uniwidth/test-u16-width.c.
  3) Don't use UTF-32 encoded strings. Or use hex codes, like in
     gnulib/tests/uniwidth/test-u32-width.c.

> Similarly, something like
>
>   #define ASCII(s) (u8 ## s [0])
>
> for pre-C2x systems would be nice so that ASCII("c") expands into the
> ASCII code point of the character `c'.

What's the point of this one? Why not just write 'c'?

Bruno
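
For illustration, the escape approach from option 1 might look roughly
like the sketch below. It is not copied from the Gnulib test files; the
names name_utf8 and name_utf8_length are hypothetical, and it assumes
that uint8_t is a typedef for unsigned char, as it is on common
platforms.

```c
#include <stddef.h>
#include <stdint.h>

/* "Wißkirchen" as UTF-8 bytes. U+00DF (ß) is spelled out as the two
   bytes 0xC3 0x9F, so the result does not depend on the source file's
   encoding or on u8 literal support. The literal is split after the
   escape so that the following letters can never be read as part of
   the hex escape.
   Assumption: uint8_t is unsigned char, so a plain string literal is a
   valid initializer. */
static const uint8_t name_utf8[] = "Wi\xC3\x9F" "kirchen";

/* Hypothetical helper: number of bytes in name_utf8, excluding the
   terminating NUL. */
static size_t
name_utf8_length (void)
{
  return sizeof name_utf8 - 1;
}
```

The array holds 11 bytes plus the terminating NUL, because ß occupies
two bytes in UTF-8. The cost of this approach is readability: every
non-ASCII character has to be encoded by hand.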