And on the upside, if they do ever give in and allow registration of family name characters, we may get a wchar_t, schar_t, lwchar_t and a llwchar_t :) Just imagine a variable-length 64-bit char system.
Everything from Sumerian to Klingon in Unicode (though I think they already are, if not officially, or are being done). Oh god, what I really want now is bash in Klingon. :)) Just imagine a black background and glaring green text. I know what I'm doing tonight.

Check out (shakes head in disbelief, while chuckling):

Ubuntu Klingon Translators
https://launchpad.net/~ubuntu-l10n-tlh

Expansion: Ubuntu Font should support pIqaD (Klingon)
https://bugs.launchpad.net/ubuntu/+source/ubuntu-font-family-sources/+bug/650729

On 02/23/2012 04:54 AM, Eric Blake wrote:
> On 02/22/2012 07:43 PM, John Kearney wrote:
>> ^ caveat: you can represent the full 0x10ffff in UTF-16, you just
>> need 2 UTF-16 characters. Check out the latest version of
>> unicode.c for an example how.
>
> Yes, and Cygwin actually does this.
>
> A strict reading of POSIX states that wchar_t must be wide enough
> for all supported characters, technically limiting things to just
> the basic plane if you have 16-bit wchar_t and a POSIX-compliant
> app. But Cygwin has exploited a loophole in the POSIX wording -
> POSIX does not require that all bit patterns are valid characters.
> So the actual Cygwin implementation is that on paper, rather than
> representing all 65536 patterns as valid characters, the values
> used in surrogate halves (0xd800 to 0xdfff) are listed as
> non-characters (so the use of them triggers undefined behavior per
> POSIX), but actually using them treats them as surrogate pairs
> (leading to the full Unicode character set, but reintroducing the
> headaches that multibyte characters had with 'char', but now with
> wchar_t, where you are back to dealing with variable-sized
> character elements).
>
> Furthermore, the mess of 16-bit vs. 32-bit wchar_t is one of the
> reasons why C11 has introduced two new character types, 16-bit and
> 32-bit characters, designed to fully map to the full Unicode set,
> regardless of what size wchar_t is. It will be interesting to see
> how the next version of POSIX takes the additions of C11 and
> retrofits the other wide-character functions in POSIX but not C99
> to handle the new character types.
>
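
For anyone following along, the surrogate-pair trick discussed in the quoted text can be sketched in a few lines of C. This is only a minimal illustration, not the code from unicode.c and not what Cygwin does internally. The function names utf16_encode and utf16_decode_pair are made up for this example, and it uses the fixed-width uint16_t/uint32_t types from <stdint.h> as stand-ins rather than wchar_t or the new C11 character types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit units written to out (1 or 2), or 0 if
 * cp is not encodable (a surrogate value, or above 0x10FFFF). */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    if (cp < 0x10000) {
        /* Basic plane: a single 16-bit unit is enough. */
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Supplementary planes: split the remaining 20 bits into two halves. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate  */
    return 2;
}

/* Hypothetical helper: combine a high/low surrogate pair back into
 * the code point it represents. */
uint32_t utf16_decode_pair(uint16_t hi, uint16_t lo)
{
    assert(hi >= 0xD800 && hi <= 0xDBFF);
    assert(lo >= 0xDC00 && lo <= 0xDFFF);
    return 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}
```

With these, utf16_encode(0x1F600, buf) writes the pair 0xD83D 0xDE00 and returns 2, and utf16_decode_pair(0xD83D, 0xDE00) gives back 0x1F600. The point is the same one Eric makes: once you do this, a 16-bit element is no longer always a whole character.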