^ Caveat: you can represent the full range up to 0x10ffff in UTF-16; you just need two UTF-16 code units (a surrogate pair) for anything above 0xffff. Check out the latest version of unicode.c for an example of how.
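For anyone who wants to see the arithmetic without digging through unicode.c, here is a minimal sketch of surrogate-pair encoding and decoding. It is not the code from unicode.c; the function names utf16_encode and utf16_decode_pair are made up for this example, and the only thing it assumes is the standard UTF-16 layout (high surrogates D800..DBFF, low surrogates DC00..DFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit code units written to out (1 or 2),
 * or 0 if cp is a surrogate or lies above 0x10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                  /* BMP: one code unit */
        return 1;
    }
    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate: bottom 10 bits */
    return 2;
}

/* Hypothetical helper: combine a surrogate pair into a code point.
 * Precondition: hi is in D800..DBFF and lo is in DC00..DFFF. */
uint32_t utf16_decode_pair(uint16_t hi, uint16_t lo)
{
    assert(hi >= 0xD800 && hi <= 0xDBFF);
    assert(lo >= 0xDC00 && lo <= 0xDFFF);
    return 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}
```

With this sketch, 0x10ffff encodes to the two code units 0xDBFF 0xDFFF, which is the largest pair UTF-16 can express.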
On 02/22/2012 11:32 PM, Eric Blake wrote:
> On 02/22/2012 03:01 PM, Linda Walsh wrote:
>> My question had to do with an unqualified wint_t not
>> unsigned wint_t and what platform existed where an 'int' type or
>> wide-int_t, was, without qualifiers, unsigned. I still would like
>> to know -- and posix allows int/wide-ints to be unsigned without
>> the unsigned keyword?
>
> 'int' is signed, and at least 16 bits (these days, it's usually 32). It
> can also be written 'signed int'.
>
> 'unsigned int' is unsigned, and at least 16 bits (these days, it's
> usually 32).
>
> 'wchar_t' is an arbitrary integral type, either signed or unsigned, and
> capable of holding the value of all valid wide characters. It is
> possible to define a system where wchar_t and char are identical
> (limiting yourself to 256 valid characters), but that is not done in
> practice. More common are platforms that use 65536 characters (only the
> basic plane of Unicode) for 16 bits, or full Unicode (0 to 0x10ffff) for
> 32 bits. Platforms that use 65536 characters and 16-bit wchar_t must
> have wchar_t be unsigned; whereas platforms that have wchar_t wider than
> the largest valid character can choose signed or unsigned with no impact.
>
> 'wint_t' is an arbitrary integral type, either signed or unsigned, at
> least as wide as wchar_t, and capable of holding the value of all valid
> wide characters and the sentinel WEOF. Like wchar_t, it may hold values
> that are neither WEOF nor valid characters; and in fact, it is more
> likely to do so, since either wchar_t is saturated (all bit values are
> valid characters) and thus wint_t is a wider type, or wchar_t is sparse
> (as is the case with 32-bit wchar_t encoding Unicode), and the addition
> of WEOF to the set does not plug in the remaining sparse values; but
> using such values has unspecified results on any interface that takes a
> wint_t. WEOF only has to be distinct, it does not have to be negative.
>
> Don't think of it as 'wide-int', rather, think of it as 'the integral
> type that both contains wchar_t and WEOF'. You cannot write 'signed
> wint_t' nor 'unsigned wint_t'.
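
Since the signedness of wchar_t and wint_t is left to the platform, the simplest way to find out what a given system does is to ask the compiler. The following is a rough sketch using only standard <wchar.h> facilities (btowc, WEOF); the helper names wchar_is_signed, wint_is_signed, byte_to_wide and describe_wide_types are invented for illustration. Its output differs from platform to platform, so no expected output is shown.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helpers: 1 if the type is signed on this platform, else 0.
 * Converting -1 to an unsigned type yields its maximum value, which is not < 0. */
int wchar_is_signed(void) { return (wchar_t)-1 < (wchar_t)0; }
int wint_is_signed(void)  { return (wint_t)-1 < (wint_t)0; }

/* Hypothetical helper: widen one byte, treating WEOF purely as a sentinel.
 * Returns 1 and stores the wide character in *out, or 0 if btowc() gave WEOF. */
int byte_to_wide(int byte, wchar_t *out)
{
    wint_t w = btowc(byte);  /* WEOF for EOF or a byte that is not a valid character */
    if (w == WEOF)           /* compare for equality only; never test the sign */
        return 0;
    *out = (wchar_t)w;
    return 1;
}

/* Hypothetical helper: print what this platform chose for the wide types. */
void describe_wide_types(FILE *fp)
{
    assert(fp != NULL);
    fprintf(fp, "wchar_t: %zu bits, %s\n", sizeof(wchar_t) * CHAR_BIT,
            wchar_is_signed() ? "signed" : "unsigned");
    fprintf(fp, "wint_t:  %zu bits, %s\n", sizeof(wint_t) * CHAR_BIT,
            wint_is_signed() ? "signed" : "unsigned");
}
```

Calling describe_wide_types(stdout) reports the widths and signedness chosen by the platform. Note that byte_to_wide only ever compares against WEOF for equality, which matches the point that WEOF has to be distinct but need not be negative.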