Re: 16-bit wchar_t on Windows and Cygwin

2011-02-04 Thread Warren Young
On 2/2/2011 9:35 AM, Corinna Vinschen wrote: If only the ones who decided that wchar_t in Cygwin should have the same size as WCHAR_T in the underlying Windows would have thought twice about the implications... Cygwin 1.9? Or maybe 2.0, if it breaks ABIs?

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-03 Thread Bruno Haible
Corinna Vinschen wrote: > isn't wwchar_t equivalent to wint_t on all > platforms? On UCS-4 platforms sizeof(wint_t) == sizeof(wchar_t) == 4 > because there's no reason to make it bigger. On UCS-2 and UTF-16 > platforms sizeof(wint_t) == 4 because it must be able to hold EOF as > well. So, why no

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-03 Thread Corinna Vinschen
On Feb 3 01:12, Bruno Haible wrote: > Hi Eric, > > > I was asking: > > > > should wwchar_t (or xwchar_t, but not xchar_t) be 2-bytes on cygwin, but > > unlike the POSIX definition of wchar_t being always 1 character per > > unit, the new type is explicitly documented as being multi-unit on some

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Bruno Haible
Hi Eric, > I was asking: > > should wwchar_t (or xwchar_t, but not xchar_t) be 2-bytes on cygwin, but > unlike the POSIX definition of wchar_t being always 1 character per > unit, the new type is explicitly documented as being multi-unit on some > platforms but with sane semantics > > or should

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Eric Blake
On 02/02/2011 04:03 PM, Bruno Haible wrote: >> Are you thinking of making a sane wrapping around either 4-byte wchar_t >> or which maps to 2-byte wchar_t but sanely handles UTF-16 (which makes >> it a thin wrapper on both Linux and Cygwin, but needing more work on >> mingw), or are you thinking tha

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Bruno Haible
Hello Eric, > > Here's a new proposal: > > - Define a type 'wwchar_t' on all platforms, equivalent to uint32_t > > on Windows platforms and to 'wchar_t' otherwise. > > - Define functions 'mbrtowwc', 'iswwalpha', 'wwcwidth', and similar. > > Their definition will be a trivial redirectio
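The type part of the proposal quoted above can be sketched in a few lines of C. This is only an illustration: the name wwchar_t and the uint32_t/wchar_t split come from the message, while the choice of _WIN32 and __CYGWIN__ as the test for "Windows platforms" is an assumption made here, and none of the proposed functions (mbrtowwc, iswwalpha, wwcwidth) are shown.

```c
#include <stdint.h>
#include <wchar.h>

/* Sketch of the proposed type only. Assumption: "Windows platforms" is
   detected with the predefined macros _WIN32 and __CYGWIN__. */
#if defined _WIN32 || defined __CYGWIN__
/* wchar_t is a 16-bit UTF-16 unit here, so use a 32-bit type instead. */
typedef uint32_t wwchar_t;
#else
/* Elsewhere wchar_t is assumed to hold a full character already. */
typedef wchar_t wwchar_t;
#endif
```

With a definition along these lines, code written against wwchar_t sees one unit per character on either kind of platform, which is the property the thread says 16-bit wchar_t lacks.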

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Corinna Vinschen
On Feb 2 14:24, Eric Blake wrote: > [dropping coreutils at this point] > > On 02/02/2011 04:29 AM, Bruno Haible wrote: > > Good point. I agree then that overriding wchar_t should better not be > > done. > > > > Here's a new proposal: > > - Define a type 'wwchar_t' on all platforms, equivalent

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Eric Blake
[dropping coreutils at this point] On 02/02/2011 04:29 AM, Bruno Haible wrote: > Good point. I agree then that overriding wchar_t should better not be > done. > > Here's a new proposal: > - Define a type 'wwchar_t' on all platforms, equivalent to uint32_t > on Windows platforms and to 'wcha

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Andy Koppe
On 2 February 2011 16:35, Corinna Vinschen wrote: > On Feb  2 17:28, Corinna Vinschen wrote: >> On Feb  2 17:02, Bruno Haible wrote: >> > But if you say that the application should convert UTF-16 surrogates >> > to UTF-32 before calling iswalpha: That's certainly a requirement >> > for Cygwin 1.7.x

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Corinna Vinschen
On Feb 2 17:28, Corinna Vinschen wrote: > On Feb 2 17:02, Bruno Haible wrote: > > But if you say that the application should convert UTF-16 surrogates > > to UTF-32 before calling iswalpha: That's certainly a requirement > > for Cygwin 1.7.x applications that want to support the entire Unicode > >
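For readers unfamiliar with the conversion being discussed, here is a minimal sketch of combining a UTF-16 surrogate pair into a single UTF-32 code point, which an application would have to do before classifying the character. The helper name utf16_next is hypothetical and is not taken from Cygwin, gnulib, or any message in the thread; the arithmetic is the standard UTF-16 decoding rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): decode one code point from
   the UTF-16 units s[0..len-1] into *cp. Returns the number of units
   consumed (1 or 2), or 0 if len is 0. */
static size_t utf16_next(const uint16_t *s, size_t len, uint32_t *cp)
{
    if (len == 0)
        return 0;

    uint16_t hi = s[0];
    if (hi >= 0xD800 && hi <= 0xDBFF && len >= 2) {
        uint16_t lo = s[1];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            /* High surrogate followed by low surrogate: each carries
               10 bits of the code point, offset by 0x10000. */
            *cp = 0x10000 + (((uint32_t)(hi - 0xD800) << 10)
                             | (uint32_t)(lo - 0xDC00));
            return 2;
        }
    }

    /* BMP character, or an unpaired surrogate passed through unchanged. */
    *cp = hi;
    return 1;
}
```

For example, the units 0xD801 0xDC00 decode to U+10400 and consume two units. Whether the resulting value can then be passed to iswalpha() on a platform with 16-bit wchar_t is exactly the question the thread is debating.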

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Corinna Vinschen
Hi Bruno, On Feb 2 17:02, Bruno Haible wrote: > Hello Corinna, > > > And, please note the wording in SUSv4, for instance in > > http://calimero.vinschen.de/susv4/functions/iswalpha.html > > Likewise in POSIX:2008, at the URL > http://www.opengroup.org/onlinepubs/9699919799/functions/iswalpha.ht

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Bruno Haible
Hello Corinna, > And, please note the wording in SUSv4, for instance in > http://calimero.vinschen.de/susv4/functions/iswalpha.html Likewise in POSIX:2008, at the URL http://www.opengroup.org/onlinepubs/9699919799/functions/iswalpha.html > The wc argument is a wint_t, the value of which the ap

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Corinna Vinschen
On Feb 2 13:14, Corinna Vinschen wrote: > On Feb 2 12:29, Bruno Haible wrote: > > Hello Eric, > > > > > ... POSIX requires that 1 wchar_t corresponds to 1 character > > > ... > > > > What consequences does this have? > > > > > > > > 1) All code that uses the functions from <wctype.h> (wide character >

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Corinna Vinschen
On Feb 2 12:29, Bruno Haible wrote: > Hello Eric, > > > ... POSIX requires that 1 wchar_t corresponds to 1 character > > ... > > > What consequences does this have? > > > > > > 1) All code that uses the functions from <wctype.h> (wide character > > > classification and mapping) or wcwidth() malfunc

Re: 16-bit wchar_t on Windows and Cygwin

2011-02-02 Thread Bruno Haible
Hello Eric, > ... POSIX requires that 1 wchar_t corresponds to 1 character > ... > > What consequences does this have? > > > > 1) All code that uses the functions from <wctype.h> (wide character > > classification and mapping) or wcwidth() malfunctions on strings that > > contain Unicode chara

Re: 16-bit wchar_t on Windows and Cygwin

2011-01-31 Thread Corinna Vinschen
On Jan 31 09:58, Eric Blake wrote: > > 2) Code that uses mbrtowc() or wcrtomb() is also likely to malfunction. > > On Cygwin >= 1.7 mbrtowc() and wcrtomb() are implemented in an > > intelligent > > but somewhat surprising way: wcrtomb() may return 0, that is, produce > > no > > ou

Re: 16-bit wchar_t on Windows and Cygwin

2011-01-31 Thread Eric Blake
[adding cygwin and coreutils for a wc issue] On 01/30/2011 07:04 PM, Bruno Haible wrote: > Hi, > > It has been known for a long time that on native Windows, the wchar_t[] encoding of > strings is UTF-16. [1] Now, Corinna Vinschen has confirmed that it is the same > for Cygwin >= 1.7. [2] POSIX require