On 2023-03-24 06:18, Corinna Vinschen via Cygwin wrote:
On Mar 23 22:14, Corinna Vinschen via Cygwin wrote:
On Mar 23 15:48, Ken Brown via Cygwin wrote:
I'm reporting this here rather than the newlib list because the behavior is
compatible with Posix but not Linux, so I think it's a Cygwin issue.
Actually, it's a Windows issue :)
Consider the following test case:
$ cat locale_test.c
#include <stdio.h>
#include <locale.h>
int main ()
{
  const char *locale = "en_DE.UTF-8";
  locale_t loc = newlocale (LC_COLLATE_MASK | LC_CTYPE_MASK, locale, 0);
  if (!loc)
    perror ("newlocale");
  else
    printf ("newlocale succeeded on invalid locale %s\n", locale);
}
$ gcc -o locale_test locale_test.c
$ ./locale_test.exe
newlocale succeeded on invalid locale en_DE.UTF-8
On Linux, the newlocale call fails with ENOENT, as is documented on the man
page. Posix doesn't say what should happen on an invalid locale, so this is
not, strictly speaking, a bug.
Three bugs in fact.
First, it's a bug in the Emacs testsuite. The test simply assumes that
there's no en_DE locale on any system, but that's just not true.
Windows supports the RFC 5646 locale "en-DE", which is called "English
(Germany)" in the "Region" settings.
You can also check with `locale -av | less' and search for en_DE.
For the remainder of this mail, I assume you're talking about Cygwin 3.5.
I won't fix this for 3.4 anymore, given how much locale handling has
changed for 3.5.
The second bug is that Cygwin blindly trusts the Windows function
ResolveLocaleName(). That function blatantly converts even vaguely
similar locales into something it supports. E.g., it converts "en-XY"
to "en-US". I.e., even if you use "en_XY.utf8" as locale, the above
testcase will wrongly succeed. So I have to rethink how I resolve POSIX
locales to Windows locales.
Does Windows even consider https://www.rfc-editor.org/rfc/rfc4647 "Matching of
Language Tags", part of https://www.rfc-editor.org/info/bcp47 "Language Tags",
and if POSIX only matches exactly, will LANGUAGE be able to be used for fallback?
I currently define LANGUAGE=en_CA:en_GB:en in case en-CA is unsupported by
anything.
[I use my own en-CA locale not the glibc default created by https://rap.dk/.]
Will "-" be supported like "_" as a separator in values?
And the third bug is that Cygwin fails to set errno if it doesn't
support a locale, but that's a minor inconvenience in comparison.
Thanks for the report, I totally missed the above problem with
ResolveLocaleName.
I pushed a couple of patches which hopefully clean up the code. It's
really frustrating how these Windows locale functions work. Or, rather,
not work. I mean, come on...
- ResolveLocaleName() resolves "ff-BF" to "ff-Latn-SN", not to
"ff-Adlm-BF" or "ff-Latn-BF", even though both exist.
- There's a locale called "sd-Arab-PK" and a locale "sd-Deva-IN". If
you ask for the script used in "sd-IN", the result is "Arab", not
"Deva".
I had to create a replacement function for ResolveLocaleName which
doesn't return totally screwy and unexpected results, and special case
two more locales in /proc/locales output so the output makes sense.
Aha - a nice new 3.5.0 feature - as well as /proc/codesets - is that charsets
e.g. ISO-10646, etc. rather than encodings e.g. UTF-8, etc.!
FYI Google fixed their English L10N falling back to en-GB except US territories:
https://developer.android.com/guide/topics/resources/multilingual-support#postN
https://issuetracker.google.com/issues/64429534#comment6
and there have been similar issues posted for other languages.
Oh, and I added error handling to the code so newlocale is now able to
set errno to ENOENT if the locale is not supported.
If you want to test this, the changes are in test release
3.5.0-0.260.gb5b67a65f87c, which is just building.
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada
La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry