Lasse Collin reported in
<https://lists.gnu.org/archive/html/bug-gettext/2024-12/msg00111.html>
that the setlocale() override from GNU libintl does not support the
UTF-8 environment of native Windows correctly. That override is based
on the setlocale() override from gnulib, so let me add that support
here.
What I call the "UTF-8 environment of native Windows" is a way of
packaging an application (details are in [1]) in such a way that
GetACP() returns 65001, the codepage number for UTF-8.
In fact, there are apparently two variants of this mode:
- the legacy Windows settings variant: when you haven't ever
(or recently?) changed the system default locale of Windows 10,
- the modern Windows settings variant: when you have changed
the system default locale of Windows 10.
With the legacy Windows settings, the setlocale() function produces
locale names such as "English_United States.65001" or
"English_United States.utf8". With the modern Windows settings, it
produces "en_US.UTF-8" instead. (This is with both mingw and MSVC,
according to my testing.)
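To illustrate the three spellings, here is a minimal sketch of a check
on a setlocale() result. It is not part of the patches: the function
name locale_name_is_utf8 is made up for this example, and it assumes a
plain locale name with at most one ".codeset" suffix (no "@modifier",
no composite "LC_COLLATE=...;LC_CTYPE=..." string). The actual change
in lib/localcharset.c compares against a "CP"-prefixed buffer instead.

```c
#include <string.h>

/* Hypothetical helper, for illustration only.
   Return 1 if LOCALE_NAME (as returned by setlocale) ends in a codeset
   suffix that denotes UTF-8 on native Windows, 0 otherwise.  */
int
locale_name_is_utf8 (const char *locale_name)
{
  /* The codeset, if any, follows the last '.' in the name.  */
  const char *dot = strrchr (locale_name, '.');
  if (dot == NULL)
    return 0;

  const char *codeset = dot + 1;
  return (strcmp (codeset, "65001") == 0    /* legacy settings */
          || strcmp (codeset, "utf8") == 0  /* legacy settings */
          || strcmp (codeset, "UTF-8") == 0 /* modern settings */);
}
```

With this sketch, "English_United States.65001", "English_United States.utf8"
and "en_US.UTF-8" are all recognized, while "French_France.1252" and "C"
are not.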
The various locale-related modules of gnulib were never tested in
the UTF-8 environment. This series of patches adds support for it,
with unit tests.
[1]
<https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page>
2024-12-23 Bruno Haible <[email protected]>
mbrtowc tests: Test in the UTF-8 environment on native Windows.
* tests/test-mbrtowc-w32utf8.sh: New file.
* tests/test-mbrtowc-w32utf8.c: New file.
* modules/mbrtowc-tests (Files): Add these files and
m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-mbrtowc-w32utf8 and run
test-mbrtowc-w32utf8.sh.
2024-12-23 Bruno Haible <[email protected]>
setlocale tests: Test in the UTF-8 environment on native Windows.
* tests/test-setlocale-w32utf8.sh: New file.
* tests/test-setlocale-w32utf8.c: New file.
* modules/setlocale-tests (Files): Add these files and
m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-setlocale-w32utf8 and run
test-setlocale-w32utf8.sh.
setlocale: Support the UTF-8 environment on native Windows.
* lib/setlocale.c: Include <windows.h>.
(setlocale_unixlike): In the UTF-8 environment, append a suffix ".65001"
to the locale names passed to the native setlocale().
2024-12-23 Bruno Haible <[email protected]>
localename tests: Test in the UTF-8 environment on native Windows.
* tests/test-localename-w32utf8.sh: New file.
* tests/test-localename-w32utf8.c: New file.
* modules/localename-tests (Files): Add these files and
m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-localename-w32utf8 and run
test-localename-w32utf8.sh.
localename-unsafe: Support the UTF-8 environment on native Windows.
* lib/localename-unsafe.c (gl_locale_name_from_win32_LANGID): Append a
suffix ".UTF-8" to the result if GetACP() is UTF-8.
2024-12-23 Bruno Haible <[email protected]>
localcharset tests: Test in the UTF-8 environment on native Windows.
* m4/windows-rc.m4: New file.
* tests/test-localcharset-w32utf8.sh: New file.
* tests/test-localcharset-w32utf8.c: New file.
* tests/windows-utf8.rc: New file.
* tests/windows-utf8.manifest: New file.
* modules/localcharset-tests (Files): Add these files.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-localcharset-w32utf8 and run
test-localcharset-w32utf8.sh.
localcharset: Support the UTF-8 environment on native Windows.
* lib/localcharset.c (locale_charset): Recognize also the special case
of a setlocale() result that ends in ".UTF-8".
From 927a70e0853345315570f051fd6996cfeb7b4d96 Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:56:15 +0100
Subject: [PATCH 1/7] localcharset: Support the UTF-8 environment on native
Windows.
* lib/localcharset.c (locale_charset): Recognize also the special case
of a setlocale() result that ends in ".UTF-8".
---
ChangeLog | 6 ++++++
lib/localcharset.c | 6 ++++--
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index c294898828..1ac323da3e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-12-23 Bruno Haible <[email protected]>
+
+ localcharset: Support the UTF-8 environment on native Windows.
+ * lib/localcharset.c (locale_charset): Recognize also the special case
+ of a setlocale() result that ends in ".UTF-8".
+
2024-12-23 Bruno Haible <[email protected]>
setlocale tests: Add unit test for LC_MESSAGES handling.
diff --git a/lib/localcharset.c b/lib/localcharset.c
index bd3367477d..755645763d 100644
--- a/lib/localcharset.c
+++ b/lib/localcharset.c
@@ -939,8 +939,10 @@ locale_charset (void)
sprintf (buf, "CP%u", GetACP ());
}
/* For a locale name such as "French_France.65001", in Windows 10,
- setlocale now returns "French_France.utf8" instead. */
- if (strcmp (buf + 2, "65001") == 0 || strcmp (buf + 2, "utf8") == 0)
+ setlocale now returns "French_France.utf8" instead, or in the UTF-8
+ environment (with modern system settings) "fr_FR.UTF-8". */
+ if (strcmp (buf + 2, "65001") == 0 || strcmp (buf + 2, "utf8") == 0
+ || strcmp (buf + 2, "UTF-8") == 0)
codeset = "UTF-8";
else
{
--
2.43.0
From a5c87eca2b85c624582eabeb6b409dc6fb50bfbd Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:56:37 +0100
Subject: [PATCH 2/7] localcharset tests: Test in the UTF-8 environment on
native Windows.
* m4/windows-rc.m4: New file.
* tests/test-localcharset-w32utf8.sh: New file.
* tests/test-localcharset-w32utf8.c: New file.
* tests/windows-utf8.rc: New file.
* tests/windows-utf8.manifest: New file.
* modules/localcharset-tests (Files): Add these files.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-localcharset-w32utf8 and run
test-localcharset-w32utf8.sh.
---
ChangeLog | 12 ++++++
m4/windows-rc.m4 | 21 ++++++++++
modules/localcharset-tests | 16 ++++++++
tests/test-localcharset-w32utf8.c | 61 ++++++++++++++++++++++++++++++
tests/test-localcharset-w32utf8.sh | 7 ++++
tests/windows-utf8.manifest | 20 ++++++++++
tests/windows-utf8.rc | 9 +++++
7 files changed, 146 insertions(+)
create mode 100644 m4/windows-rc.m4
create mode 100644 tests/test-localcharset-w32utf8.c
create mode 100755 tests/test-localcharset-w32utf8.sh
create mode 100644 tests/windows-utf8.manifest
create mode 100644 tests/windows-utf8.rc
diff --git a/ChangeLog b/ChangeLog
index 1ac323da3e..bb9f076353 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,17 @@
2024-12-23 Bruno Haible <[email protected]>
+ localcharset tests: Test in the UTF-8 environment on native Windows.
+ * m4/windows-rc.m4: New file.
+ * tests/test-localcharset-w32utf8.sh: New file.
+ * tests/test-localcharset-w32utf8.c: New file.
+ * tests/windows-utf8.rc: New file.
+ * tests/windows-utf8.manifest: New file.
+ * modules/localcharset-tests (Files): Add these files.
+ (Depends-on): Add test-xfail.
+ (configure.ac): Invoke gl_WINDOWS_RC.
+ (Makefile.am): Arrange to compile test-localcharset-w32utf8 and run
+ test-localcharset-w32utf8.sh.
+
localcharset: Support the UTF-8 environment on native Windows.
* lib/localcharset.c (locale_charset): Recognize also the special case
of a setlocale() result that ends in ".UTF-8".
diff --git a/m4/windows-rc.m4 b/m4/windows-rc.m4
new file mode 100644
index 0000000000..8a4deb14b8
--- /dev/null
+++ b/m4/windows-rc.m4
@@ -0,0 +1,21 @@
+# windows-rc.m4
+# serial 1
+dnl Copyright (C) 2024 Free Software Foundation, Inc.
+dnl This file is free software; the Free Software Foundation
+dnl gives unlimited permission to copy and/or distribute it,
+dnl with or without modifications, as long as this notice is preserved.
+dnl This file is offered as-is, without any warranty.
+
+dnl Find the tool that "compiles" a Windows resource file (.rc) to an
+dnl object file.
+
+AC_DEFUN_ONCE([gl_WINDOWS_RC],
+[
+ AC_REQUIRE([AC_CANONICAL_HOST])
+ case "$host_os" in
+ mingw* | windows*)
+ dnl Check for a program that compiles Windows resource files.
+ AC_CHECK_TOOL([WINDRES], [windres])
+ ;;
+ esac
+])
diff --git a/modules/localcharset-tests b/modules/localcharset-tests
index 3f2dde6dfd..a171c0cfbf 100644
--- a/modules/localcharset-tests
+++ b/modules/localcharset-tests
@@ -1,11 +1,27 @@
Files:
tests/test-localcharset.c
+tests/test-localcharset-w32utf8.sh
+tests/test-localcharset-w32utf8.c
+tests/windows-utf8.rc
+tests/windows-utf8.manifest
+m4/windows-rc.m4
Depends-on:
setlocale
+test-xfail
configure.ac:
+gl_WINDOWS_RC
Makefile.am:
noinst_PROGRAMS += test-localcharset
test_localcharset_LDADD = $(LDADD) $(SETLOCALE_LIB)
+
+if OS_IS_NATIVE_WINDOWS
+TESTS += test-localcharset-w32utf8.sh
+noinst_PROGRAMS += test-localcharset-w32utf8
+test_localcharset_w32utf8_LDADD = $(LDADD) test-localcharset-windows-utf8.res $(SETLOCALE_LIB)
+test-localcharset-windows-utf8.res : $(srcdir)/windows-utf8.rc
+ $(WINDRES) -i $(srcdir)/windows-utf8.rc -o test-localcharset-windows-utf8.res --output-format=coff
+MOSTLYCLEANFILES += test-localcharset-windows-utf8.res
+endif
diff --git a/tests/test-localcharset-w32utf8.c b/tests/test-localcharset-w32utf8.c
new file mode 100644
index 0000000000..f40db9c397
--- /dev/null
+++ b/tests/test-localcharset-w32utf8.c
@@ -0,0 +1,61 @@
+/* Test of locale_charset() function
+ on native Windows in the UTF-8 environment.
+ Copyright (C) 2024 Free Software Foundation, Inc.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>. */
+
+/* Written by Bruno Haible <[email protected]>, 2024. */
+
+#include <config.h>
+
+#include "localcharset.h"
+
+#include <locale.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+
+#define WIN32_LEAN_AND_MEAN
+#include <windows.h>
+
+int
+main (void)
+{
+#ifdef _UCRT
+ unsigned int active_codepage = GetACP ();
+ if (!(active_codepage == 65001))
+ {
+ fprintf (stderr,
+ "The active codepage is %u, not 65001 as expected.\n"
+ "(This is normal on Windows older than Windows 10.)\n",
+ active_codepage);
+ exit (1);
+ }
+
+ setlocale (LC_ALL, "");
+ const char *lc = locale_charset ();
+ if (!(strcmp (lc, "UTF-8") == 0))
+ {
+ fprintf (stderr,
+ "locale_charset () is \"%s\", not \"UTF-8\" as expected.\n",
+ lc);
+ exit (1);
+ }
+
+ return 0;
+#else
+ fputs ("Skipping test: not using the UCRT runtime\n", stderr);
+ return 77;
+#endif
+}
diff --git a/tests/test-localcharset-w32utf8.sh b/tests/test-localcharset-w32utf8.sh
new file mode 100755
index 0000000000..1e6a95b545
--- /dev/null
+++ b/tests/test-localcharset-w32utf8.sh
@@ -0,0 +1,7 @@
+#!/bin/sh
+
+# Test the UTF-8 environment on native Windows.
+unset LC_ALL
+unset LC_CTYPE
+unset LANG
+${CHECKER} ./test-localcharset-w32utf8${EXEEXT}
diff --git a/tests/windows-utf8.manifest b/tests/windows-utf8.manifest
new file mode 100644
index 0000000000..3a43a70c6d
--- /dev/null
+++ b/tests/windows-utf8.manifest
@@ -0,0 +1,20 @@
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<!-- This file is in the public domain. -->
+
+<!-- This file is an application manifest that has the effect that in the
+ application, GetACP () == 65001 instead of e.g. 1252.
+ Documentation:
+ https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#activeCodePage
+ https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
+ XML schema that this file needs to obey:
+ https://learn.microsoft.com/en-us/windows/win32/sbscs/manifest-file-schema
+ It is supposed to work in Windows 10 version 1903 or newer,
+ when the UCRT runtime is in use (as opposed to old MSVCRT).
+-->
+<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
+ <application xmlns="urn:schemas-microsoft-com:asm.v3">
+ <windowsSettings>
+ <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
+ </windowsSettings>
+ </application>
+</assembly>
diff --git a/tests/windows-utf8.rc b/tests/windows-utf8.rc
new file mode 100644
index 0000000000..110241aa16
--- /dev/null
+++ b/tests/windows-utf8.rc
@@ -0,0 +1,9 @@
+/* This file is in the public domain. */
+
+/* This file is a resource definition file.
+ When compiled to an object file, it embeds the windows-utf8.manifest file,
+ that has the effect that in the application, GetACP () == 65001 instead
+ of e.g. 1252. */
+
+#include <winresrc.h> /* includes <winuser.h>, <winver.h> */
+CREATEPROCESS_MANIFEST_RESOURCE_ID RT_MANIFEST "windows-utf8.manifest"
--
2.43.0
From 9f7ff4f423cd805866cd4edef806c32393621df0 Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:56:57 +0100
Subject: [PATCH 3/7] localename-unsafe: Support the UTF-8 environment on
native Windows.
* lib/localename-unsafe.c (gl_locale_name_from_win32_LANGID): Append a
suffix ".UTF-8" to the result if GetACP() is UTF-8.
---
ChangeLog | 6 +
lib/localename-unsafe.c | 848 ++++++++++++++++++++--------------------
2 files changed, 433 insertions(+), 421 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index bb9f076353..d9f282c21e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-12-23 Bruno Haible <[email protected]>
+
+ localename-unsafe: Support the UTF-8 environment on native Windows.
+ * lib/localename-unsafe.c (gl_locale_name_from_win32_LANGID): Append a
+ suffix ".UTF-8" to the result if GetACP() is UTF-8.
+
2024-12-23 Bruno Haible <[email protected]>
localcharset tests: Test in the UTF-8 environment on native Windows.
diff --git a/lib/localename-unsafe.c b/lib/localename-unsafe.c
index 0a2654d8a3..7088616892 100644
--- a/lib/localename-unsafe.c
+++ b/lib/localename-unsafe.c
@@ -1502,6 +1502,8 @@ static
const char *
gl_locale_name_from_win32_LANGID (LANGID langid)
{
+ int is_utf8 = (GetACP () == 65001);
+
/* Activate the new code only when the GETTEXT_MUI environment variable is
set, for the time being, since the new code is not well tested. */
if (getenv ("GETTEXT_MUI") != NULL)
@@ -1512,10 +1514,12 @@ gl_locale_name_from_win32_LANGID (LANGID langid)
On Windows95/98/ME, GetLocaleInfoA returns some incorrect results.
But we don't need to support systems that are so old. */
if (GetLocaleInfoA (MAKELCID (langid, SORT_DEFAULT), LOCALE_SNAME,
- namebuf, sizeof (namebuf) - 1))
+ namebuf, sizeof (namebuf) - 1 - 6))
{
/* Convert it to a Unix locale name. */
gl_locale_name_canonicalize (namebuf);
+ if (is_utf8)
+ strcat (namebuf, ".UTF-8");
return namebuf;
}
}
@@ -1525,6 +1529,7 @@ gl_locale_name_from_win32_LANGID (LANGID langid)
Windows base (e.g. they have different character conversion facilities
that produce different results). */
/* Use our own table. */
+ #define N(name) (is_utf8 ? name ".UTF-8" : name)
{
int primary, sub;
@@ -1540,146 +1545,146 @@ gl_locale_name_from_win32_LANGID (LANGID langid)
case LANG_AFRIKAANS:
switch (sub)
{
- case SUBLANG_AFRIKAANS_SOUTH_AFRICA: return "af_ZA";
+ case SUBLANG_AFRIKAANS_SOUTH_AFRICA: return N("af_ZA");
}
- return "af";
+ return N("af");
case LANG_ALBANIAN:
switch (sub)
{
- case SUBLANG_ALBANIAN_ALBANIA: return "sq_AL";
+ case SUBLANG_ALBANIAN_ALBANIA: return N("sq_AL");
}
- return "sq";
+ return N("sq");
case LANG_ALSATIAN:
switch (sub)
{
- case SUBLANG_ALSATIAN_FRANCE: return "gsw_FR";
+ case SUBLANG_ALSATIAN_FRANCE: return N("gsw_FR");
}
- return "gsw";
+ return N("gsw");
case LANG_AMHARIC:
switch (sub)
{
- case SUBLANG_AMHARIC_ETHIOPIA: return "am_ET";
+ case SUBLANG_AMHARIC_ETHIOPIA: return N("am_ET");
}
- return "am";
+ return N("am");
case LANG_ARABIC:
switch (sub)
{
- case SUBLANG_ARABIC_SAUDI_ARABIA: return "ar_SA";
- case SUBLANG_ARABIC_IRAQ: return "ar_IQ";
- case SUBLANG_ARABIC_EGYPT: return "ar_EG";
- case SUBLANG_ARABIC_LIBYA: return "ar_LY";
- case SUBLANG_ARABIC_ALGERIA: return "ar_DZ";
- case SUBLANG_ARABIC_MOROCCO: return "ar_MA";
- case SUBLANG_ARABIC_TUNISIA: return "ar_TN";
- case SUBLANG_ARABIC_OMAN: return "ar_OM";
- case SUBLANG_ARABIC_YEMEN: return "ar_YE";
- case SUBLANG_ARABIC_SYRIA: return "ar_SY";
- case SUBLANG_ARABIC_JORDAN: return "ar_JO";
- case SUBLANG_ARABIC_LEBANON: return "ar_LB";
- case SUBLANG_ARABIC_KUWAIT: return "ar_KW";
- case SUBLANG_ARABIC_UAE: return "ar_AE";
- case SUBLANG_ARABIC_BAHRAIN: return "ar_BH";
- case SUBLANG_ARABIC_QATAR: return "ar_QA";
- }
- return "ar";
+ case SUBLANG_ARABIC_SAUDI_ARABIA: return N("ar_SA");
+ case SUBLANG_ARABIC_IRAQ: return N("ar_IQ");
+ case SUBLANG_ARABIC_EGYPT: return N("ar_EG");
+ case SUBLANG_ARABIC_LIBYA: return N("ar_LY");
+ case SUBLANG_ARABIC_ALGERIA: return N("ar_DZ");
+ case SUBLANG_ARABIC_MOROCCO: return N("ar_MA");
+ case SUBLANG_ARABIC_TUNISIA: return N("ar_TN");
+ case SUBLANG_ARABIC_OMAN: return N("ar_OM");
+ case SUBLANG_ARABIC_YEMEN: return N("ar_YE");
+ case SUBLANG_ARABIC_SYRIA: return N("ar_SY");
+ case SUBLANG_ARABIC_JORDAN: return N("ar_JO");
+ case SUBLANG_ARABIC_LEBANON: return N("ar_LB");
+ case SUBLANG_ARABIC_KUWAIT: return N("ar_KW");
+ case SUBLANG_ARABIC_UAE: return N("ar_AE");
+ case SUBLANG_ARABIC_BAHRAIN: return N("ar_BH");
+ case SUBLANG_ARABIC_QATAR: return N("ar_QA");
+ }
+ return N("ar");
case LANG_ARMENIAN:
switch (sub)
{
- case SUBLANG_ARMENIAN_ARMENIA: return "hy_AM";
+ case SUBLANG_ARMENIAN_ARMENIA: return N("hy_AM");
}
- return "hy";
+ return N("hy");
case LANG_ASSAMESE:
switch (sub)
{
- case SUBLANG_ASSAMESE_INDIA: return "as_IN";
+ case SUBLANG_ASSAMESE_INDIA: return N("as_IN");
}
- return "as";
+ return N("as");
case LANG_AZERI:
switch (sub)
{
- case 0x1e: return "az";
- case SUBLANG_AZERI_LATIN: return "az_AZ";
- case 0x1d: return "az@cyrillic";
- case SUBLANG_AZERI_CYRILLIC: return "az_AZ@cyrillic";
+ case 0x1e: return N("az");
+ case SUBLANG_AZERI_LATIN: return N("az_AZ");
+ case 0x1d: return N("az@cyrillic");
+ case SUBLANG_AZERI_CYRILLIC: return N("az_AZ@cyrillic");
}
- return "az";
+ return N("az");
case LANG_BASHKIR:
switch (sub)
{
- case SUBLANG_BASHKIR_RUSSIA: return "ba_RU";
+ case SUBLANG_BASHKIR_RUSSIA: return N("ba_RU");
}
- return "ba";
+ return N("ba");
case LANG_BASQUE:
switch (sub)
{
- case SUBLANG_BASQUE_BASQUE: return "eu_ES";
+ case SUBLANG_BASQUE_BASQUE: return N("eu_ES");
}
- return "eu"; /* Ambiguous: could be "eu_ES" or "eu_FR". */
+ return N("eu"); /* Ambiguous: could be "eu_ES" or "eu_FR". */
case LANG_BELARUSIAN:
switch (sub)
{
- case SUBLANG_BELARUSIAN_BELARUS: return "be_BY";
+ case SUBLANG_BELARUSIAN_BELARUS: return N("be_BY");
}
- return "be";
+ return N("be");
case LANG_BENGALI:
switch (sub)
{
- case SUBLANG_BENGALI_INDIA: return "bn_IN";
- case SUBLANG_BENGALI_BANGLADESH: return "bn_BD";
+ case SUBLANG_BENGALI_INDIA: return N("bn_IN");
+ case SUBLANG_BENGALI_BANGLADESH: return N("bn_BD");
}
- return "bn";
+ return N("bn");
case LANG_BRETON:
switch (sub)
{
- case SUBLANG_BRETON_FRANCE: return "br_FR";
+ case SUBLANG_BRETON_FRANCE: return N("br_FR");
}
- return "br";
+ return N("br");
case LANG_BULGARIAN:
switch (sub)
{
- case SUBLANG_BULGARIAN_BULGARIA: return "bg_BG";
+ case SUBLANG_BULGARIAN_BULGARIA: return N("bg_BG");
}
- return "bg";
+ return N("bg");
case LANG_BURMESE:
switch (sub)
{
- case SUBLANG_DEFAULT: return "my_MM";
+ case SUBLANG_DEFAULT: return N("my_MM");
}
- return "my";
+ return N("my");
case LANG_CAMBODIAN:
switch (sub)
{
- case SUBLANG_CAMBODIAN_CAMBODIA: return "km_KH";
+ case SUBLANG_CAMBODIAN_CAMBODIA: return N("km_KH");
}
- return "km";
+ return N("km");
case LANG_CATALAN:
switch (sub)
{
- case SUBLANG_CATALAN_SPAIN: return "ca_ES";
+ case SUBLANG_CATALAN_SPAIN: return N("ca_ES");
}
- return "ca";
+ return N("ca");
case LANG_CHEROKEE:
switch (sub)
{
- case SUBLANG_DEFAULT: return "chr_US";
+ case SUBLANG_DEFAULT: return N("chr_US");
}
- return "chr";
+ return N("chr");
case LANG_CHINESE:
switch (sub)
{
- case SUBLANG_CHINESE_TRADITIONAL: case 0x1f: return "zh_TW";
- case SUBLANG_CHINESE_SIMPLIFIED: case 0x00: return "zh_CN";
- case SUBLANG_CHINESE_HONGKONG: return "zh_HK"; /* traditional */
- case SUBLANG_CHINESE_SINGAPORE: return "zh_SG"; /* simplified */
- case SUBLANG_CHINESE_MACAU: return "zh_MO"; /* traditional */
+ case SUBLANG_CHINESE_TRADITIONAL: case 0x1f: return N("zh_TW");
+ case SUBLANG_CHINESE_SIMPLIFIED: case 0x00: return N("zh_CN");
+ case SUBLANG_CHINESE_HONGKONG: return N("zh_HK"); /* traditional */
+ case SUBLANG_CHINESE_SINGAPORE: return N("zh_SG"); /* simplified */
+ case SUBLANG_CHINESE_MACAU: return N("zh_MO"); /* traditional */
}
- return "zh";
+ return N("zh");
case LANG_CORSICAN:
switch (sub)
{
- case SUBLANG_CORSICAN_FRANCE: return "co_FR";
+ case SUBLANG_CORSICAN_FRANCE: return N("co_FR");
}
- return "co";
+ return N("co");
case LANG_CROATIAN: /* LANG_CROATIAN == LANG_SERBIAN == LANG_BOSNIAN
* What used to be called Serbo-Croatian
* should really now be two separate
@@ -1691,68 +1696,68 @@ gl_locale_name_from_win32_LANGID (LANGID langid)
switch (sub)
{
/* Croatian */
- case 0x00: return "hr";
- case SUBLANG_CROATIAN_CROATIA: return "hr_HR";
- case SUBLANG_CROATIAN_BOSNIA_HERZEGOVINA_LATIN: return "hr_BA";
+ case 0x00: return N("hr");
+ case SUBLANG_CROATIAN_CROATIA: return N("hr_HR");
+ case SUBLANG_CROATIAN_BOSNIA_HERZEGOVINA_LATIN: return N("hr_BA");
/* Serbian */
- case 0x1f: return "sr";
- case 0x1c: return "sr"; /* latin */
- case SUBLANG_SERBIAN_LATIN: return "sr_CS"; /* latin */
- case 0x09: return "sr_RS"; /* latin */
- case 0x0b: return "sr_ME"; /* latin */
- case 0x06: return "sr_BA"; /* latin */
- case 0x1b: return "sr@cyrillic";
- case SUBLANG_SERBIAN_CYRILLIC: return "sr_CS@cyrillic";
- case 0x0a: return "sr_RS@cyrillic";
- case 0x0c: return "sr_ME@cyrillic";
- case 0x07: return "sr_BA@cyrillic";
+ case 0x1f: return N("sr");
+ case 0x1c: return N("sr"); /* latin */
+ case SUBLANG_SERBIAN_LATIN: return N("sr_CS"); /* latin */
+ case 0x09: return N("sr_RS"); /* latin */
+ case 0x0b: return N("sr_ME"); /* latin */
+ case 0x06: return N("sr_BA"); /* latin */
+ case 0x1b: return N("sr@cyrillic");
+ case SUBLANG_SERBIAN_CYRILLIC: return N("sr_CS@cyrillic");
+ case 0x0a: return N("sr_RS@cyrillic");
+ case 0x0c: return N("sr_ME@cyrillic");
+ case 0x07: return N("sr_BA@cyrillic");
/* Bosnian */
- case 0x1e: return "bs";
- case 0x1a: return "bs"; /* latin */
- case SUBLANG_BOSNIAN_BOSNIA_HERZEGOVINA_LATIN: return "bs_BA"; /* latin */
- case 0x19: return "bs@cyrillic";
- case SUBLANG_BOSNIAN_BOSNIA_HERZEGOVINA_CYRILLIC: return "bs_BA@cyrillic";
+ case 0x1e: return N("bs");
+ case 0x1a: return N("bs"); /* latin */
+ case SUBLANG_BOSNIAN_BOSNIA_HERZEGOVINA_LATIN: return N("bs_BA"); /* latin */
+ case 0x19: return N("bs@cyrillic");
+ case SUBLANG_BOSNIAN_BOSNIA_HERZEGOVINA_CYRILLIC: return N("bs_BA@cyrillic");
}
- return "hr";
+ return N("hr");
case LANG_CZECH:
switch (sub)
{
- case SUBLANG_CZECH_CZECH_REPUBLIC: return "cs_CZ";
+ case SUBLANG_CZECH_CZECH_REPUBLIC: return N("cs_CZ");
}
- return "cs";
+ return N("cs");
case LANG_DANISH:
switch (sub)
{
- case SUBLANG_DANISH_DENMARK: return "da_DK";
+ case SUBLANG_DANISH_DENMARK: return N("da_DK");
}
- return "da";
+ return N("da");
case LANG_DARI:
/* FIXME: Adjust this when such locales appear on Unix. */
switch (sub)
{
- case SUBLANG_DARI_AFGHANISTAN: return "prs_AF";
+ case SUBLANG_DARI_AFGHANISTAN: return N("prs_AF");
}
- return "prs";
+ return N("prs");
case LANG_DIVEHI:
switch (sub)
{
- case SUBLANG_DIVEHI_MALDIVES: return "dv_MV";
+ case SUBLANG_DIVEHI_MALDIVES: return N("dv_MV");
}
- return "dv";
+ return N("dv");
case LANG_DUTCH:
switch (sub)
{
- case SUBLANG_DUTCH: return "nl_NL";
- case SUBLANG_DUTCH_BELGIAN: /* FLEMISH, VLAAMS */ return "nl_BE";
- case SUBLANG_DUTCH_SURINAM: return "nl_SR";
+ case SUBLANG_DUTCH: return N("nl_NL");
+ case SUBLANG_DUTCH_BELGIAN: /* FLEMISH, VLAAMS */ return N("nl_BE");
+ case SUBLANG_DUTCH_SURINAM: return N("nl_SR");
}
- return "nl";
+ return N("nl");
case LANG_EDO:
switch (sub)
{
- case SUBLANG_DEFAULT: return "bin_NG";
+ case SUBLANG_DEFAULT: return N("bin_NG");
}
- return "bin";
+ return N("bin");
case LANG_ENGLISH:
switch (sub)
{
@@ -1760,541 +1765,541 @@ gl_locale_name_from_win32_LANGID (LANGID langid)
* English was the language spoken in England.
* Oh well.
*/
- case SUBLANG_ENGLISH_US: return "en_US";
- case SUBLANG_ENGLISH_UK: return "en_GB";
- case SUBLANG_ENGLISH_AUS: return "en_AU";
- case SUBLANG_ENGLISH_CAN: return "en_CA";
- case SUBLANG_ENGLISH_NZ: return "en_NZ";
- case SUBLANG_ENGLISH_EIRE: return "en_IE";
- case SUBLANG_ENGLISH_SOUTH_AFRICA: return "en_ZA";
- case SUBLANG_ENGLISH_JAMAICA: return "en_JM";
- case SUBLANG_ENGLISH_CARIBBEAN: return "en_GD"; /* Grenada? */
- case SUBLANG_ENGLISH_BELIZE: return "en_BZ";
- case SUBLANG_ENGLISH_TRINIDAD: return "en_TT";
- case SUBLANG_ENGLISH_ZIMBABWE: return "en_ZW";
- case SUBLANG_ENGLISH_PHILIPPINES: return "en_PH";
- case SUBLANG_ENGLISH_INDONESIA: return "en_ID";
- case SUBLANG_ENGLISH_HONGKONG: return "en_HK";
- case SUBLANG_ENGLISH_INDIA: return "en_IN";
- case SUBLANG_ENGLISH_MALAYSIA: return "en_MY";
- case SUBLANG_ENGLISH_SINGAPORE: return "en_SG";
- }
- return "en";
+ case SUBLANG_ENGLISH_US: return N("en_US");
+ case SUBLANG_ENGLISH_UK: return N("en_GB");
+ case SUBLANG_ENGLISH_AUS: return N("en_AU");
+ case SUBLANG_ENGLISH_CAN: return N("en_CA");
+ case SUBLANG_ENGLISH_NZ: return N("en_NZ");
+ case SUBLANG_ENGLISH_EIRE: return N("en_IE");
+ case SUBLANG_ENGLISH_SOUTH_AFRICA: return N("en_ZA");
+ case SUBLANG_ENGLISH_JAMAICA: return N("en_JM");
+ case SUBLANG_ENGLISH_CARIBBEAN: return N("en_GD"); /* Grenada? */
+ case SUBLANG_ENGLISH_BELIZE: return N("en_BZ");
+ case SUBLANG_ENGLISH_TRINIDAD: return N("en_TT");
+ case SUBLANG_ENGLISH_ZIMBABWE: return N("en_ZW");
+ case SUBLANG_ENGLISH_PHILIPPINES: return N("en_PH");
+ case SUBLANG_ENGLISH_INDONESIA: return N("en_ID");
+ case SUBLANG_ENGLISH_HONGKONG: return N("en_HK");
+ case SUBLANG_ENGLISH_INDIA: return N("en_IN");
+ case SUBLANG_ENGLISH_MALAYSIA: return N("en_MY");
+ case SUBLANG_ENGLISH_SINGAPORE: return N("en_SG");
+ }
+ return N("en");
case LANG_ESTONIAN:
switch (sub)
{
- case SUBLANG_ESTONIAN_ESTONIA: return "et_EE";
+ case SUBLANG_ESTONIAN_ESTONIA: return N("et_EE");
}
- return "et";
+ return N("et");
case LANG_FAEROESE:
switch (sub)
{
- case SUBLANG_FAEROESE_FAROE_ISLANDS: return "fo_FO";
+ case SUBLANG_FAEROESE_FAROE_ISLANDS: return N("fo_FO");
}
- return "fo";
+ return N("fo");
case LANG_FARSI:
switch (sub)
{
- case SUBLANG_FARSI_IRAN: return "fa_IR";
+ case SUBLANG_FARSI_IRAN: return N("fa_IR");
}
- return "fa";
+ return N("fa");
case LANG_FINNISH:
switch (sub)
{
- case SUBLANG_FINNISH_FINLAND: return "fi_FI";
+ case SUBLANG_FINNISH_FINLAND: return N("fi_FI");
}
- return "fi";
+ return N("fi");
case LANG_FRENCH:
switch (sub)
{
- case SUBLANG_FRENCH: return "fr_FR";
- case SUBLANG_FRENCH_BELGIAN: /* WALLOON */ return "fr_BE";
- case SUBLANG_FRENCH_CANADIAN: return "fr_CA";
- case SUBLANG_FRENCH_SWISS: return "fr_CH";
- case SUBLANG_FRENCH_LUXEMBOURG: return "fr_LU";
- case SUBLANG_FRENCH_MONACO: return "fr_MC";
- case SUBLANG_FRENCH_WESTINDIES: return "fr"; /* Caribbean? */
- case SUBLANG_FRENCH_REUNION: return "fr_RE";
- case SUBLANG_FRENCH_CONGO: return "fr_CG";
- case SUBLANG_FRENCH_SENEGAL: return "fr_SN";
- case SUBLANG_FRENCH_CAMEROON: return "fr_CM";
- case SUBLANG_FRENCH_COTEDIVOIRE: return "fr_CI";
- case SUBLANG_FRENCH_MALI: return "fr_ML";
- case SUBLANG_FRENCH_MOROCCO: return "fr_MA";
- case SUBLANG_FRENCH_HAITI: return "fr_HT";
- }
- return "fr";
+ case SUBLANG_FRENCH: return N("fr_FR");
+ case SUBLANG_FRENCH_BELGIAN: /* WALLOON */ return N("fr_BE");
+ case SUBLANG_FRENCH_CANADIAN: return N("fr_CA");
+ case SUBLANG_FRENCH_SWISS: return N("fr_CH");
+ case SUBLANG_FRENCH_LUXEMBOURG: return N("fr_LU");
+ case SUBLANG_FRENCH_MONACO: return N("fr_MC");
+ case SUBLANG_FRENCH_WESTINDIES: return N("fr"); /* Caribbean? */
+ case SUBLANG_FRENCH_REUNION: return N("fr_RE");
+ case SUBLANG_FRENCH_CONGO: return N("fr_CG");
+ case SUBLANG_FRENCH_SENEGAL: return N("fr_SN");
+ case SUBLANG_FRENCH_CAMEROON: return N("fr_CM");
+ case SUBLANG_FRENCH_COTEDIVOIRE: return N("fr_CI");
+ case SUBLANG_FRENCH_MALI: return N("fr_ML");
+ case SUBLANG_FRENCH_MOROCCO: return N("fr_MA");
+ case SUBLANG_FRENCH_HAITI: return N("fr_HT");
+ }
+ return N("fr");
case LANG_FRISIAN:
switch (sub)
{
- case SUBLANG_FRISIAN_NETHERLANDS: return "fy_NL";
+ case SUBLANG_FRISIAN_NETHERLANDS: return N("fy_NL");
}
- return "fy";
+ return N("fy");
case LANG_FULFULDE:
/* Spoken in Nigeria, Guinea, Senegal, Mali, Niger, Cameroon, Benin. */
switch (sub)
{
- case SUBLANG_DEFAULT: return "ff_NG";
+ case SUBLANG_DEFAULT: return N("ff_NG");
}
- return "ff";
+ return N("ff");
case LANG_GAELIC:
switch (sub)
{
case 0x01: /* SCOTTISH */
/* old, superseded by LANG_SCOTTISH_GAELIC */
- return "gd_GB";
- case SUBLANG_IRISH_IRELAND: return "ga_IE";
+ return N("gd_GB");
+ case SUBLANG_IRISH_IRELAND: return N("ga_IE");
}
- return "ga";
+ return N("ga");
case LANG_GALICIAN:
switch (sub)
{
- case SUBLANG_GALICIAN_SPAIN: return "gl_ES";
+ case SUBLANG_GALICIAN_SPAIN: return N("gl_ES");
}
- return "gl";
+ return N("gl");
case LANG_GEORGIAN:
switch (sub)
{
- case SUBLANG_GEORGIAN_GEORGIA: return "ka_GE";
+ case SUBLANG_GEORGIAN_GEORGIA: return N("ka_GE");
}
- return "ka";
+ return N("ka");
case LANG_GERMAN:
switch (sub)
{
- case SUBLANG_GERMAN: return "de_DE";
- case SUBLANG_GERMAN_SWISS: return "de_CH";
- case SUBLANG_GERMAN_AUSTRIAN: return "de_AT";
- case SUBLANG_GERMAN_LUXEMBOURG: return "de_LU";
- case SUBLANG_GERMAN_LIECHTENSTEIN: return "de_LI";
+ case SUBLANG_GERMAN: return N("de_DE");
+ case SUBLANG_GERMAN_SWISS: return N("de_CH");
+ case SUBLANG_GERMAN_AUSTRIAN: return N("de_AT");
+ case SUBLANG_GERMAN_LUXEMBOURG: return N("de_LU");
+ case SUBLANG_GERMAN_LIECHTENSTEIN: return N("de_LI");
}
- return "de";
+ return N("de");
case LANG_GREEK:
switch (sub)
{
- case SUBLANG_GREEK_GREECE: return "el_GR";
+ case SUBLANG_GREEK_GREECE: return N("el_GR");
}
- return "el";
+ return N("el");
case LANG_GREENLANDIC:
switch (sub)
{
- case SUBLANG_GREENLANDIC_GREENLAND: return "kl_GL";
+ case SUBLANG_GREENLANDIC_GREENLAND: return N("kl_GL");
}
- return "kl";
+ return N("kl");
case LANG_GUARANI:
switch (sub)
{
- case SUBLANG_DEFAULT: return "gn_PY";
+ case SUBLANG_DEFAULT: return N("gn_PY");
}
- return "gn";
+ return N("gn");
case LANG_GUJARATI:
switch (sub)
{
- case SUBLANG_GUJARATI_INDIA: return "gu_IN";
+ case SUBLANG_GUJARATI_INDIA: return N("gu_IN");
}
- return "gu";
+ return N("gu");
case LANG_HAUSA:
switch (sub)
{
- case 0x1f: return "ha";
- case SUBLANG_HAUSA_NIGERIA_LATIN: return "ha_NG";
+ case 0x1f: return N("ha");
+ case SUBLANG_HAUSA_NIGERIA_LATIN: return N("ha_NG");
}
- return "ha";
+ return N("ha");
case LANG_HAWAIIAN:
/* FIXME: Do they mean Hawaiian ("haw_US", 1000 speakers)
or Hawaii Creole English ("cpe_US", 600000 speakers)? */
switch (sub)
{
- case SUBLANG_DEFAULT: return "cpe_US";
+ case SUBLANG_DEFAULT: return N("cpe_US");
}
- return "cpe";
+ return N("cpe");
case LANG_HEBREW:
switch (sub)
{
- case SUBLANG_HEBREW_ISRAEL: return "he_IL";
+ case SUBLANG_HEBREW_ISRAEL: return N("he_IL");
}
- return "he";
+ return N("he");
case LANG_HINDI:
switch (sub)
{
- case SUBLANG_HINDI_INDIA: return "hi_IN";
+ case SUBLANG_HINDI_INDIA: return N("hi_IN");
}
- return "hi";
+ return N("hi");
case LANG_HUNGARIAN:
switch (sub)
{
- case SUBLANG_HUNGARIAN_HUNGARY: return "hu_HU";
+ case SUBLANG_HUNGARIAN_HUNGARY: return N("hu_HU");
}
- return "hu";
+ return N("hu");
case LANG_IBIBIO:
switch (sub)
{
- case SUBLANG_DEFAULT: return "nic_NG";
+ case SUBLANG_DEFAULT: return N("nic_NG");
}
- return "nic";
+ return N("nic");
case LANG_ICELANDIC:
switch (sub)
{
- case SUBLANG_ICELANDIC_ICELAND: return "is_IS";
+ case SUBLANG_ICELANDIC_ICELAND: return N("is_IS");
}
- return "is";
+ return N("is");
case LANG_IGBO:
switch (sub)
{
- case SUBLANG_IGBO_NIGERIA: return "ig_NG";
+ case SUBLANG_IGBO_NIGERIA: return N("ig_NG");
}
- return "ig";
+ return N("ig");
case LANG_INDONESIAN:
switch (sub)
{
- case SUBLANG_INDONESIAN_INDONESIA: return "id_ID";
+ case SUBLANG_INDONESIAN_INDONESIA: return N("id_ID");
}
- return "id";
+ return N("id");
case LANG_INUKTITUT:
switch (sub)
{
- case 0x1e: return "iu"; /* syllabic */
- case SUBLANG_INUKTITUT_CANADA: return "iu_CA"; /* syllabic */
- case 0x1f: return "iu@latin";
- case SUBLANG_INUKTITUT_CANADA_LATIN: return "iu_CA@latin";
+ case 0x1e: return N("iu"); /* syllabic */
+ case SUBLANG_INUKTITUT_CANADA: return N("iu_CA"); /* syllabic */
+ case 0x1f: return N("iu@latin");
+ case SUBLANG_INUKTITUT_CANADA_LATIN: return N("iu_CA@latin");
}
- return "iu";
+ return N("iu");
case LANG_ITALIAN:
switch (sub)
{
- case SUBLANG_ITALIAN: return "it_IT";
- case SUBLANG_ITALIAN_SWISS: return "it_CH";
+ case SUBLANG_ITALIAN: return N("it_IT");
+ case SUBLANG_ITALIAN_SWISS: return N("it_CH");
}
- return "it";
+ return N("it");
case LANG_JAPANESE:
switch (sub)
{
- case SUBLANG_JAPANESE_JAPAN: return "ja_JP";
+ case SUBLANG_JAPANESE_JAPAN: return N("ja_JP");
}
- return "ja";
+ return N("ja");
case LANG_KANNADA:
switch (sub)
{
- case SUBLANG_KANNADA_INDIA: return "kn_IN";
+ case SUBLANG_KANNADA_INDIA: return N("kn_IN");
}
- return "kn";
+ return N("kn");
case LANG_KANURI:
switch (sub)
{
- case SUBLANG_DEFAULT: return "kr_NG";
+ case SUBLANG_DEFAULT: return N("kr_NG");
}
- return "kr";
+ return N("kr");
case LANG_KASHMIRI:
switch (sub)
{
- case SUBLANG_DEFAULT: return "ks_PK";
- case SUBLANG_KASHMIRI_INDIA: return "ks_IN";
+ case SUBLANG_DEFAULT: return N("ks_PK");
+ case SUBLANG_KASHMIRI_INDIA: return N("ks_IN");
}
- return "ks";
+ return N("ks");
case LANG_KAZAK:
switch (sub)
{
- case SUBLANG_KAZAK_KAZAKHSTAN: return "kk_KZ";
+ case SUBLANG_KAZAK_KAZAKHSTAN: return N("kk_KZ");
}
- return "kk";
+ return N("kk");
case LANG_KICHE:
/* FIXME: Adjust this when such locales appear on Unix. */
switch (sub)
{
- case SUBLANG_KICHE_GUATEMALA: return "qut_GT";
+ case SUBLANG_KICHE_GUATEMALA: return N("qut_GT");
}
- return "qut";
+ return N("qut");
case LANG_KINYARWANDA:
switch (sub)
{
- case SUBLANG_KINYARWANDA_RWANDA: return "rw_RW";
+ case SUBLANG_KINYARWANDA_RWANDA: return N("rw_RW");
}
- return "rw";
+ return N("rw");
case LANG_KONKANI:
switch (sub)
{
- case SUBLANG_KONKANI_INDIA: return "kok_IN";
+ case SUBLANG_KONKANI_INDIA: return N("kok_IN");
}
- return "kok";
+ return N("kok");
case LANG_KOREAN:
switch (sub)
{
- case SUBLANG_DEFAULT: return "ko_KR";
+ case SUBLANG_DEFAULT: return N("ko_KR");
}
- return "ko";
+ return N("ko");
case LANG_KYRGYZ:
switch (sub)
{
- case SUBLANG_KYRGYZ_KYRGYZSTAN: return "ky_KG";
+ case SUBLANG_KYRGYZ_KYRGYZSTAN: return N("ky_KG");
}
- return "ky";
+ return N("ky");
case LANG_LAO:
switch (sub)
{
- case SUBLANG_LAO_LAOS: return "lo_LA";
+ case SUBLANG_LAO_LAOS: return N("lo_LA");
}
- return "lo";
+ return N("lo");
case LANG_LATIN:
switch (sub)
{
- case SUBLANG_DEFAULT: return "la_VA";
+ case SUBLANG_DEFAULT: return N("la_VA");
}
- return "la";
+ return N("la");
case LANG_LATVIAN:
switch (sub)
{
- case SUBLANG_LATVIAN_LATVIA: return "lv_LV";
+ case SUBLANG_LATVIAN_LATVIA: return N("lv_LV");
}
- return "lv";
+ return N("lv");
case LANG_LITHUANIAN:
switch (sub)
{
- case SUBLANG_LITHUANIAN_LITHUANIA: return "lt_LT";
+ case SUBLANG_LITHUANIAN_LITHUANIA: return N("lt_LT");
}
- return "lt";
+ return N("lt");
case LANG_LUXEMBOURGISH:
switch (sub)
{
- case SUBLANG_LUXEMBOURGISH_LUXEMBOURG: return "lb_LU";
+ case SUBLANG_LUXEMBOURGISH_LUXEMBOURG: return N("lb_LU");
}
- return "lb";
+ return N("lb");
case LANG_MACEDONIAN:
switch (sub)
{
- case SUBLANG_MACEDONIAN_MACEDONIA: return "mk_MK";
+ case SUBLANG_MACEDONIAN_MACEDONIA: return N("mk_MK");
}
- return "mk";
+ return N("mk");
case LANG_MALAY:
switch (sub)
{
- case SUBLANG_MALAY_MALAYSIA: return "ms_MY";
- case SUBLANG_MALAY_BRUNEI_DARUSSALAM: return "ms_BN";
+ case SUBLANG_MALAY_MALAYSIA: return N("ms_MY");
+ case SUBLANG_MALAY_BRUNEI_DARUSSALAM: return N("ms_BN");
}
- return "ms";
+ return N("ms");
case LANG_MALAYALAM:
switch (sub)
{
- case SUBLANG_MALAYALAM_INDIA: return "ml_IN";
+ case SUBLANG_MALAYALAM_INDIA: return N("ml_IN");
}
- return "ml";
+ return N("ml");
case LANG_MALTESE:
switch (sub)
{
- case SUBLANG_MALTESE_MALTA: return "mt_MT";
+ case SUBLANG_MALTESE_MALTA: return N("mt_MT");
}
- return "mt";
+ return N("mt");
case LANG_MANIPURI:
switch (sub)
{
- case SUBLANG_DEFAULT: return "mni_IN";
+ case SUBLANG_DEFAULT: return N("mni_IN");
}
- return "mni";
+ return N("mni");
case LANG_MAORI:
switch (sub)
{
- case SUBLANG_MAORI_NEW_ZEALAND: return "mi_NZ";
+ case SUBLANG_MAORI_NEW_ZEALAND: return N("mi_NZ");
}
- return "mi";
+ return N("mi");
case LANG_MAPUDUNGUN:
switch (sub)
{
- case SUBLANG_MAPUDUNGUN_CHILE: return "arn_CL";
+ case SUBLANG_MAPUDUNGUN_CHILE: return N("arn_CL");
}
- return "arn";
+ return N("arn");
case LANG_MARATHI:
switch (sub)
{
- case SUBLANG_MARATHI_INDIA: return "mr_IN";
+ case SUBLANG_MARATHI_INDIA: return N("mr_IN");
}
- return "mr";
+ return N("mr");
case LANG_MOHAWK:
switch (sub)
{
- case SUBLANG_MOHAWK_CANADA: return "moh_CA";
+ case SUBLANG_MOHAWK_CANADA: return N("moh_CA");
}
- return "moh";
+ return N("moh");
case LANG_MONGOLIAN:
switch (sub)
{
- case SUBLANG_MONGOLIAN_CYRILLIC_MONGOLIA: case 0x1e: return "mn_MN";
- case SUBLANG_MONGOLIAN_PRC: case 0x1f: return "mn_CN";
+ case SUBLANG_MONGOLIAN_CYRILLIC_MONGOLIA: case 0x1e: return N("mn_MN");
+ case SUBLANG_MONGOLIAN_PRC: case 0x1f: return N("mn_CN");
}
- return "mn"; /* Ambiguous: could be "mn_CN" or "mn_MN". */
+ return N("mn"); /* Ambiguous: could be "mn_CN" or "mn_MN". */
case LANG_NEPALI:
switch (sub)
{
- case SUBLANG_NEPALI_NEPAL: return "ne_NP";
- case SUBLANG_NEPALI_INDIA: return "ne_IN";
+ case SUBLANG_NEPALI_NEPAL: return N("ne_NP");
+ case SUBLANG_NEPALI_INDIA: return N("ne_IN");
}
- return "ne";
+ return N("ne");
case LANG_NORWEGIAN:
switch (sub)
{
- case 0x1f: return "nb";
- case SUBLANG_NORWEGIAN_BOKMAL: return "nb_NO";
- case 0x1e: return "nn";
- case SUBLANG_NORWEGIAN_NYNORSK: return "nn_NO";
+ case 0x1f: return N("nb");
+ case SUBLANG_NORWEGIAN_BOKMAL: return N("nb_NO");
+ case 0x1e: return N("nn");
+ case SUBLANG_NORWEGIAN_NYNORSK: return N("nn_NO");
}
- return "no";
+ return N("no");
case LANG_OCCITAN:
switch (sub)
{
- case SUBLANG_OCCITAN_FRANCE: return "oc_FR";
+ case SUBLANG_OCCITAN_FRANCE: return N("oc_FR");
}
- return "oc";
+ return N("oc");
case LANG_ORIYA:
switch (sub)
{
- case SUBLANG_ORIYA_INDIA: return "or_IN";
+ case SUBLANG_ORIYA_INDIA: return N("or_IN");
}
- return "or";
+ return N("or");
case LANG_OROMO:
switch (sub)
{
- case SUBLANG_DEFAULT: return "om_ET";
+ case SUBLANG_DEFAULT: return N("om_ET");
}
- return "om";
+ return N("om");
case LANG_PAPIAMENTU:
switch (sub)
{
- case SUBLANG_DEFAULT: return "pap_AN";
+ case SUBLANG_DEFAULT: return N("pap_AN");
}
- return "pap";
+ return N("pap");
case LANG_PASHTO:
switch (sub)
{
- case SUBLANG_PASHTO_AFGHANISTAN: return "ps_AF";
+ case SUBLANG_PASHTO_AFGHANISTAN: return N("ps_AF");
}
- return "ps"; /* Ambiguous: could be "ps_PK" or "ps_AF". */
+ return N("ps"); /* Ambiguous: could be "ps_PK" or "ps_AF". */
case LANG_POLISH:
switch (sub)
{
- case SUBLANG_POLISH_POLAND: return "pl_PL";
+ case SUBLANG_POLISH_POLAND: return N("pl_PL");
}
- return "pl";
+ return N("pl");
case LANG_PORTUGUESE:
switch (sub)
{
/* Hmm. SUBLANG_PORTUGUESE_BRAZILIAN == SUBLANG_DEFAULT.
Same phenomenon as SUBLANG_ENGLISH_US == SUBLANG_DEFAULT. */
- case SUBLANG_PORTUGUESE_BRAZILIAN: return "pt_BR";
- case SUBLANG_PORTUGUESE: return "pt_PT";
+ case SUBLANG_PORTUGUESE_BRAZILIAN: return N("pt_BR");
+ case SUBLANG_PORTUGUESE: return N("pt_PT");
}
- return "pt";
+ return N("pt");
case LANG_PUNJABI:
switch (sub)
{
- case SUBLANG_PUNJABI_INDIA: return "pa_IN"; /* Gurmukhi script */
- case SUBLANG_PUNJABI_PAKISTAN: return "pa_PK"; /* Arabic script */
+ case SUBLANG_PUNJABI_INDIA: return N("pa_IN"); /* Gurmukhi script */
+ case SUBLANG_PUNJABI_PAKISTAN: return N("pa_PK"); /* Arabic script */
}
- return "pa";
+ return N("pa");
case LANG_QUECHUA:
/* Note: Microsoft uses the non-ISO language code "quz". */
switch (sub)
{
- case SUBLANG_QUECHUA_BOLIVIA: return "qu_BO";
- case SUBLANG_QUECHUA_ECUADOR: return "qu_EC";
- case SUBLANG_QUECHUA_PERU: return "qu_PE";
+ case SUBLANG_QUECHUA_BOLIVIA: return N("qu_BO");
+ case SUBLANG_QUECHUA_ECUADOR: return N("qu_EC");
+ case SUBLANG_QUECHUA_PERU: return N("qu_PE");
}
- return "qu";
+ return N("qu");
case LANG_ROMANIAN:
switch (sub)
{
- case SUBLANG_ROMANIAN_ROMANIA: return "ro_RO";
- case SUBLANG_ROMANIAN_MOLDOVA: return "ro_MD";
+ case SUBLANG_ROMANIAN_ROMANIA: return N("ro_RO");
+ case SUBLANG_ROMANIAN_MOLDOVA: return N("ro_MD");
}
- return "ro";
+ return N("ro");
case LANG_ROMANSH:
switch (sub)
{
- case SUBLANG_ROMANSH_SWITZERLAND: return "rm_CH";
+ case SUBLANG_ROMANSH_SWITZERLAND: return N("rm_CH");
}
- return "rm";
+ return N("rm");
case LANG_RUSSIAN:
switch (sub)
{
- case SUBLANG_RUSSIAN_RUSSIA: return "ru_RU";
- case SUBLANG_RUSSIAN_MOLDAVIA: return "ru_MD";
+ case SUBLANG_RUSSIAN_RUSSIA: return N("ru_RU");
+ case SUBLANG_RUSSIAN_MOLDAVIA: return N("ru_MD");
}
- return "ru"; /* Ambiguous: could be "ru_RU" or "ru_UA" or "ru_MD". */
+ return N("ru"); /* Ambiguous: could be "ru_RU" or "ru_UA" or "ru_MD". */
case LANG_SAMI:
switch (sub)
{
/* Northern Sami */
- case 0x00: return "se";
- case SUBLANG_SAMI_NORTHERN_NORWAY: return "se_NO";
- case SUBLANG_SAMI_NORTHERN_SWEDEN: return "se_SE";
- case SUBLANG_SAMI_NORTHERN_FINLAND: return "se_FI";
+ case 0x00: return N("se");
+ case SUBLANG_SAMI_NORTHERN_NORWAY: return N("se_NO");
+ case SUBLANG_SAMI_NORTHERN_SWEDEN: return N("se_SE");
+ case SUBLANG_SAMI_NORTHERN_FINLAND: return N("se_FI");
/* Lule Sami */
- case 0x1f: return "smj";
- case SUBLANG_SAMI_LULE_NORWAY: return "smj_NO";
- case SUBLANG_SAMI_LULE_SWEDEN: return "smj_SE";
+ case 0x1f: return N("smj");
+ case SUBLANG_SAMI_LULE_NORWAY: return N("smj_NO");
+ case SUBLANG_SAMI_LULE_SWEDEN: return N("smj_SE");
/* Southern Sami */
- case 0x1e: return "sma";
- case SUBLANG_SAMI_SOUTHERN_NORWAY: return "sma_NO";
- case SUBLANG_SAMI_SOUTHERN_SWEDEN: return "sma_SE";
+ case 0x1e: return N("sma");
+ case SUBLANG_SAMI_SOUTHERN_NORWAY: return N("sma_NO");
+ case SUBLANG_SAMI_SOUTHERN_SWEDEN: return N("sma_SE");
/* Skolt Sami */
- case 0x1d: return "sms";
- case SUBLANG_SAMI_SKOLT_FINLAND: return "sms_FI";
+ case 0x1d: return N("sms");
+ case SUBLANG_SAMI_SKOLT_FINLAND: return N("sms_FI");
/* Inari Sami */
- case 0x1c: return "smn";
- case SUBLANG_SAMI_INARI_FINLAND: return "smn_FI";
+ case 0x1c: return N("smn");
+ case SUBLANG_SAMI_INARI_FINLAND: return N("smn_FI");
}
- return "se"; /* or "smi"? */
+ return N("se"); /* or "smi"? */
case LANG_SANSKRIT:
switch (sub)
{
- case SUBLANG_SANSKRIT_INDIA: return "sa_IN";
+ case SUBLANG_SANSKRIT_INDIA: return N("sa_IN");
}
- return "sa";
+ return N("sa");
case LANG_SCOTTISH_GAELIC:
switch (sub)
{
- case SUBLANG_DEFAULT: return "gd_GB";
+ case SUBLANG_DEFAULT: return N("gd_GB");
}
- return "gd";
+ return N("gd");
case LANG_SINDHI:
switch (sub)
{
- case SUBLANG_SINDHI_INDIA: return "sd_IN";
- case SUBLANG_SINDHI_PAKISTAN: return "sd_PK";
- /*case SUBLANG_SINDHI_AFGHANISTAN: return "sd_AF";*/
+ case SUBLANG_SINDHI_INDIA: return N("sd_IN");
+ case SUBLANG_SINDHI_PAKISTAN: return N("sd_PK");
+ /*case SUBLANG_SINDHI_AFGHANISTAN: return N("sd_AF");*/
}
- return "sd";
+ return N("sd");
case LANG_SINHALESE:
switch (sub)
{
- case SUBLANG_SINHALESE_SRI_LANKA: return "si_LK";
+ case SUBLANG_SINHALESE_SRI_LANKA: return N("si_LK");
}
- return "si";
+ return N("si");
case LANG_SLOVAK:
switch (sub)
{
- case SUBLANG_SLOVAK_SLOVAKIA: return "sk_SK";
+ case SUBLANG_SLOVAK_SLOVAKIA: return N("sk_SK");
}
- return "sk";
+ return N("sk");
case LANG_SLOVENIAN:
switch (sub)
{
- case SUBLANG_SLOVENIAN_SLOVENIA: return "sl_SI";
+ case SUBLANG_SLOVENIAN_SLOVENIA: return N("sl_SI");
}
- return "sl";
+ return N("sl");
case LANG_SOMALI:
switch (sub)
{
- case SUBLANG_DEFAULT: return "so_SO";
+ case SUBLANG_DEFAULT: return N("so_SO");
}
- return "so";
+ return N("so");
case LANG_SORBIAN:
switch (sub)
{
/* Upper Sorbian */
- case 0x00: return "hsb";
- case SUBLANG_UPPER_SORBIAN_GERMANY: return "hsb_DE";
+ case 0x00: return N("hsb");
+ case SUBLANG_UPPER_SORBIAN_GERMANY: return N("hsb_DE");
/* Lower Sorbian */
- case 0x1f: return "dsb";
- case SUBLANG_LOWER_SORBIAN_GERMANY: return "dsb_DE";
+ case 0x1f: return N("dsb");
+ case SUBLANG_LOWER_SORBIAN_GERMANY: return N("dsb_DE");
}
- return "wen";
+ return N("wen");
case LANG_SOTHO:
/* <https://docs.microsoft.com/en-us/windows/desktop/Intl/language-identifier-constants-and-strings>
calls it "Sesotho sa Leboa"; according to
@@ -2303,240 +2308,241 @@ gl_locale_name_from_win32_LANGID (LANGID langid)
it's the same as Northern Sotho. */
switch (sub)
{
- case SUBLANG_SOTHO_SOUTH_AFRICA: return "nso_ZA";
+ case SUBLANG_SOTHO_SOUTH_AFRICA: return N("nso_ZA");
}
- return "nso";
+ return N("nso");
case LANG_SPANISH:
switch (sub)
{
- case SUBLANG_SPANISH: return "es_ES";
- case SUBLANG_SPANISH_MEXICAN: return "es_MX";
+ case SUBLANG_SPANISH: return N("es_ES");
+ case SUBLANG_SPANISH_MEXICAN: return N("es_MX");
case SUBLANG_SPANISH_MODERN:
- return "es_ES@modern"; /* not seen on Unix */
- case SUBLANG_SPANISH_GUATEMALA: return "es_GT";
- case SUBLANG_SPANISH_COSTA_RICA: return "es_CR";
- case SUBLANG_SPANISH_PANAMA: return "es_PA";
- case SUBLANG_SPANISH_DOMINICAN_REPUBLIC: return "es_DO";
- case SUBLANG_SPANISH_VENEZUELA: return "es_VE";
- case SUBLANG_SPANISH_COLOMBIA: return "es_CO";
- case SUBLANG_SPANISH_PERU: return "es_PE";
- case SUBLANG_SPANISH_ARGENTINA: return "es_AR";
- case SUBLANG_SPANISH_ECUADOR: return "es_EC";
- case SUBLANG_SPANISH_CHILE: return "es_CL";
- case SUBLANG_SPANISH_URUGUAY: return "es_UY";
- case SUBLANG_SPANISH_PARAGUAY: return "es_PY";
- case SUBLANG_SPANISH_BOLIVIA: return "es_BO";
- case SUBLANG_SPANISH_EL_SALVADOR: return "es_SV";
- case SUBLANG_SPANISH_HONDURAS: return "es_HN";
- case SUBLANG_SPANISH_NICARAGUA: return "es_NI";
- case SUBLANG_SPANISH_PUERTO_RICO: return "es_PR";
- case SUBLANG_SPANISH_US: return "es_US";
- }
- return "es";
+ return N("es_ES@modern"); /* not seen on Unix */
+ case SUBLANG_SPANISH_GUATEMALA: return N("es_GT");
+ case SUBLANG_SPANISH_COSTA_RICA: return N("es_CR");
+ case SUBLANG_SPANISH_PANAMA: return N("es_PA");
+ case SUBLANG_SPANISH_DOMINICAN_REPUBLIC: return N("es_DO");
+ case SUBLANG_SPANISH_VENEZUELA: return N("es_VE");
+ case SUBLANG_SPANISH_COLOMBIA: return N("es_CO");
+ case SUBLANG_SPANISH_PERU: return N("es_PE");
+ case SUBLANG_SPANISH_ARGENTINA: return N("es_AR");
+ case SUBLANG_SPANISH_ECUADOR: return N("es_EC");
+ case SUBLANG_SPANISH_CHILE: return N("es_CL");
+ case SUBLANG_SPANISH_URUGUAY: return N("es_UY");
+ case SUBLANG_SPANISH_PARAGUAY: return N("es_PY");
+ case SUBLANG_SPANISH_BOLIVIA: return N("es_BO");
+ case SUBLANG_SPANISH_EL_SALVADOR: return N("es_SV");
+ case SUBLANG_SPANISH_HONDURAS: return N("es_HN");
+ case SUBLANG_SPANISH_NICARAGUA: return N("es_NI");
+ case SUBLANG_SPANISH_PUERTO_RICO: return N("es_PR");
+ case SUBLANG_SPANISH_US: return N("es_US");
+ }
+ return N("es");
case LANG_SUTU:
switch (sub)
{
- case SUBLANG_DEFAULT: return "bnt_TZ"; /* or "st_LS" or "nso_ZA"? */
+ case SUBLANG_DEFAULT: return N("bnt_TZ"); /* or "st_LS" or "nso_ZA"? */
}
- return "bnt";
+ return N("bnt");
case LANG_SWAHILI:
switch (sub)
{
- case SUBLANG_SWAHILI_KENYA: return "sw_KE";
+ case SUBLANG_SWAHILI_KENYA: return N("sw_KE");
}
- return "sw";
+ return N("sw");
case LANG_SWEDISH:
switch (sub)
{
- case SUBLANG_SWEDISH_SWEDEN: return "sv_SE";
- case SUBLANG_SWEDISH_FINLAND: return "sv_FI";
+ case SUBLANG_SWEDISH_SWEDEN: return N("sv_SE");
+ case SUBLANG_SWEDISH_FINLAND: return N("sv_FI");
}
- return "sv";
+ return N("sv");
case LANG_SYRIAC:
switch (sub)
{
- case SUBLANG_SYRIAC_SYRIA: return "syr_SY"; /* An extinct language. */
+ case SUBLANG_SYRIAC_SYRIA: return N("syr_SY"); /* An extinct language. */
}
- return "syr";
+ return N("syr");
case LANG_TAGALOG:
switch (sub)
{
- case SUBLANG_TAGALOG_PHILIPPINES: return "tl_PH"; /* or "fil_PH"? */
+ case SUBLANG_TAGALOG_PHILIPPINES: return N("tl_PH"); /* or "fil_PH"? */
}
- return "tl"; /* or "fil"? */
+ return N("tl"); /* or "fil"? */
case LANG_TAJIK:
switch (sub)
{
- case 0x1f: return "tg";
- case SUBLANG_TAJIK_TAJIKISTAN: return "tg_TJ";
+ case 0x1f: return N("tg");
+ case SUBLANG_TAJIK_TAJIKISTAN: return N("tg_TJ");
}
- return "tg";
+ return N("tg");
case LANG_TAMAZIGHT:
/* Note: Microsoft uses the non-ISO language code "tmz". */
switch (sub)
{
- case SUBLANG_TAMAZIGHT_ARABIC: return "ber_MA";
- case 0x1f: return "ber@latin";
- case SUBLANG_TAMAZIGHT_ALGERIA_LATIN: return "ber_DZ";
+ case SUBLANG_TAMAZIGHT_ARABIC: return N("ber_MA");
+ case 0x1f: return N("ber@latin");
+ case SUBLANG_TAMAZIGHT_ALGERIA_LATIN: return N("ber_DZ");
}
- return "ber";
+ return N("ber");
case LANG_TAMIL:
switch (sub)
{
- case SUBLANG_TAMIL_INDIA: return "ta_IN";
+ case SUBLANG_TAMIL_INDIA: return N("ta_IN");
}
- return "ta"; /* Ambiguous: could be "ta_IN" or "ta_LK" or "ta_SG". */
+ return N("ta"); /* Ambiguous: could be "ta_IN" or "ta_LK" or "ta_SG". */
case LANG_TATAR:
switch (sub)
{
- case SUBLANG_TATAR_RUSSIA: return "tt_RU";
+ case SUBLANG_TATAR_RUSSIA: return N("tt_RU");
}
- return "tt";
+ return N("tt");
case LANG_TELUGU:
switch (sub)
{
- case SUBLANG_TELUGU_INDIA: return "te_IN";
+ case SUBLANG_TELUGU_INDIA: return N("te_IN");
}
- return "te";
+ return N("te");
case LANG_THAI:
switch (sub)
{
- case SUBLANG_THAI_THAILAND: return "th_TH";
+ case SUBLANG_THAI_THAILAND: return N("th_TH");
}
- return "th";
+ return N("th");
case LANG_TIBETAN:
switch (sub)
{
case SUBLANG_TIBETAN_PRC:
/* Most Tibetans would not like "bo_CN". But Tibet does not yet
have a country code of its own. */
- return "bo";
- case SUBLANG_TIBETAN_BHUTAN: return "bo_BT";
+ return N("bo");
+ case SUBLANG_TIBETAN_BHUTAN: return N("bo_BT");
}
- return "bo";
+ return N("bo");
case LANG_TIGRINYA:
switch (sub)
{
- case SUBLANG_TIGRINYA_ETHIOPIA: return "ti_ET";
- case SUBLANG_TIGRINYA_ERITREA: return "ti_ER";
+ case SUBLANG_TIGRINYA_ETHIOPIA: return N("ti_ET");
+ case SUBLANG_TIGRINYA_ERITREA: return N("ti_ER");
}
- return "ti";
+ return N("ti");
case LANG_TSONGA:
switch (sub)
{
- case SUBLANG_DEFAULT: return "ts_ZA";
+ case SUBLANG_DEFAULT: return N("ts_ZA");
}
- return "ts";
+ return N("ts");
case LANG_TSWANA:
/* Spoken in South Africa, Botswana. */
switch (sub)
{
- case SUBLANG_TSWANA_SOUTH_AFRICA: return "tn_ZA";
+ case SUBLANG_TSWANA_SOUTH_AFRICA: return N("tn_ZA");
}
- return "tn";
+ return N("tn");
case LANG_TURKISH:
switch (sub)
{
- case SUBLANG_TURKISH_TURKEY: return "tr_TR";
+ case SUBLANG_TURKISH_TURKEY: return N("tr_TR");
}
- return "tr";
+ return N("tr");
case LANG_TURKMEN:
switch (sub)
{
- case SUBLANG_TURKMEN_TURKMENISTAN: return "tk_TM";
+ case SUBLANG_TURKMEN_TURKMENISTAN: return N("tk_TM");
}
- return "tk";
+ return N("tk");
case LANG_UIGHUR:
switch (sub)
{
- case SUBLANG_UIGHUR_PRC: return "ug_CN";
+ case SUBLANG_UIGHUR_PRC: return N("ug_CN");
}
- return "ug";
+ return N("ug");
case LANG_UKRAINIAN:
switch (sub)
{
- case SUBLANG_UKRAINIAN_UKRAINE: return "uk_UA";
+ case SUBLANG_UKRAINIAN_UKRAINE: return N("uk_UA");
}
- return "uk";
+ return N("uk");
case LANG_URDU:
switch (sub)
{
- case SUBLANG_URDU_PAKISTAN: return "ur_PK";
- case SUBLANG_URDU_INDIA: return "ur_IN";
+ case SUBLANG_URDU_PAKISTAN: return N("ur_PK");
+ case SUBLANG_URDU_INDIA: return N("ur_IN");
}
- return "ur";
+ return N("ur");
case LANG_UZBEK:
switch (sub)
{
- case 0x1f: return "uz";
- case SUBLANG_UZBEK_LATIN: return "uz_UZ";
- case 0x1e: return "uz@cyrillic";
- case SUBLANG_UZBEK_CYRILLIC: return "uz_UZ@cyrillic";
+ case 0x1f: return N("uz");
+ case SUBLANG_UZBEK_LATIN: return N("uz_UZ");
+ case 0x1e: return N("uz@cyrillic");
+ case SUBLANG_UZBEK_CYRILLIC: return N("uz_UZ@cyrillic");
}
- return "uz";
+ return N("uz");
case LANG_VENDA:
switch (sub)
{
- case SUBLANG_DEFAULT: return "ve_ZA";
+ case SUBLANG_DEFAULT: return N("ve_ZA");
}
- return "ve";
+ return N("ve");
case LANG_VIETNAMESE:
switch (sub)
{
- case SUBLANG_VIETNAMESE_VIETNAM: return "vi_VN";
+ case SUBLANG_VIETNAMESE_VIETNAM: return N("vi_VN");
}
- return "vi";
+ return N("vi");
case LANG_WELSH:
switch (sub)
{
- case SUBLANG_WELSH_UNITED_KINGDOM: return "cy_GB";
+ case SUBLANG_WELSH_UNITED_KINGDOM: return N("cy_GB");
}
- return "cy";
+ return N("cy");
case LANG_WOLOF:
switch (sub)
{
- case SUBLANG_WOLOF_SENEGAL: return "wo_SN";
+ case SUBLANG_WOLOF_SENEGAL: return N("wo_SN");
}
- return "wo";
+ return N("wo");
case LANG_XHOSA:
switch (sub)
{
- case SUBLANG_XHOSA_SOUTH_AFRICA: return "xh_ZA";
+ case SUBLANG_XHOSA_SOUTH_AFRICA: return N("xh_ZA");
}
- return "xh";
+ return N("xh");
case LANG_YAKUT:
switch (sub)
{
- case SUBLANG_YAKUT_RUSSIA: return "sah_RU";
+ case SUBLANG_YAKUT_RUSSIA: return N("sah_RU");
}
- return "sah";
+ return N("sah");
case LANG_YI:
switch (sub)
{
- case SUBLANG_YI_PRC: return "ii_CN";
+ case SUBLANG_YI_PRC: return N("ii_CN");
}
- return "ii";
+ return N("ii");
case LANG_YIDDISH:
switch (sub)
{
- case SUBLANG_DEFAULT: return "yi_IL";
+ case SUBLANG_DEFAULT: return N("yi_IL");
}
- return "yi";
+ return N("yi");
case LANG_YORUBA:
switch (sub)
{
- case SUBLANG_YORUBA_NIGERIA: return "yo_NG";
+ case SUBLANG_YORUBA_NIGERIA: return N("yo_NG");
}
- return "yo";
+ return N("yo");
case LANG_ZULU:
switch (sub)
{
- case SUBLANG_ZULU_SOUTH_AFRICA: return "zu_ZA";
+ case SUBLANG_ZULU_SOUTH_AFRICA: return N("zu_ZA");
}
- return "zu";
- default: return "C";
+ return N("zu");
+ default: return N("C");
}
}
+ #undef N
}
# if !defined IN_LIBINTL
--
2.43.0
From e63eea8ea358041610c3f9a9ed4d5a1e44be5cc4 Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:57:02 +0100
Subject: [PATCH 4/7] localename tests: Test in the UTF-8 environment on native Windows.
* tests/test-localename-w32utf8.sh: New file.
* tests/test-localename-w32utf8.c: New file.
* modules/localename-tests (Files): Add these files and
m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-localename-w32utf8 and run
test-localename-w32utf8.sh.
---
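Note, for illustration only and not part of this patch: the check that the
new test performs on the result of gl_locale_name_default () can be sketched
as a standalone helper. The name has_utf8_suffix is hypothetical; the test
itself does the comparison inline with the ASSERT macro from tests/macros.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not in gnulib: returns true if NAME consists of a
   nonempty locale name followed by the suffix ".UTF-8".  */
static bool
has_utf8_suffix (const char *name)
{
  static const char suffix[] = ".UTF-8";
  size_t suffix_len = sizeof (suffix) - 1;
  size_t len;

  assert (name != NULL);
  len = strlen (name);
  /* Require at least one character before the suffix, so that a bare
     ".UTF-8" is rejected, as in the test.  */
  return len > suffix_len
         && strcmp (name + len - suffix_len, suffix) == 0;
}
```

Both "C.UTF-8" (legacy system settings) and a name of the form
"ll_CC.UTF-8" (modern system settings) pass this check; a plain "C" does not.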
ChangeLog | 10 +++++++
modules/localename-tests | 15 ++++++++++
tests/test-localename-w32utf8.c | 47 ++++++++++++++++++++++++++++++++
tests/test-localename-w32utf8.sh | 7 +++++
4 files changed, 79 insertions(+)
create mode 100644 tests/test-localename-w32utf8.c
create mode 100755 tests/test-localename-w32utf8.sh
diff --git a/ChangeLog b/ChangeLog
index d9f282c21e..fd3cf9f7ca 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,15 @@
2024-12-23 Bruno Haible <[email protected]>
+ localename tests: Test in the UTF-8 environment on native Windows.
+ * tests/test-localename-w32utf8.sh: New file.
+ * tests/test-localename-w32utf8.c: New file.
+ * modules/localename-tests (Files): Add these files and
+ m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
+ (Depends-on): Add test-xfail.
+ (configure.ac): Invoke gl_WINDOWS_RC.
+ (Makefile.am): Arrange to compile test-localename-w32utf8 and run
+ test-localename-w32utf8.sh.
+
localename-unsafe: Support the UTF-8 environment on native Windows.
* lib/localename-unsafe.c (gl_locale_name_from_win32_LANGID): Append a
suffix ".UTF-8" to the result if GetACP() is UTF-8.
diff --git a/modules/localename-tests b/modules/localename-tests
index 0c24d5b4b6..cf4d586806 100644
--- a/modules/localename-tests
+++ b/modules/localename-tests
@@ -1,7 +1,12 @@
Files:
tests/test-localename.c
+tests/test-localename-w32utf8.sh
+tests/test-localename-w32utf8.c
+tests/windows-utf8.rc
+tests/windows-utf8.manifest
tests/macros.h
m4/musl.m4
+m4/windows-rc.m4
Depends-on:
locale
@@ -9,13 +14,23 @@ setenv
unsetenv
setlocale
strdup
+test-xfail
configure.ac:
gl_CHECK_FUNCS_ANDROID([newlocale], [[#include <locale.h>]])
gl_MUSL_LIBC
+gl_WINDOWS_RC
Makefile.am:
TESTS += test-localename
check_PROGRAMS += test-localename
test_localename_LDADD = $(LDADD) $(SETLOCALE_LIB) @INTL_MACOSX_LIBS@ $(LIBTHREAD)
+if OS_IS_NATIVE_WINDOWS
+TESTS += test-localename-w32utf8.sh
+noinst_PROGRAMS += test-localename-w32utf8
+test_localename_w32utf8_LDADD = $(LDADD) test-localename-windows-utf8.res $(SETLOCALE_LIB)
+test-localename-windows-utf8.res : $(srcdir)/windows-utf8.rc
+ $(WINDRES) -i $(srcdir)/windows-utf8.rc -o test-localename-windows-utf8.res --output-format=coff
+MOSTLYCLEANFILES += test-localename-windows-utf8.res
+endif
diff --git a/tests/test-localename-w32utf8.c b/tests/test-localename-w32utf8.c
new file mode 100644
index 0000000000..72a01c0749
--- /dev/null
+++ b/tests/test-localename-w32utf8.c
@@ -0,0 +1,47 @@
+/* Test of gl_locale_name function and its variants
+ on native Windows in the UTF-8 environment.
+ Copyright (C) 2024 Free Software Foundation, Inc.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>. */
+
+/* Written by Bruno Haible <[email protected]>, 2024. */
+
+#include <config.h>
+
+#include "localename.h"
+
+#include <stdio.h>
+#include <string.h>
+
+#include "macros.h"
+
+int
+main (void)
+{
+#ifdef _UCRT
+ const char *name = gl_locale_name_default ();
+
+ ASSERT (name != NULL);
+
+ /* With the legacy system settings, expect "C.UTF-8", not "C", because "C" is
+ a single-byte locale.
+ With the modern system settings, expect some "ll_CC.UTF-8" name. */
+ ASSERT (strlen (name) > 6 && strcmp (name + strlen (name) - 6, ".UTF-8") == 0);
+
+ return test_exit_status;
+#else
+ fputs ("Skipping test: not using the UCRT runtime\n", stderr);
+ return 77;
+#endif
+}
diff --git a/tests/test-localename-w32utf8.sh b/tests/test-localename-w32utf8.sh
new file mode 100755
index 0000000000..de7629c3a7
--- /dev/null
+++ b/tests/test-localename-w32utf8.sh
@@ -0,0 +1,7 @@
+#!/bin/sh
+
+# Test the UTF-8 environment on native Windows.
+unset LC_ALL
+unset LC_CTYPE
+unset LANG
+${CHECKER} ./test-localename-w32utf8${EXEEXT}
--
2.43.0
From 00211fc69c926d6c8f6e3f3cf1d8802623db2af9 Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:57:15 +0100
Subject: [PATCH 5/7] setlocale: Support the UTF-8 environment on native Windows.
* lib/setlocale.c: Include <windows.h>.
(setlocale_unixlike): In the UTF-8 environment, append a suffix ".65001"
to the locale names passed to the native setlocale().
---
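Note, for illustration only and not part of this patch: the two name
transformations that setlocale_unixlike applies in the UTF-8 environment can
be sketched as portable helpers. The names map_c_locale and
with_codepage_suffix are hypothetical, and is_utf8 stands for the result of
(GetACP () == 65001), passed as a parameter so that the sketch compiles
without <windows.h>. "German_Germany" below is just a sample name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not in lib/setlocale.c: mirrors the special case for
   the "C" locale.  The native setlocale() does not understand "C.UTF-8",
   but it does understand "English_United States.65001".  */
static const char *
map_c_locale (const char *locale, int is_utf8)
{
  assert (locale != NULL);
  if ((is_utf8 && strcmp (locale, "C") == 0)
      || strcmp (locale, "C.UTF-8") == 0)
    return "English_United States.65001";
  return locale;
}

/* Hypothetical helper: copies NAME into DST and, when IS_UTF8 is nonzero,
   appends ".65001".  Returns 0 on success, -1 if DST_SIZE is too small.  */
static int
with_codepage_suffix (char *dst, size_t dst_size, const char *name,
                      int is_utf8)
{
  const char *suffix = is_utf8 ? ".65001" : "";

  assert (dst != NULL && name != NULL);
  /* Account for the name, the optional suffix, and the terminating NUL.  */
  if (strlen (name) + strlen (suffix) + 1 > dst_size)
    return -1;
  strcpy (dst, name);
  strcat (dst, suffix);
  return 0;
}
```

With is_utf8 set, with_codepage_suffix turns "German_Germany" into
"German_Germany.65001". Unlike the patch, which relies on fixed-size buffers
enlarged by 6 bytes, this sketch checks the required size explicitly.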
ChangeLog | 7 +++++++
lib/setlocale.c | 51 ++++++++++++++++++++++++++++++++++++++++++++-----
2 files changed, 53 insertions(+), 5 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index fd3cf9f7ca..9f89cb8718 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-12-23 Bruno Haible <[email protected]>
+
+ setlocale: Support the UTF-8 environment on native Windows.
+ * lib/setlocale.c: Include <windows.h>.
+ (setlocale_unixlike): In the UTF-8 environment, append a suffix ".65001"
+ to the locale names passed to the native setlocale().
+
2024-12-23 Bruno Haible <[email protected]>
localename tests: Test in the UTF-8 environment on native Windows.
diff --git a/lib/setlocale.c b/lib/setlocale.c
index 62dce81de3..3cb711d8e1 100644
--- a/lib/setlocale.c
+++ b/lib/setlocale.c
@@ -47,6 +47,11 @@
extern void gl_locale_name_canonicalize (char *name);
#endif
+#if defined _WIN32 && !defined __CYGWIN__
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+#endif
+
#if 1
# undef setlocale
@@ -672,6 +677,7 @@ search (const struct table_entry *table, size_t table_size, const char *string,
static char *
setlocale_unixlike (int category, const char *locale)
{
+ int is_utf8 = (GetACP () == 65001);
char *result;
char llCC_buf[64];
char ll_buf[64];
@@ -682,6 +688,15 @@ setlocale_unixlike (int category, const char *locale)
if (locale != NULL && strcmp (locale, "POSIX") == 0)
locale = "C";
+ /* The native Windows implementation of setlocale, in the UTF-8 environment,
+ does not understand the locale names "C.UTF-8" or "C.utf8" or "C.65001",
+ but it understands "English_United States.65001", which is functionally
+ equivalent. */
+ if (locale != NULL
+ && ((is_utf8 && strcmp (locale, "C") == 0)
+ || strcmp (locale, "C.UTF-8") == 0))
+ locale = "English_United States.65001";
+
/* First, try setlocale with the original argument unchanged. */
result = setlocale_mtsafe (category, locale);
if (result != NULL)
@@ -714,7 +729,15 @@ setlocale_unixlike (int category, const char *locale)
*/
if (strcmp (llCC_buf, locale) != 0)
{
- result = setlocale (category, llCC_buf);
+ if (is_utf8)
+ {
+ char buf[64+6];
+ strcpy (buf, llCC_buf);
+ strcat (buf, ".65001");
+ result = setlocale (category, buf);
+ }
+ else
+ result = setlocale (category, llCC_buf);
if (result != NULL)
return result;
}
@@ -731,7 +754,15 @@ setlocale_unixlike (int category, const char *locale)
for (i = range.lo; i < range.hi; i++)
{
/* Try the replacement in language_table[i]. */
- result = setlocale (category, language_table[i].english);
+ if (is_utf8)
+ {
+ char buf[64+6];
+ strcpy (buf, language_table[i].english);
+ strcat (buf, ".65001");
+ result = setlocale (category, buf);
+ }
+ else
+ result = setlocale (category, language_table[i].english);
if (result != NULL)
return result;
}
@@ -785,13 +816,15 @@ setlocale_unixlike (int category, const char *locale)
size_t part1_len = strlen (part1);
const char *part2 = country_table[j].english;
size_t part2_len = strlen (part2) + 1;
- char buf[64+64];
+ char buf[64+64+6];
if (!(part1_len + 1 + part2_len <= sizeof (buf)))
abort ();
memcpy (buf, part1, part1_len);
buf[part1_len] = '_';
memcpy (buf + part1_len + 1, part2, part2_len);
+ if (is_utf8)
+ strcat (buf, ".65001");
/* Try the concatenated replacements. */
result = setlocale (category, buf);
@@ -809,8 +842,16 @@ setlocale_unixlike (int category, const char *locale)
for (i = language_range.lo; i < language_range.hi; i++)
{
/* Try only the language replacement. */
- result =
- setlocale (category, language_table[i].english);
+ if (is_utf8)
+ {
+ char buf[64+6];
+ strcpy (buf, language_table[i].english);
+ strcat (buf, ".65001");
+ result = setlocale (category, buf);
+ }
+ else
+ result =
+ setlocale (category, language_table[i].english);
if (result != NULL)
return result;
}
--
2.43.0
From 2f4391fde8620749fb3859c568f952a958e2ca2c Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:58:53 +0100
Subject: [PATCH 6/7] setlocale tests: Test in the UTF-8 environment on native Windows.
* tests/test-setlocale-w32utf8.sh: New file.
* tests/test-setlocale-w32utf8.c: New file.
* modules/setlocale-tests (Files): Add these files and
m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-setlocale-w32utf8 and run
test-setlocale-w32utf8.sh.
---
ChangeLog | 10 +++++
modules/setlocale-tests | 16 ++++++++
tests/test-setlocale-w32utf8.c | 69 +++++++++++++++++++++++++++++++++
tests/test-setlocale-w32utf8.sh | 12 ++++++
4 files changed, 107 insertions(+)
create mode 100644 tests/test-setlocale-w32utf8.c
create mode 100755 tests/test-setlocale-w32utf8.sh
diff --git a/ChangeLog b/ChangeLog
index 9f89cb8718..c5e2e8b1b2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,15 @@
2024-12-23 Bruno Haible <[email protected]>
+ setlocale tests: Test in the UTF-8 environment on native Windows.
+ * tests/test-setlocale-w32utf8.sh: New file.
+ * tests/test-setlocale-w32utf8.c: New file.
+ * modules/setlocale-tests (Files): Add these files and
+ m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
+ (Depends-on): Add test-xfail.
+ (configure.ac): Invoke gl_WINDOWS_RC.
+ (Makefile.am): Arrange to compile test-setlocale-w32utf8 and run
+ test-setlocale-w32utf8.sh.
+
setlocale: Support the UTF-8 environment on native Windows.
* lib/setlocale.c: Include <windows.h>.
(setlocale_unixlike): In the UTF-8 environment, append a suffix ".65001"
diff --git a/modules/setlocale-tests b/modules/setlocale-tests
index ad0a536bc6..23cc6ddd17 100644
--- a/modules/setlocale-tests
+++ b/modules/setlocale-tests
@@ -4,21 +4,28 @@ tests/test-setlocale1.c
tests/test-setlocale2.sh
tests/test-setlocale2.c
tests/test-setlocale-w32.c
+tests/test-setlocale-w32utf8.sh
+tests/test-setlocale-w32utf8.c
+tests/windows-utf8.rc
+tests/windows-utf8.manifest
tests/signature.h
tests/macros.h
m4/locale-fr.m4
m4/locale-ja.m4
m4/locale-zh.m4
m4/codeset.m4
+m4/windows-rc.m4
Depends-on:
strdup
+test-xfail
configure.ac:
gt_LOCALE_FR
gt_LOCALE_FR_UTF8
gt_LOCALE_JA
gt_LOCALE_ZH_CN
+gl_WINDOWS_RC
Makefile.am:
TESTS += test-setlocale1.sh test-setlocale2.sh test-setlocale-w32
@@ -31,3 +38,12 @@ check_PROGRAMS += test-setlocale1 test-setlocale2 test-setlocale-w32
test_setlocale1_LDADD = $(LDADD) @SETLOCALE_LIB@
test_setlocale2_LDADD = $(LDADD) @SETLOCALE_LIB@
test_setlocale_w32_LDADD = $(LDADD) @SETLOCALE_LIB@
+
+if OS_IS_NATIVE_WINDOWS
+TESTS += test-setlocale-w32utf8.sh
+noinst_PROGRAMS += test-setlocale-w32utf8
+test_setlocale_w32utf8_LDADD = $(LDADD) test-setlocale-windows-utf8.res $(SETLOCALE_LIB)
+test-setlocale-windows-utf8.res : $(srcdir)/windows-utf8.rc
+ $(WINDRES) -i $(srcdir)/windows-utf8.rc -o test-setlocale-windows-utf8.res --output-format=coff
+MOSTLYCLEANFILES += test-setlocale-windows-utf8.res
+endif
diff --git a/tests/test-setlocale-w32utf8.c b/tests/test-setlocale-w32utf8.c
new file mode 100644
index 0000000000..f0bbce05b7
--- /dev/null
+++ b/tests/test-setlocale-w32utf8.c
@@ -0,0 +1,69 @@
+/* Test of setting the current locale
+ on native Windows in the UTF-8 environment.
+ Copyright (C) 2024 Free Software Foundation, Inc.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>. */
+
+/* Written by Bruno Haible <[email protected]>, 2024. */
+
+#include <config.h>
+
+#include <locale.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+main (void)
+{
+#ifdef _UCRT
+ /* Test that setlocale() works as expected in a UTF-8 locale. */
+ char *name;
+
+ /* This looks at all LC_*, LANG environment variables, which are all unset
+ at this point. */
+ if (setlocale (LC_ALL, "") == NULL)
+ return 1;
+
+ name = setlocale (LC_ALL, NULL);
+ /* With the legacy system settings, expect some mixed locale, due to the
+ limitations of the native setlocale().
+ With the modern system settings, expect some "ll_CC.UTF-8" name. */
+ if (!((strlen (name) > 6 && strcmp (name + strlen (name) - 6, ".UTF-8") == 0)
+ || strcmp (name, "LC_COLLATE=English_United States.65001;"
+ "LC_CTYPE=English_United States.65001;"
+ "LC_MONETARY=English_United States.65001;"
+ "LC_NUMERIC=English_United States.65001;"
+ "LC_TIME=English_United States.65001;"
+ "LC_MESSAGES=C.UTF-8")
+ == 0
+ || strcmp (name, "LC_COLLATE=English_United States.utf8;"
+ "LC_CTYPE=English_United States.utf8;"
+ "LC_MONETARY=English_United States.utf8;"
+ "LC_NUMERIC=English_United States.utf8;"
+ "LC_TIME=English_United States.utf8;"
+ "LC_MESSAGES=C.UTF-8")
+ == 0))
+ {
+ fprintf (stderr, "setlocale() returned \"%s\".\n", name);
+ exit (1);
+ }
+
+ return 0;
+#else
+ fputs ("Skipping test: not using the UCRT runtime\n", stderr);
+ return 77;
+#endif
+}
diff --git a/tests/test-setlocale-w32utf8.sh b/tests/test-setlocale-w32utf8.sh
new file mode 100755
index 0000000000..e8f7484cf0
--- /dev/null
+++ b/tests/test-setlocale-w32utf8.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+# Test the UTF-8 environment on native Windows.
+unset LC_ALL
+unset LC_CTYPE
+unset LC_MESSAGES
+unset LC_NUMERIC
+unset LC_COLLATE
+unset LC_MONETARY
+unset LC_TIME
+unset LANG
+${CHECKER} ./test-setlocale-w32utf8${EXEEXT}
--
2.43.0
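
The test program in the patch above accepts any locale name that is longer than six characters and ends in ".UTF-8". A minimal sketch of that suffix check as a standalone function might look as follows. The name locale_name_has_suffix is hypothetical and is not used anywhere in gnulib.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return true if NAME is strictly longer than SUFFIX
   and ends with it, mirroring the "strlen (name) > 6" test in the patch.  */
static bool
locale_name_has_suffix (const char *name, const char *suffix)
{
  size_t name_len = strlen (name);
  size_t suffix_len = strlen (suffix);

  return name_len > suffix_len
         && strcmp (name + name_len - suffix_len, suffix) == 0;
}
```

With this helper, a name such as "en_US.UTF-8" passes the ".UTF-8" check, while the legacy-style "English_United States.65001" does not and has to be matched by the explicit string comparisons instead.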
From c11a2e675ccc8637e6322b98d878b0315a8bb7e6 Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Mon, 23 Dec 2024 16:59:20 +0100
Subject: [PATCH 7/7] mbrtowc tests: Test in the UTF-8 environment on native
Windows.
* tests/test-mbrtowc-w32utf8.sh: New file.
* tests/test-mbrtowc-w32utf8.c: New file.
* modules/mbrtowc-tests (Files): Add these files and
m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
(Depends-on): Add test-xfail.
(configure.ac): Invoke gl_WINDOWS_RC.
(Makefile.am): Arrange to compile test-mbrtowc-w32utf8 and run
test-mbrtowc-w32utf8.sh.
---
ChangeLog | 12 +++
modules/mbrtowc-tests | 16 ++++
tests/test-mbrtowc-w32utf8.c | 166 ++++++++++++++++++++++++++++++++++
tests/test-mbrtowc-w32utf8.sh | 12 +++
4 files changed, 206 insertions(+)
create mode 100644 tests/test-mbrtowc-w32utf8.c
create mode 100755 tests/test-mbrtowc-w32utf8.sh
diff --git a/ChangeLog b/ChangeLog
index c5e2e8b1b2..e6d2e1d592 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-12-23 Bruno Haible <[email protected]>
+
+ mbrtowc tests: Test in the UTF-8 environment on native Windows.
+ * tests/test-mbrtowc-w32utf8.sh: New file.
+ * tests/test-mbrtowc-w32utf8.c: New file.
+ * modules/mbrtowc-tests (Files): Add these files and
+ m4/windows-rc.m4, tests/windows-utf8.rc, tests/windows-utf8.manifest.
+ (Depends-on): Add test-xfail.
+ (configure.ac): Invoke gl_WINDOWS_RC.
+ (Makefile.am): Arrange to compile test-mbrtowc-w32utf8 and run
+ test-mbrtowc-w32utf8.sh.
+
2024-12-23 Bruno Haible <[email protected]>
setlocale tests: Test in the UTF-8 environment on native Windows.
diff --git a/modules/mbrtowc-tests b/modules/mbrtowc-tests
index d152e2e472..d9add89fee 100644
--- a/modules/mbrtowc-tests
+++ b/modules/mbrtowc-tests
@@ -13,6 +13,10 @@ tests/test-mbrtowc-w32-6.sh
tests/test-mbrtowc-w32-7.sh
tests/test-mbrtowc-w32-8.sh
tests/test-mbrtowc-w32.c
+tests/test-mbrtowc-w32utf8.sh
+tests/test-mbrtowc-w32utf8.c
+tests/windows-utf8.rc
+tests/windows-utf8.manifest
tests/signature.h
tests/macros.h
m4/locale-en.m4
@@ -20,12 +24,14 @@ m4/locale-fr.m4
m4/locale-ja.m4
m4/locale-zh.m4
m4/codeset.m4
+m4/windows-rc.m4
Depends-on:
mbsinit
wctob
setlocale
localcharset
+test-xfail
configure.ac:
gt_LOCALE_EN_UTF8
@@ -33,6 +39,7 @@ gt_LOCALE_FR
gt_LOCALE_FR_UTF8
gt_LOCALE_JA
gt_LOCALE_ZH_CN
+gl_WINDOWS_RC
Makefile.am:
TESTS += \
@@ -49,3 +56,12 @@ TESTS_ENVIRONMENT += \
LOCALE_ZH_CN='@LOCALE_ZH_CN@'
check_PROGRAMS += test-mbrtowc test-mbrtowc-w32
test_mbrtowc_LDADD = $(LDADD) $(SETLOCALE_LIB) $(MBRTOWC_LIB)
+
+if OS_IS_NATIVE_WINDOWS
+TESTS += test-mbrtowc-w32utf8.sh
+noinst_PROGRAMS += test-mbrtowc-w32utf8
+test_mbrtowc_w32utf8_LDADD = $(LDADD) test-mbrtowc-windows-utf8.res $(SETLOCALE_LIB)
+test-mbrtowc-windows-utf8.res : $(srcdir)/windows-utf8.rc
+ $(WINDRES) -i $(srcdir)/windows-utf8.rc -o test-mbrtowc-windows-utf8.res --output-format=coff
+MOSTLYCLEANFILES += test-mbrtowc-windows-utf8.res
+endif
diff --git a/tests/test-mbrtowc-w32utf8.c b/tests/test-mbrtowc-w32utf8.c
new file mode 100644
index 0000000000..803c1638c0
--- /dev/null
+++ b/tests/test-mbrtowc-w32utf8.c
@@ -0,0 +1,166 @@
+/* Test of conversion of multibyte character to wide character
+ on native Windows in the UTF-8 environment.
+ Copyright (C) 2024 Free Software Foundation, Inc.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>. */
+
+/* Written by Bruno Haible <[email protected]>, 2024. */
+
+#include <config.h>
+
+#include <wchar.h>
+
+#include <errno.h>
+#include <locale.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "macros.h"
+
+int
+main (void)
+{
+#ifdef _UCRT
+ /* Test that MB_CUR_MAX and mbrtowc() work as expected in a UTF-8 locale. */
+ mbstate_t state;
+ wchar_t wc;
+ size_t ret;
+
+ if (setlocale (LC_ALL, "") == NULL)
+ return 1;
+
+ ASSERT (MB_CUR_MAX >= 4);
+
+ {
+ char input[] = "B\303\274\303\237er"; /* "Büßer" */
+ memset (&state, '\0', sizeof (mbstate_t));
+
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, input, 1, &state);
+ ASSERT (ret == 1);
+ ASSERT (wc == 'B');
+ ASSERT (mbsinit (&state));
+ input[0] = '\0';
+
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, input + 1, 1, &state);
+ ASSERT (ret == (size_t)(-2));
+ ASSERT (wc == (wchar_t) 0xBADFACE);
+ ASSERT (!mbsinit (&state));
+ input[1] = '\0';
+
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, input + 2, 5, &state);
+ ASSERT (ret == 1);
+ ASSERT (wctob (wc) == EOF);
+ ASSERT (wc == 0x00FC);
+ ASSERT (mbsinit (&state));
+ input[2] = '\0';
+
+ /* Test support of NULL first argument. */
+ ret = mbrtowc (NULL, input + 3, 4, &state);
+ ASSERT (ret == 2);
+ ASSERT (mbsinit (&state));
+
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, input + 3, 4, &state);
+ ASSERT (ret == 2);
+ ASSERT (wctob (wc) == EOF);
+ ASSERT (wc == 0x00DF);
+ ASSERT (mbsinit (&state));
+ input[3] = '\0';
+ input[4] = '\0';
+
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, input + 5, 2, &state);
+ ASSERT (ret == 1);
+ ASSERT (wc == 'e');
+ ASSERT (mbsinit (&state));
+ input[5] = '\0';
+
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, input + 6, 1, &state);
+ ASSERT (ret == 1);
+ ASSERT (wc == 'r');
+ ASSERT (mbsinit (&state));
+
+ /* Test some invalid input. */
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\377", 1, &state); /* 0xFF */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\303\300", 2, &state); /* 0xC3 0xC0 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\343\300", 2, &state); /* 0xE3 0xC0 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\343\300\200", 3, &state); /* 0xE3 0xC0 0x80 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\343\200\300", 3, &state); /* 0xE3 0x80 0xC0 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\363\300", 2, &state); /* 0xF3 0xC0 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\363\300\200\200", 4, &state); /* 0xF3 0xC0 0x80 0x80 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\363\200\300", 3, &state); /* 0xF3 0x80 0xC0 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\363\200\300\200", 4, &state); /* 0xF3 0x80 0xC0 0x80 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+
+ memset (&state, '\0', sizeof (mbstate_t));
+ wc = (wchar_t) 0xBADFACE;
+ ret = mbrtowc (&wc, "\363\200\200\300", 4, &state); /* 0xF3 0x80 0x80 0xC0 */
+ ASSERT (ret == (size_t)-1);
+ ASSERT (errno == EILSEQ);
+ }
+
+ return test_exit_status;
+#else
+ fputs ("Skipping test: not using the UCRT runtime\n", stderr);
+ return 77;
+#endif
+}
diff --git a/tests/test-mbrtowc-w32utf8.sh b/tests/test-mbrtowc-w32utf8.sh
new file mode 100755
index 0000000000..d0a953486c
--- /dev/null
+++ b/tests/test-mbrtowc-w32utf8.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+# Test the UTF-8 environment on native Windows.
+unset LC_ALL
+unset LC_CTYPE
+unset LC_MESSAGES
+unset LC_NUMERIC
+unset LC_COLLATE
+unset LC_MONETARY
+unset LC_TIME
+unset LANG
+${CHECKER} ./test-mbrtowc-w32utf8${EXEEXT}
--
2.43.0