On 2024-12-23 Bruno Haible wrote:
> Lasse Collin reported in
> <https://lists.gnu.org/archive/html/bug-gettext/2024-12/msg00111.html>
> that the setlocale() override from GNU libintl does not support the
> UTF-8 environment of native Windows correctly. That setlocale()
> override is based on the setlocale() override from gnulib. So let me
> add that support here.

Thanks! I looked at the commits but I didn't test anything yet.

(1) In 9f7ff4f423cd ("localename-unsafe: Support the UTF-8 environment
on native Windows."), the N(name) macro is used with strings that
include @modifier. For example, N("az_AZ@cyrillic") can expand to
"az...@cyrillic.utf-8". Similarly in 00211fc69c92 ("setlocale: Support
the UTF-8 environment on native Windows."), ".65001" is appended after
the @modifier. However, the typical order would be
az_AZ.UTF-8@cyrillic.

I suppose you had a reason to use .65001 instead of .UTF-8 or .utf8.
I expect identical behavior from those. The MS setlocale() docs use
variants of .UTF8:

https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support

(2) In 2f4391fde862 ("setlocale tests: Test in the UTF-8 environment on
native Windows."), the condition

    (strlen (name) > 6 && strcmp (name + strlen (name) - 6, ".UTF-8") == 0)

matches the two long strings below it too, making those two extra
strcmp calls redundant.

(3) When a manifest is added via a resource file, a possible default
manifest from the toolchain is replaced; they aren't merged. For
example, on MSYS2, the mingw-w64-ucrt-x86_64-gcc package depends on
mingw-w64-ucrt-x86_64-windows-default-manifest. The manifest comes from
Cygwin:

https://sourceware.org/git/?p=cygwin-apps/windows-default-manifest.git;a=blob;f=default-manifest.rc

Omitting the <compatibility> section makes the application run with
Vista as the Operating System Context. Omitting the <trustInfo> section
makes Windows treat the application as not UAC compliant, that is, a
pre-Vista app that needs compatibility tricks.

Probably these don't matter with the current tests. I still suggest
changing it because it's an odd combination to have UTF-8 without
marking the app compatible with recent Windows versions.

(4) The output from windres goes to a file with the .res suffix but the
format is overridden with --output-format=coff.
This looks weird because windres defaults to --output-format=res for
files that use the .res suffix. For coff, the .o suffix would be
logical, and --output-format option wouldn't be needed. See the
paragraphs near the beginning of the info node (binutils)windres.
A simple command should be enough:

    windres input.rc output.o

> In fact, there are apparently two variants of this mode:
> - the legacy Windows settings variant: when you haven't ever
>   (or recently?) changed the system default locale of Windows 10,
> - the modern Windows settings variant: when you have changed
>   the system default locale of Windows 10.
> With the legacy Windows settings, the setlocale() function produces
> locale names such as "English_United States.65001" or
> "English_United States.utf8". With the modern Windows settings, it
> produces "en_US.UTF-8" instead. (This is with both mingw and MSVC,
> according to my testing.)

I don't know enough about Windows to comment much. I only tested on one
Win10 system which returned the long spellings.

If native setlocale(LC_ALL, "") can indeed result in "en_US" or
"en_US.UTF-8", I wonder if it can result in "az-Cyrl_AZ.UTF-8" too.
I don't see how Gnulib or Gettext would map such a locale name to
az_AZ.UTF-8@cyrillic.

(az_AZ@cyrillic was the first one with @ in localename-unsafe.c, thus
I looked at that in MS docs too.) The codeset seems to be a part of the
language name:

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-lcid/a9eac961-e77d-41a6-90a5-ce1a8b0cdb9c

Locale format doesn't use @modifier:

https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-170

-- 
Lasse Collin
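
P.S. To illustrate the ordering from point (1), here is a minimal
sketch in C. It assumes a hypothetical runtime helper named
insert_codeset(); nothing like it exists in gnulib or gettext as far as
I know, and the real N(name) macro works on string literals, so this
only shows the intended result (codeset before the @modifier), not how
the commits should be changed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not from gnulib/gettext).
   Returns a newly allocated copy of NAME with CODESET (for example
   ".UTF-8") placed before the first '@', or appended at the end when
   NAME has no @modifier.  Returns NULL if memory allocation fails.
   The caller must free the result.  */
char *
insert_codeset (const char *name, const char *codeset)
{
  const char *at = strchr (name, '@');
  size_t head_len = at != NULL ? (size_t) (at - name) : strlen (name);
  size_t codeset_len = strlen (codeset);
  size_t tail_len = strlen (name + head_len);
  char *result = malloc (head_len + codeset_len + tail_len + 1);

  if (result == NULL)
    return NULL;

  /* language_TERRITORY part */
  memcpy (result, name, head_len);
  /* codeset, e.g. ".UTF-8" */
  memcpy (result + head_len, codeset, codeset_len);
  /* "@modifier" (possibly empty) plus the terminating NUL */
  memcpy (result + head_len + codeset_len, name + head_len, tail_len + 1);
  return result;
}
```

With this, insert_codeset ("az_AZ@cyrillic", ".UTF-8") yields
"az_AZ.UTF-8@cyrillic", while a name without a modifier such as "en_US"
simply gets the codeset appended.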