On 2025-7-3 00:04, Kirill Makurin wrote:
More information about this issue for anyone who's interested.

It seems that the return value of mbrtowc is broken only with DBCS code pages. The return value is correct when using UCRT's mbrtowc with UTF-8 locales.

The issue with mbrtowc has a direct effect on mbsrtowcs:

```
const char *str = /* some DBCS string */;
wchar_t buffer[SUFFICIENT_BUFFER_SIZE];
mbstate_t state = {0};

mbrtowc (buffer, str, 1, &state);
str += 1;
mbsrtowcs (buffer, &str, SUFFICIENT_BUFFER_SIZE, &state);
```

The call to `mbrtowc` will return (size_t)-2 and update `state`. The call to `mbsrtowcs` will correctly convert the first character in str; however, since `mbrtowc` returns 2 at that point (rather than 1, the number of bytes it actually consumed to complete the character), it skips two bytes in str instead of one as it should. The result is either an incorrectly converted string or a conversion failure, depending on what byte is pointed to by str+2.
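For illustration, the pattern above can be wrapped in a small helper. This is only a sketch: `convert_after_partial` is a hypothetical name (it is not part of mingw-w64 or any CRT), and it assumes the caller has already selected the locale with `setlocale`. With a conforming C library, the wide string it produces should be the same as if the whole input had been passed to `mbsrtowcs` in one go.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not taken from the thread: feed the first byte of
   `src` to mbrtowc on its own, then hand the rest of the string to
   mbsrtowcs using the same conversion state.
   Returns the number of wide characters written (excluding the
   terminator), or (size_t)-1 on a conversion error or if dstlen is 0. */
size_t convert_after_partial(const char *src, wchar_t *dst, size_t dstlen)
{
    mbstate_t state;
    size_t written = 0;
    size_t r, n;

    if (dstlen == 0)
        return (size_t)-1;

    memset(&state, 0, sizeof state);   /* initial conversion state */

    r = mbrtowc(dst, src, 1, &state);
    if (r == (size_t)-1)
        return (size_t)-1;             /* invalid byte */
    if (r == 0)
        return 0;                      /* empty input; dst[0] is L'\0' */
    if (r != (size_t)-2)
        written = 1;                   /* complete single-byte character */
    /* Otherwise r == (size_t)-2: a lead byte was consumed and remembered
       in `state`, and nothing was stored in dst yet. */
    src += 1;                          /* exactly one byte was consumed */

    /* Continue with the same state. A conforming implementation consumes
       only the bytes still needed to complete the pending character. */
    n = mbsrtowcs(dst + written, &src, dstlen - written, &state);
    if (n == (size_t)-1)
        return (size_t)-1;
    return written + n;
}
```

On an affected CRT with a DBCS code page, I would expect this helper to return a wrong string or (size_t)-1 for input that starts with a double-byte character, as described above.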
This is not specific to UCRT. It looks like legacy CRT behavior; all MSVCR* DLLs are also affected.

I have no idea how widely locales are used on Windows. At least we (well, I) don't use them; we use points as decimal separators, as in English, and nobody prefers locales to the Windows APIs.
If we decide to provide replacements for the broken Microsoft functions and nobody ends up using them, there's probably little value in doing that.
-- Best regards, LIU Hao
_______________________________________________ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public