On 2025-7-3 00:04, Kirill Makurin wrote:
More information about this issue for anyone who's interested.

It seems that the return value of mbrtowc is broken only with DBCS code pages. The 
return value is correct when using UCRT's mbrtowc with UTF-8 locales.

The issue with mbrtowc directly affects mbsrtowcs:

```
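const char *str = /* some DBCS string */;
wchar_t buffer[SUFFICIENT_BUFFER_SIZE];
mbstate_t state = {0};
/* Feed only the lead byte; the character is still incomplete. */
mbrtowc (buffer, str, 1, &state);
str += 1;
/* Continue from the trail byte, reusing the same state. */
mbsrtowcs (buffer, &str, SUFFICIENT_BUFFER_SIZE, &state);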
```

The call to `mbrtowc` returns (size_t)-2 and updates `state`. The call to `mbsrtowcs` 
then converts the first character in str correctly; however, since `mbrtowc` returns 
2 when it completes that character, it skips two bytes in str instead of the one 
that remains. The result is either an incorrectly converted string or a conversion 
failure, depending on which byte str+2 points to.
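
For illustration, here is a minimal sketch of the bookkeeping the C standard expects from a caller that feeds input one byte at a time. With n equal to 1, a conforming `mbrtowc` can only return (size_t)-1, (size_t)-2, 0 or 1, so the caller always advances by exactly one byte. The helper name `convert_bytewise` is hypothetical and not part of any CRT, and whether a loop like this avoids the DBCS problem on the affected CRTs is an assumption that has not been tested here.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: feeds `len` bytes of `s` to mbrtowc one byte at a
   time, storing each completed wide character in `out`. Returns the number
   of wide characters stored, or (size_t)-1 on an encoding error.
   `out` must have room for at least `len` elements. */
static size_t convert_bytewise(const char *s, size_t len,
                               wchar_t *out, mbstate_t *st)
{
    size_t count = 0;

    assert(s != NULL && out != NULL && st != NULL);

    for (size_t i = 0; i < len; i++) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s + i, 1, st);

        if (r == (size_t)-1)
            return (size_t)-1;   /* invalid sequence */
        if (r == (size_t)-2)
            continue;            /* byte absorbed into *st, nothing stored yet */

        /* A character was completed (r is 0 for the null character). The
           loop advances by one byte regardless of the magnitude of r,
           because only one byte was supplied. */
        out[count++] = wc;
    }
    return count;
}
```

Because the loop never uses the magnitude of the return value to advance, it depends only on `mbrtowc` distinguishing "error", "incomplete" and "completed".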

This is not specific to UCRT. It looks like legacy CRT behavior; all MSVCR* 
DLLs are also affected.

I have no idea how widely locales are used on Windows. At least we (well, I) don't use them; we use points as decimal separators, as in English, and nobody prefers locales to Windows APIs.

If we provide replacements for the broken Microsoft functions and nobody ends up using them, there's probably little value in doing so.



--
Best regards,
LIU Hao

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public