Initially it was a simple patch to make `btowc` and `wctob` match UCRT behavior. If we make serious changes to `btowc` and `wctob`, I think we should also take a look at the `mb*towc*` and `wc*tomb*` functions provided by mingw-w64.
I am not saying, and I do not think, that we should replace the `mb*towc*` and `wc*tomb*` functions for UCRT. What we can do is make sure that the provided replacements match the CRT's behavior (e.g. use lossy conversion and follow this strange "C" locale behavior). At that point it would be easier to implement `btowc` and `wctob` in terms of `mbrtowc` and `wcrtomb` respectively.

I suggest we start a new discussion in a new thread. I have some other details regarding the CRT's locale support, since I am currently working on code which implements POSIX locale functions on top of Win32 and the CRT.

- Kirill Makurin

________________________________
From: LIU Hao
Sent: Saturday, June 14, 2025 8:55 PM
To: Kirill Makurin; mingw-w64-public
Subject: Re: [Mingw-w64-public] Inconsistent behavior of btowc with "C" locale

On 2025-6-8 00:21, Kirill Makurin wrote:
> I guess sticking to range [0,255] is our best choice.
>
> I attached patches.
>

Mostly these look good to me. However, I get errors from the libc++ testsuite:

   https://github.com/lhmouse/mingw-w64/actions/runs/15650737822/job/44095645474#step:7:13365

The failure is at the assertion below. It can be reproduced by installing the mingw-w64 CRT with the first patch and compiling the testcase with `clang++ -static`:

       std::locale l;
       typedef std::ctype_byname<wchar_t> F;
       std::locale ll(l, new F("C"));
       const F& f = std::use_facet<F>(ll);
       assert(f.widen(char(-5)) == L'\u00fb');

And here's the backtrace:

   #0  0x00007ff657205139 in btowc (c=-5) at misc/btowc.c:16
   #1  0x00007ff6571fcd61 in std::__1::__locale::__btowc(int, std::__1::__locale::__locale_t) ()
   #2  0x00007ff6571dda9a in std::__1::ctype_byname<wchar_t>::do_widen(char) const ()
   #3  0x00007ff6571b19ac in std::__1::ctype<wchar_t>::widen[abi:ne200100](char) const (this=0x5b9c40, __c=-5 '\373')
       at C:/MSYS64/clang64/include/c++/v1/__locale:490
   #4  0x00007ff6571b1884 in main () at test.cc:37

Here we can see that the parameter `c` of type `int` is a sign-extension of the argument, so I think this

    if (cp == 0)
      return (unsigned) c <= 0xFF ? c : WEOF;

is being too strict. What if we blindly truncate `c`, just like the code beneath it:

    if (cp == 0)
      return (unsigned char) c;


--
Best regards,
LIU Hao

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public