On Wednesday, 18 May 2022 05:29:41 PDT Alvin Wong wrote:
> I am considering enabling UTF-8 as the activeCodePage on Windows
> (supported on Windows Version 1903 and beyond) [1] for Krita to
> improve our situation with using Unicode file paths when interacting
> with external C/C++ libraries. As I have not found any existing
> discussions on this topic, I am now investigating how Qt (5.12 in our
> case) would be affected under this configuration.
>
> I suspect that, since QString uses UTF-16 and Qt should already be
> using the -W version of Windows API, it should for the most part not
> affect the operations of Qt. As far as I know, the only component that
> would be affected is the system QTextCodec (qwindowscodec.cpp), which
> is also used by QString::fromLocal8Bit and QString::toLocal8Bit.
> Because it uses WideCharToMultiByte and MultiByteToWideChar with
> CP_ACP, when activeCodePage set to UTF-8, CP_ACP now uses UTF-8
> instead of the system ACP (e.g. Windows-1252, Big5, Shift JIS, ...)

Hello Alvin

Qt uses almost exclusively the W versions of the Win32 API. There are a
couple of cases of A use, but those are the exception and you don't have
to worry about them. Those and the "System" codec would be affected by
your switch to a different codepage, but that's your intention anyway
and I don't see a problem.

> In theory it should just work, but when reviewing qwindowscodec.cpp I
> noticed code [2] that seems like it assumes the MBCS has only two
> bytes maximum per character, which is not true for UTF-8 (in which a
> Unicode code point can be composed by up to 4 UTF-8 code units.) The
> same code exists in Qt 6, just moved to a different location [3]. As I
> am not familiar with how QTextCodec work, I cannot quite tell if this
> is a real issue or not. Can anyone here give some advice?

I think you're right and that code definitely looks fishy. And it looks
like we lost the tst_Utf8 test in the QStringConverter change,
particularly tst_Utf8::charByChar, which might have caught this. Looks
like tst_QStringConverter does not attempt to test the system codec very
well either, because we can't statically know what it can do. tst_Utf8
had a detection to see if the system codec happened to be UTF-8.

Since you're still on 5.12, tst_Utf8 (qtbase/tests/auto/codecs/utf8) is
there. Can you try to run it with the UTF-8 codepage and see if it
passes?

> I would also like to ask if Qt will officially support using UTF-8 as
> the ACP on Windows.

As far as I know, it already does. The Vietnamese locale for Windows has
been using UTF-8 for years (probably since forever) and there's no
reason that Qt shouldn't support it. Whether there are bugs or not is a
different story, of course.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Cloud Software Architect - Intel DCAI Cloud Engineering
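
To make the char-by-char concern concrete, here is a minimal, portable
C++ sketch of a decoder that is fed UTF-8 one byte at a time. It is not
the code from qwindowscodec.cpp and does not call the Win32 API; the
names ByteWiseUtf8Decoder, utf8SequenceLength and decodeByteByByte are
hypothetical. It only shows that a stateful decoder may need to wait for
up to three more bytes after a lead byte, which a "one saved lead byte"
(double-byte) assumption would not cover. For brevity it does not reject
overlong 3- and 4-byte forms or surrogate code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Length of the UTF-8 sequence introduced by `lead` (1 to 4), or 0 if
// `lead` is a continuation byte or cannot start a sequence.
inline std::size_t utf8SequenceLength(unsigned char lead)
{
    if (lead < 0x80) return 1;                    // ASCII
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

// Hypothetical helper (not Qt API): decodes UTF-8 fed one byte at a time.
class ByteWiseUtf8Decoder
{
public:
    // Feed one byte. Appends a code point to `out` when a sequence is
    // complete. Returns false on malformed input and resets the state.
    bool feed(unsigned char byte, std::vector<char32_t> &out)
    {
        if (m_needed == 0) {
            const std::size_t len = utf8SequenceLength(byte);
            if (len == 0)
                return false;
            if (len == 1) {
                out.push_back(byte);
                return true;
            }
            // Keep only the payload bits of the lead byte.
            m_value = byte & (0xFF >> (len + 1));
            m_needed = len - 1;
            return true;
        }
        if ((byte & 0xC0) != 0x80) {   // expected a continuation byte
            m_needed = 0;
            return false;
        }
        m_value = (m_value << 6) | (byte & 0x3F);
        if (--m_needed == 0)
            out.push_back(m_value);
        return true;
    }

    // Continuation bytes still required to finish the current sequence.
    std::size_t bytesStillNeeded() const
    {
        assert(m_needed <= 3);
        return m_needed;
    }

private:
    char32_t m_value = 0;
    std::size_t m_needed = 0;
};

// Decode a whole buffer strictly byte by byte, skipping malformed bytes.
inline std::vector<char32_t> decodeByteByByte(const std::string &bytes)
{
    ByteWiseUtf8Decoder decoder;
    std::vector<char32_t> out;
    for (unsigned char b : bytes)
        decoder.feed(b, out);
    return out;
}
```

Feeding the four bytes F0 9F 98 80 (U+1F600) one at a time leaves the
decoder waiting for three more bytes after the first call, so any
converter state that can hold only a single pending byte would have to
drop or mis-decode such input.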