On 2025-01-17 00:38, Lasse Collin wrote:
NAME_MAX is a POSIX constant; Windows doesn't define it. For this reason, assume that NAME_MAX is only about filenames in multibyte representation. In the UTF-8 code page, filenames can be up to 255 * 3 bytes excluding the terminating null character. (It's not 255 * 4 because four-byte UTF-8 characters consume two UTF-16 code units.)

If I have understood correctly, there is no Windows locale that supports a code page with longer encodings. For example, a single UTF-16 code unit may produce four bytes in GB18030 but it cannot be used as a locale code page.
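The 255 * 3 bound above can be checked with a small portable sketch. The helper `utf8_length_of_utf16()` is hypothetical (it is not part of this thread or of any library), and it assumes that an unpaired surrogate is converted as if it were U+FFFD, that is, to three bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the UTF-8 bytes needed to encode a UTF-16
   string of `n` code units. Assumption: an unpaired surrogate is counted
   as 3 bytes, as if it had been replaced by U+FFFD. */
static size_t
utf8_length_of_utf16(const uint16_t *s, size_t n)
{
    size_t bytes = 0;

    assert(s != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        uint16_t c = s[i];

        if (c < 0x80) {
            bytes += 1;   /* ASCII: 1 byte per code unit */
        } else if (c < 0x800) {
            bytes += 2;   /* U+0080..U+07FF: 2 bytes per code unit */
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < n
                   && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;   /* surrogate pair: 4 bytes for 2 code units */
            ++i;          /* skip the low surrogate */
        } else {
            bytes += 3;   /* rest of the BMP: 3 bytes per code unit */
        }
    }
    return bytes;
}
```

With 255 code units of U+FFFF this returns 765 (255 * 3). A name made of surrogate pairs costs only two bytes per code unit, so it cannot exceed that bound.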
It seems so.

Although it is possible to pass GB 18030 (code page 54936) to `WideCharToMultiByte()`, Windows in Simplified Chinese uses GBK by default, which itself is an extension to GB/T 2312-1980 and reuses its identifier (code page 936).
While GBK includes many traditional Chinese and Japanese characters, it does not seem to support four-byte characters in GB 18030:
UCRT64 ~/Desktop/t $ cat find_files.c
#define WIN32_LEAN_AND_MEAN  1
#include <windows.h>
#include <stdio.h>

int main(void)
{
  printf("active code page = %d\n", GetACP());

  WIN32_FIND_DATAA file;
  HANDLE h = FindFirstFileA("*", &file);
  if(h != INVALID_HANDLE_VALUE) {
    do {
      int n = lstrlenA(file.cFileName);
      printf("found '%s': %d byte(s):", file.cFileName, n);
      for(int i = 0; i < n; ++i)
        printf(" %.2hhx", file.cFileName[i]);
      printf("\n");
    } while(FindNextFileA(h, &file));
    FindClose(h);
  }
}

UCRT64 ~/Desktop/t $ touch $'\uFFFF'  # four bytes in GB 18030: 84 31 a4 39
UCRT64 ~/Desktop/t $ touch '测试文件'
UCRT64 ~/Desktop/t $ touch '測試檔案'
UCRT64 ~/Desktop/t $ touch 'テストファイル'

UCRT64 ~/Desktop/t $ ls -l
total 1
-rw-r--r-- 1 lh_mouse lh_mouse   0 Jan 18 15:25 ''$'\357\277\277'
-rw-r--r-- 1 lh_mouse lh_mouse 542 Jan 18 15:23 find_files.c
-rw-r--r-- 1 lh_mouse lh_mouse   0 Jan 18 15:25 テストファイル
-rw-r--r-- 1 lh_mouse lh_mouse   0 Jan 18 15:25 测试文件
-rw-r--r-- 1 lh_mouse lh_mouse   0 Jan 18 15:25 測試檔案

UCRT64 ~/Desktop/t $ gcc find_files.c -o find_files.exe -Wall -Wextra

UCRT64 ~/Desktop/t $ ./find_files.exe
active code page = 936
found '.': 1 byte(s): 2e
found '..': 2 byte(s): 2e 2e
found 'find_files.c': 12 byte(s): 66 69 6e 64 5f 66 69 6c 65 73 2e 63
found 'find_files.exe': 14 byte(s): 66 69 6e 64 5f 66 69 6c 65 73 2e 65 78 65
found 'テストファイル': 14 byte(s): a5 c6 a5 b9 a5 c8 a5 d5 a5 a1 a5 a4 a5 eb
found '测试文件': 8 byte(s): b2 e2 ca d4 ce c4 bc fe
found '測試檔案': 8 byte(s): 9c 79 d4 87 99 6e b0 b8
found '?': 1 byte(s): 3f

--
Best regards,
LIU Hao
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public