Paul Eggert wrote:
> However, my worry is that good support for non-ASCII-safe encodings like
> Shift-JIS is hard to do, and that any such support we'd add to
> Gnulib/coreutils/etc. would not only increase maintenance costs and
> reduce runtime performance

Shift_JIS is not the only non-ASCII-safe encoding; GB18030, BIG5,
BIG5-HKSCS, and GBK are as well, and among these GB18030 is used as a
locale encoding in China. Therefore it is important for programs to
support these locale encodings.

Gnulib has support for this:

  - It has replacement functions that operate correctly with these
    locale encodings:

      strstr, c_strstr -> mbsstr
      strchr           -> mbschr
      strrchr          -> mbsrchr
      strspn           -> mbsspn
      strcspn          -> mbscspn
      strpbrk          -> mbspbrk
      strsep           -> mbssep
      strtok_r         -> mbstok_r

  - It has warnings (through _GL_WARN_ON_USE) for uses of the functions
    that are not OK for non-ASCII-safe encodings.

  - It has modules mbchar, mbiter, mbfile for iterating through the
    multibyte characters of a string or file, that work for all locale
    encodings.

Yes, using these safer functions does reduce performance. I have shown
in the past, through coreutils patches, how to accommodate both a
"fast path" and a "safe path" in the same binary.

Bruno
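
To make the strchr -> mbschr point concrete, here is a minimal sketch of
why a byte-wise search goes wrong in Shift_JIS, and of what a "fast
path" next to a "safe path" might look like. This is not Gnulib's
mbschr implementation: the names sjis_is_lead_byte, sjis_chr_safe and
sjis_chr are hypothetical, and the lead-byte ranges are hard-coded for
Shift_JIS rather than taken from the current locale. The sketch relies
on two facts about Shift_JIS: lead bytes are 0x81..0x9F and 0xE0..0xFC,
and trail bytes are never below 0x40.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if B starts a two-byte Shift_JIS character. */
static bool sjis_is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Safe path: walk the string character by character, so that a trail
   byte is never mistaken for a single-byte character.  C must be a
   single-byte character. */
static const char *sjis_chr_safe(const char *s, char c)
{
    for (; *s != '\0'; s++) {
        if (sjis_is_lead_byte((unsigned char) *s)) {
            if (s[1] == '\0')
                break;      /* truncated character at end of string */
            s++;            /* step onto the trail byte; the loop
                               increment then moves past it */
            continue;
        }
        if (*s == c)
            return s;
    }
    return NULL;
}

/* Dispatcher: a byte below 0x40 can never be a Shift_JIS trail byte,
   so plain strchr is correct for it (fast path).  Anything else takes
   the character-aware scan (safe path). */
static const char *sjis_chr(const char *s, char c)
{
    if ((unsigned char) c < 0x40)
        return strchr(s, c);
    return sjis_chr_safe(s, c);
}
```

For example, the katakana character U+30BD is encoded in Shift_JIS as
the bytes 0x83 0x5C, and 0x5C is also the ASCII backslash. On the byte
string 0x83 0x5C '=' '\\', strchr reports a backslash at offset 1, in
the middle of that character, whereas sjis_chr finds the real one at
offset 3.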