On Sun, 2017-06-25 at 12:23 -0400, Chet Ramey wrote:
> On 6/24/17 1:41 PM, Eduardo A. Bustamante López wrote:
>
> >
> > dualbus@debian:~$ LANG=zh_CN.GBK printf '\u4e57' | od -tx1 -An
> > 81 5c
> >
> > It looks like it doesn't detect that \x81\x5c is a single character, and
> > instead treats the multibyte character as separate characters.
> It's apparently not a single character in that locale.
>
Yes it is!
https://en.wikipedia.org/wiki/GBK
\x81 \x5C is a two-byte character from level GBK/3.
But unless I've misunderstood something, it already seems to be behaving
correctly, except inside $'..' quotes.
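
For what it's worth, here is a quick sanity check of the two-byte claim done
outside the shell. It is only a sketch: it assumes Python's built-in "gbk"
codec agrees with the zh_CN.GBK locale's charmap for this code point, and the
variable names are mine. It also shows that the trail byte on its own is the
ASCII backslash, which may be relevant to the $'..' case.

```python
# The two bytes that od printed in the quoted example.
raw = bytes([0x81, 0x5C])

# Decoded as GBK (Python's built-in codec, assumed to match the locale),
# the pair forms a single character.
decoded = raw.decode("gbk")
print(len(decoded), hex(ord(decoded)))

# Taken alone, the trail byte 0x5C is the ASCII backslash.
trail_is_backslash = raw[1:] == b"\\"
print(trail_is_backslash)
```

Anything that walks the string one byte at a time would see that trail byte
as a backslash rather than as the second half of a character.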