On Sun, 2017-06-25 at 12:23 -0400, Chet Ramey wrote:
> On 6/24/17 1:41 PM, Eduardo A. Bustamante López wrote:
> 
> > 
> >   dualbus@debian:~$ LANG=zh_CN.GBK printf '\u4e57' | od -tx1 -An
> >    81 5c
> > 
> > It looks like it doesn't detect that \x81\x5c is a single character, and
> > instead treats the multibyte character as separate characters.
> It's apparently not a single character in that locale.
> 
Yes it is!
https://en.wikipedia.org/wiki/GBK
\x81 \x5C is a two-byte character from level GBK/3.
But unless I've misunderstood something, bash already seems to handle it
correctly, except inside $'..' quotes.
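
For what it's worth, the byte sequence can be checked even on a machine
without the zh_CN.GBK locale installed. This is only a minimal sketch: it
assumes an iconv that supports the GBK charset, and gbk_bytes is a name made
up here for illustration. It feeds U+4E57 in as UTF-8 and dumps the GBK
encoding, which should match the two bytes in the od output quoted above.

```shell
# Convert U+4E57 (UTF-8 bytes e4 b9 97, written as octal escapes so that
# any POSIX printf accepts them) to GBK and print the result as hex.
# Assumption: the local iconv knows the GBK charset.
gbk_bytes() {
    printf '\344\271\227' | iconv -f UTF-8 -t GBK | od -An -tx1 | tr -d ' \n'
}

gbk_bytes; echo
```

Note that the second byte, 0x5c, has the same value as an ASCII backslash,
so any code that scans the string byte by byte rather than character by
character could mistake it for an escape. Whether that is what happens
inside $'..' is a guess on my part, not something I have traced.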
