On 15/08/2010, Dennis Williamson <dennistwilliam...@gmail.com> wrote:
> It only consumes two bytes on my system (or one if it's followed by
> another escape or a closing quote).

You are wrong. Try "echo $'\x{123456}AB'" and look at the result. Or
read the source code: lib/sh/strtrans.c

> "Backslash-escaped characters" refers to the "c" in "\c" not the
> characters that follow it.

The documentation doesn't say anything like that anywhere. And given
that _every other escape_ operates on characters (accepting only ASCII
chars and leaving multibyte ones alone), inventing an exception
specifically for "\c" would look quite contrived.

> It's the responsibility of your code to put an ASCII character after
> the \c.

My code is fine, thank you. ;-) I never had any use for "\c" when there
is "\x". I found this weirdness in the Bash source code while writing
my own function for interpreting (some of) the shell syntax.

> There's no way for Bash to guess that the 0xD0 is part of a
> Unicode character or the byte that it is.

Every byte between 0x80 and 0xFF is part of a (possibly invalid)
multibyte sequence in UTF-8. Read up on the UTF-8 encoding, and don't
make wrong guesses again.

-- 
-= With best regards, Dmitry Groshev =-
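
P.S. To illustrate the point about bytes 0x80..0xFF, here is a minimal
sketch that classifies a single byte by its role in UTF-8. The function
name classify_byte is hypothetical (it is not part of Bash), and the
ranges are those of the UTF-8 encoding as I understand it, so treat this
as an illustration and not as what Bash itself does:

```shell
#!/bin/sh
# classify_byte: hypothetical helper, not part of Bash.
# Argument: one byte value written as two hex digits (e.g. D0).
# Prints the role that byte value plays in the UTF-8 encoding.
classify_byte() {
    b=$(( 0x$1 ))                                       # hex digits -> integer
    if   [ "$b" -lt 128 ]; then echo "ascii"            # 00..7F
    elif [ "$b" -lt 192 ]; then echo "continuation"     # 80..BF
    elif [ "$b" -lt 194 ]; then echo "invalid"          # C0..C1 (overlong lead)
    elif [ "$b" -lt 224 ]; then echo "lead-2"           # C2..DF
    elif [ "$b" -lt 240 ]; then echo "lead-3"           # E0..EF
    elif [ "$b" -lt 245 ]; then echo "lead-4"           # F0..F4
    else                        echo "invalid"          # F5..FF
    fi
}

classify_byte D0    # the byte from the quoted example
classify_byte 41    # the letter "A"
```

No value from 0x80 upwards is classified as "ascii", so a byte such as
0xD0 cannot be a standalone character in a UTF-8 locale.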