Re: \c escape within $'...' can produce mangled UTF-8

Dmitry Groshev Sat, 14 Aug 2010 14:08:35 -0700

On 15/08/2010, Chet Ramey <chet.ra...@case.edu> wrote:
> I'm not sure why you think this is a bug.


Because the documentation says "backslash-escaped _characters_", and
not "bytes"? ;-)

> The \c escape is described
> as converting to a control character; control characters are always a
> single byte; the conversion to a control character therefore consumes
> one byte.

This leap of illogic is beyond my ken. As a counterexample, the
"\x{...}" escape can consume an unlimited number of bytes while
producing a single byte.

> It's not the business of $'...' conversion to ensure that
> the result is a valid multibyte character string.

Is its business to produce invalid UTF-8 when given a nonsense escape,
then? And is the rest of the code quite prepared to deal with invalid
multibyte chars springing into existence at this point?
If an escape's parameter makes no sense, the escape sequence should be
left untranslated - just the way "\x" handles things like "\xZZ". Make
"\c" check that its parameter is an ASCII char, and the problem will
be fixed.
Unless for some reason you consider this bug worth preserving. :-)
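
To make the difference concrete, here is a small Python model of the two
behaviours. It is only a sketch, not bash's actual code: the function
names are hypothetical, and it assumes that "\c" maps its argument to a
control character by keeping the low five bits (masking with 0x1f),
which is the conventional mapping. The first function converts just the
first byte of the argument, the second one translates only when the
argument is an ASCII char and otherwise leaves the escape untranslated.

```python
def ctrl_bytewise(arg: str) -> bytes:
    """Model of a byte-oriented \\c: only the first byte is converted."""
    raw = arg.encode("utf-8")
    if not raw:
        return b"\\c"
    # Assumed mapping: keep the low five bits of the first byte.
    # Any remaining bytes of a multibyte character are left behind as-is.
    return bytes([raw[0] & 0x1F]) + raw[1:]


def ctrl_checked(arg: str) -> bytes:
    """Model of the proposed fix: translate only an ASCII argument."""
    if arg and ord(arg[0]) < 0x80:
        return bytes([ord(arg[0]) & 0x1F]) + arg[1:].encode("utf-8")
    # Nonsense argument: leave the escape sequence untranslated.
    return b"\\c" + arg.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for arg in ("a", "\u00e9"):  # ASCII 'a', then e-acute (0xC3 0xA9 in UTF-8)
        for convert in (ctrl_bytewise, ctrl_checked):
            out = convert(arg)
            print(convert.__name__, repr(arg), out, is_valid_utf8(out))
```

In this model the byte-wise version turns the two-byte sequence 0xC3 0xA9
into 0x03 followed by a stray continuation byte 0xA9, which is not valid
UTF-8, while the checked version leaves the input bytes intact after a
literal "\c".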


-- 
-= With best regards, Dmitry Groshev =-
