On Wed, Feb 16, 2011 at 01:01:07AM +0100, Vincent Lefevre wrote:
> On 2011-02-14 16:43:11 +0000, Ian Jackson wrote:
> > When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode
> > characters to stdout should use UTF-8.  That's what LC_CTYPE means.
> 
> So, "cat", "grep", etc. are all broken. :)

How come?

"cat" will, for any valid UTF-8 character on input, print a valid UTF-8
character on output.  For any valid ISO-8859-1 character on input, it will
print a valid ISO-8859-1 character on output.  Plain "cat" only copies
bytes and never interprets them, so it has no encoding to get wrong.
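
A quick way to check this, as a sketch assuming a POSIX shell with printf,
cat, od and tr available.  The letter "é" and its byte values are my choice
for illustration (UTF-8: C3 A9, ISO-8859-1: E9); the variable names are
made up.

```shell
# "é" is the two bytes C3 A9 in UTF-8 and the single byte E9 in ISO-8859-1.
# LC_ALL=C is set on purpose: the bytes should come out unchanged even when
# the locale does not match the encoding of the data.
utf8_out=$(printf '\303\251' | LC_ALL=C cat | od -An -tx1 | tr -d ' \n')
latin1_out=$(printf '\351' | LC_ALL=C cat | od -An -tx1 | tr -d ' \n')

echo "UTF-8 through cat:      $utf8_out"
echo "ISO-8859-1 through cat: $latin1_out"
```

Both lines should show the same hex bytes that went in, whatever the locale
is set to.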

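"grep", on the other hand, has to actually understand the encoding, and it
does.  Try this:
$ echo "ą"|LC_CTYPE=C grep --color=always .
The output will be mangled: "." matches each byte separately, so the colour
escapes land in the middle of the two-byte UTF-8 sequence.
$ echo "ą"|LC_CTYPE=en_US.utf-8 grep --color=always .
This will be handled correctly: "." matches the whole character.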
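
The same difference can be seen without colour escapes by counting matches.
This is only a sketch: it assumes a reasonably recent GNU grep and that a
locale named en_US.UTF-8 is installed (if it is not, grep falls back to the
C locale and both counts will be the same).  The variable names are made up.

```shell
# "ą" is the two bytes C4 85 in UTF-8.  grep -o prints every match of "."
# on its own line, so wc -l gives the number of matches.
count_c=$(printf '\304\205\n' | LC_ALL=C grep -o . | wc -l | tr -d ' ')

# Assumption: en_US.UTF-8 exists on this system; otherwise this behaves
# like the C locale case above.
count_utf8=$(printf '\304\205\n' | LC_ALL=en_US.UTF-8 grep -o . | wc -l | tr -d ' ')

echo "matches of '.' in the C locale:     $count_c"
echo "matches of '.' in a UTF-8 locale:   $count_utf8"
```

I would expect two matches in the C locale (one per byte) and a single match
in the UTF-8 locale (one per character).  LC_ALL is used here instead of
LC_CTYPE so that an LC_ALL already set in the environment cannot override it.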

-- 
1KB             // Microsoft corollary to Hanlon's razor:
                //      Never attribute to stupidity what can be
                //      adequately explained by malice.


