Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-15 Thread Vincent Lefevre
On 2011-02-16 01:34:51 +0100, Adam Borowski wrote: > On Wed, Feb 16, 2011 at 01:01:07AM +0100, Vincent Lefevre wrote: > > On 2011-02-14 16:43:11 +, Ian Jackson wrote: > > > When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode > > > characters to stdout should use UTF-8. That's wh

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-15 Thread Adam Borowski
On Wed, Feb 16, 2011 at 01:01:07AM +0100, Vincent Lefevre wrote: > On 2011-02-14 16:43:11 +, Ian Jackson wrote: > > When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode > > characters to stdout should use UTF-8. That's what LC_CTYPE means. > > So, "cat", "grep", etc. are all brok

Re: OT: Python

2011-02-15 Thread Vincent Lefevre
On 2011-02-14 13:11:04 -0800, Russ Allbery wrote: > Perl is specifically documented to not do this for backward compatibility > reasons. In Perl, which is the one I know best, you are required to > decode input and encode output if you want to have UTF-8 handling. Or better, use the -C option. p

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-15 Thread Vincent Lefevre
On 2011-02-14 16:43:11 +, Ian Jackson wrote: > When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode > characters to stdout should use UTF-8. That's what LC_CTYPE means. So, "cat", "grep", etc. are all broken. :)

Re: OT: Python

2011-02-14 Thread Russ Allbery
Ian Jackson writes: > Klaus Ethgen writes: >> No, it is not. 00a3 is just not a utf-8 character, it is unicode. To >> get a correct utf-8 character you need to print \x{c2a3} and then >> isutf8 is happy. > When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode > characters to stdout

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Konstantin Khomoutov
On Mon, 14 Feb 2011 16:43:11 + Ian Jackson wrote: > Klaus Ethgen writes ("Re: OT: Python (was: Make Unicode bugs release > critical?)"): > > No, it is not. 00a3 is just not a utf-8 character, it is unicode. > > To get a correct utf-8 character you need to print

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Ian Jackson
Klaus Ethgen writes ("Re: OT: Python (was: Make Unicode bugs release critical?)"): > No, it is not. 00a3 is just not a utf-8 character, it is unicode. To get > a correct utf-8 character you need to print \x{c2a3} and then isutf8 is > happy. When LC_CTYPE=en_GB.utf-8, progra

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Klaus Ethgen
On Mon, 14 Feb 2011 at 16:24, Ian Jackson wrote: > Jakub Wilk writes ("Re: OT: Python (was: Make Unicode bugs release > critical?)"): > > * Klaus Ethgen , 2011-02-14, 14:37: > > >~> LC_CTYPE=en_G

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Ian Jackson
Jakub Wilk writes ("Re: OT: Python (was: Make Unicode bugs release critical?)"): > * Klaus Ethgen , 2011-02-14, 14:37: > >~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' > >~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n"

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Adam Borowski
On Mon, Feb 14, 2011 at 02:02:11PM +, Philipp Kern wrote: > On 2011-02-14, Klaus Ethgen wrote: > > ~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' > > ~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat > > Both give the same result, a '£' sign as expected. > > And what's the v

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Klaus Ethgen
On Mon, 14 Feb 2011 at 15:15, Lars Wirzenius wrote: > On Mon, 2011-02-14 at 14:37 +0100, Klaus Ethgen wrote: > > let's start a Python rant. I love to hate this language. :-) > > Let's not. 'Till here it is personal desire. > Let's not rant about

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Jakub Wilk
* Klaus Ethgen , 2011-02-14, 14:37: ~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' ~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat Let me try... $ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | isutf8 stdin: line 1, char 1, byte offset 1: invalid UTF-8 code But I don
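
A minimal Python sketch of the byte-level point under dispute here. It assumes that the Perl one-liner, run without -C or an explicit encoding layer, writes U+00A3 as the single byte 0xA3, which would match the isutf8 complaint at byte offset 1. The helper is_valid_utf8 is a hypothetical stand-in for what isutf8 checks, not its actual implementation.

```python
# U+00A3 POUND SIGN as an abstract Unicode code point.
pound = "\u00a3"

# The same character serialized two different ways.
utf8_bytes = pound.encode("utf-8")      # two bytes: C2 A3
latin1_bytes = pound.encode("latin-1")  # one byte:  A3


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical stand-in for isutf8: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Append the newline that the one-liner also prints.
print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes + b"\n"))
print(latin1_bytes.hex(), is_valid_utf8(latin1_bytes + b"\n"))
```

The lone byte A3 is a UTF-8 continuation byte with no lead byte before it, so a strict decoder rejects it, while a UTF-8 terminal may still render something for it. Only the two-byte sequence C2 A3 is the UTF-8 form of '£'.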

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Lars Wirzenius
On Mon, 2011-02-14 at 14:37 +0100, Klaus Ethgen wrote: > let's start a Python rant. I love to hate this language. :-) Let's not. Let's not rant about any languages, or tools, or desktop environments. Let's be constructive on Debian mailing lists, shall we? We have plenty of side-channels for rants

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Philipp Kern
On 2011-02-14, Klaus Ethgen wrote: > ~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' > ~> LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat > Both give the same result, a '£' sign as expected. And what's the value in that demonstration? Yes, you can treat UTF-8 like a bytestream. A