Re: Unicode questions

2010-10-26 Thread Steve Holden
On 10/26/2010 12:32 PM, John Nagle wrote: > On 10/19/2010 12:02 PM, Tobiah wrote: >> I've been reading about the Unicode today. >> I'm only vaguely understanding what it is >> and how it works. >> >> Please correct my understanding where it is lacking. > > http://justfuckinggoogleit.com/ Neit

Re: Unicode questions

2010-10-26 Thread John Nagle
On 10/19/2010 12:02 PM, Tobiah wrote: > I've been reading about the Unicode today. > I'm only vaguely understanding what it is > and how it works. > > Please correct my understanding where it is lacking. http://justfuckinggoogleit.com/ -- http://mail.python.org/mailman/listinfo/python-list

Re: Unicode questions

2010-10-25 Thread Steve Holden
On 10/25/2010 2:36 PM, Terry Reedy wrote: > On 10/25/2010 2:33 AM, Steve Holden wrote: >> On 10/25/2010 1:42 AM, Lawrence D'Oliveiro wrote: >>> In message, Petite >>> Abeille wrote: >>> Characters vs. Bytes >>> >>> And why do certain people insist on referring to bytes as “octets”? >> >> Becau

Re: Unicode questions

2010-10-25 Thread Terry Reedy
On 10/25/2010 2:33 AM, Steve Holden wrote: > On 10/25/2010 1:42 AM, Lawrence D'Oliveiro wrote: >> In message, Petite >> Abeille wrote: >>> Characters vs. Bytes >> >> And why do certain people insist on referring to bytes as “octets”? > > Because back in the old days bytes were of varying sizes on different arch

Re: Unicode questions

2010-10-25 Thread Seebs
On 2010-10-25, Lawrence D'Oliveiro wrote: > In message , Petite > Abeille wrote: >> Characters vs. Bytes > And why do certain people insist on referring to bytes as “octets”? One common reason is that there have been machines on which "bytes" were not 8 bits. In particular, the usage of "b

Re: Unicode questions

2010-10-24 Thread Steve Holden
On 10/25/2010 1:42 AM, Lawrence D'Oliveiro wrote: > In message , Petite > Abeille wrote: > >> Characters vs. Bytes > > And why do certain people insist on referring to bytes as “octets”? Because back in the old days bytes were of varying sizes on different architectures - indeed the DECSystem-1

Re: Unicode questions

2010-10-24 Thread Chris Rebert
On Sun, Oct 24, 2010 at 10:43 PM, Lawrence D'Oliveiro wrote: > In message , Chris Rebert > wrote: > >> There is no such thing as "plain Unicode representation". > > UCS-4 or UTF-16 probably come the closest. How do you figure that? Cheers, Chris

Re: Unicode questions

2010-10-24 Thread Lawrence D'Oliveiro
In message , Chris Rebert wrote: > There is no such thing as "plain Unicode representation". UCS-4 or UTF-16 probably come the closest.

Re: Unicode questions

2010-10-24 Thread Lawrence D'Oliveiro
In message , Petite Abeille wrote: > Characters vs. Bytes And why do certain people insist on referring to bytes as “octets”?

Re: Unicode questions

2010-10-21 Thread OdarR
On Oct 19, 9:02 pm, Tobiah wrote: > I've been reading about the Unicode today. > I'm only vaguely understanding what it is > and how it works. > ... > Thanks, > > Tobiah Hi, Some good advice: read this presentation, http://farmdev.com/talks/unicode/ It has explanations and advice for coding. Olivier

Re: Unicode questions

2010-10-20 Thread M.-A. Lemburg
Tobiah wrote: > I've been reading about the Unicode today. > I'm only vaguely understanding what it is > and how it works. > > Please correct my understanding where it is lacking. > Unicode is really just a database of character information > such as the name, unicode section, possible > numeric
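The "database of character information" idea quoted above can be poked at directly from Python. What follows is only a rough sketch, assuming Python 3 and the standard `unicodedata` module; the helper name `describe` is made up for illustration and is not something from the thread.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode Character Database properties for one character.

    `describe` is a hypothetical helper, written only for this example.
    """
    return {
        "codepoint": "U+%04X" % ord(ch),
        # Second argument is the fallback for characters without a name.
        "name": unicodedata.name(ch, "<unnamed>"),
        # Two-letter general category, e.g. "Lu" for an uppercase letter.
        "category": unicodedata.category(ch),
        # Numeric value if the character has one, otherwise None.
        "numeric": unicodedata.numeric(ch, None),
    }


info = describe("\u00bd")  # the one-half fraction character
for key, value in info.items():
    print(key, "=", value)
```

The properties are looked up by code point, which is presumably the sense in which the quoted text calls Unicode a database; none of this says anything yet about how the character is stored as bytes.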

Re: Unicode questions

2010-10-19 Thread Terry Reedy
On 10/19/2010 4:31 PM, Tobiah wrote: >> There is no such thing as "plain Unicode representation". The closest >> thing would be an abstract sequence of Unicode codepoints (ala Python's >> `unicode` type), but this is way too abstract to be used for >> sharing/interchange, because storing anything in a file o

Re: Unicode questions

2010-10-19 Thread Chris Rebert
On Tue, Oct 19, 2010 at 1:31 PM, Tobiah wrote: >> There is no such thing as "plain Unicode representation". The closest >> thing would be an abstract sequence of Unicode codepoints (ala Python's >> `unicode` type), but this is way too abstract to be used for >> sharing/interchange, because storing
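The distinction being quoted here, an abstract sequence of code points versus the bytes that actually go into a file or over a socket, might be easier to see in a few lines of code. This is a sketch under the assumption of Python 3, where `str` plays the role that the `unicode` type played in the Python 2 of this thread; the sample string is arbitrary.

```python
# An abstract sequence of code points (Python 3 `str`).
text = "na\u00efve \u20ac"  # n, a, i-with-diaeresis, v, e, space, euro sign

# The code points themselves are just integers; no byte layout is implied.
codepoints = [ord(c) for c in text]

# To store or send the text, an encoding has to be chosen explicitly.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

print(len(text), "code points")
print(len(utf8_bytes), "bytes as UTF-8")
print(len(utf16_bytes), "bytes as UTF-16-LE")

# The receiver must know which encoding was used to get the text back.
assert utf8_bytes.decode("utf-8") == text
assert utf16_bytes.decode("utf-16-le") == text
```

The same seven code points come out as different byte strings depending on the encoding, which is why "just share the Unicode" does not pin down what ends up on disk.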

Re: Unicode questions

2010-10-19 Thread Petite Abeille
On Oct 19, 2010, at 10:31 PM, Tobiah wrote: > So why so many encoding schemes? http://en.wikipedia.org/wiki/Space-time_tradeoff
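To put some numbers on the space side of that tradeoff, here is a small illustrative sketch (Python 3 assumed; the sample strings and the helper name `encoded_sizes` are invented for this example) that measures how many bytes the same text needs under three common encodings.

```python
def encoded_sizes(s):
    """Map encoding name -> number of bytes needed to store `s`.

    Hypothetical helper for illustration. The "-le" variants are used
    so that no byte order mark is added to the count.
    """
    return {enc: len(s.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


samples = {
    "ascii": "hello",
    "greek": "\u03b3\u03b5\u03b9\u03b1",  # four Greek letters
    "emoji": "\U0001F600",                # one code point outside the BMP
}

for label, s in samples.items():
    print(label, encoded_sizes(s))
```

UTF-8 is the most compact for ASCII-heavy text, while UTF-32 spends four bytes on every code point in exchange for a fixed width. Which one wins depends on the text and on what you want to do with it.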

Re: Unicode questions

2010-10-19 Thread Tobiah
> There is no such thing as "plain Unicode representation". The closest > thing would be an abstract sequence of Unicode codepoints (ala Python's > `unicode` type), but this is way too abstract to be used for > sharing/interchange, because storing anything in a file or sending it > over a network u

Re: Unicode questions

2010-10-19 Thread Chris Rebert
On Tue, Oct 19, 2010 at 12:02 PM, Tobiah wrote: > I've been reading about the Unicode today. > I'm only vaguely understanding what it is > and how it works. Petite Abeille already pointed to Joel's excellent primer on the subject; I can only second their endorsement of his article. > Please corr

Re: Unicode questions

2010-10-19 Thread Hrvoje Niksic
Tobiah writes: > would be shared? Why can't we just say "unicode is unicode" > and just share files the way ASCII users do. Just have a huge > ASCII style table that everyone sticks to. I'm not sure that I understand you correctly, but UCS-2 and UCS-4 encodings are that kind of thing. Many pe
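The "huge ASCII style table" reading of UCS-4 can be sketched as follows. This is a hedged example, assuming Python 3 and using the `utf-32-le` codec as a stand-in for UCS-4 and `utf-16-le` as the closest available relative of UCS-2 (Python ships no codec named UCS-2); the helper `nth_codepoint_utf32` is hypothetical.

```python
import struct


def nth_codepoint_utf32(data, n):
    """Read the n-th code point from UTF-32-LE bytes by direct offset.

    Works because every code point occupies exactly four bytes.
    Hypothetical helper, for illustration only.
    """
    (cp,) = struct.unpack_from("<I", data, 4 * n)
    return cp


text = "a\u20ac\U0001F600"  # 'a', euro sign, one character outside the BMP
data = text.encode("utf-32-le")

# Fixed width: the third character sits at byte offset 8, no scanning needed.
print(hex(nth_codepoint_utf32(data, 2)))

# With 16-bit units the picture changes: a BMP character takes one unit,
# but a character outside the BMP needs two (a surrogate pair in UTF-16),
# which a strictly fixed-width 2-byte table cannot express.
print(len("\u20ac".encode("utf-16-le")), "bytes for the euro sign")
print(len("\U0001F600".encode("utf-16-le")), "bytes for the non-BMP character")
```

So a fixed-width table does exist in the four-byte form, at the cost of quadrupling the size of plain ASCII text.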

Re: Unicode questions

2010-10-19 Thread Petite Abeille
On Oct 19, 2010, at 9:02 PM, Tobiah wrote: > Please enlighten my vague and probably ill-formed conception of this whole > thing. Hmmm... is there a question hidden somewhere in there or is it more open ended in nature? :) In the meantime... The Absolute Minimum Every Software Developer Absol