On Tue, Jul 8, 2008 at 5:19 PM, Robert Johansson <[EMAIL PROTECTED]> wrote:
> Hi, I'm puzzled by the character encodings which I get when I use Python
> with IDLE. The string '\xf6' represents a letter in the Swedish alphabet
> when coded with utf8. On our computer with MacOSX this gets coded as
> '\xc3\xb6' which is a string of length 2. I have configured IDLE to encode
> utf8 but it doesn't make any difference.

I think you may be a bit confused about UTF-8. '\xf6' is not the UTF-8 encoding of anything; on its own, that byte is not valid UTF-8. U+00F6 is the Unicode (not UTF-8) code point for LATIN SMALL LETTER O WITH DIAERESIS. '\xf6' is also the Latin-1 encoding of this character. The UTF-8 encoding of this character is the two-byte sequence '\xc3\xb6', which is why you see a string of length 2.

Can you give some more specific details about what you do and what you see?

Also, you might want to do some background reading on Unicode; this is a good place to start:
http://www.joelonsoftware.com/articles/Unicode.html

Kent
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
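
To make the difference between the code point and its encodings concrete, here is a small sketch. It assumes Python 3, where str holds Unicode text and bytes holds encoded data; in the Python 2 of this thread the text literal would be u'\xf6' instead. The variable names are only for illustration.

```python
import unicodedata

# One Unicode character: code point U+00F6.
ch = "\u00f6"
print(unicodedata.name(ch))  # LATIN SMALL LETTER O WITH DIAERESIS

# The same character under two different encodings.
latin1_bytes = ch.encode("latin-1")
utf8_bytes = ch.encode("utf-8")

print(latin1_bytes)  # b'\xf6'      (one byte)
print(utf8_bytes)    # b'\xc3\xb6'  (two bytes)
print(len(ch), len(latin1_bytes), len(utf8_bytes))  # 1 1 2

# Decoding with the matching codec gives back the original character.
assert latin1_bytes.decode("latin-1") == ch
assert utf8_bytes.decode("utf-8") == ch

# The lone byte 0xF6 is not valid UTF-8, so decoding it fails.
try:
    b"\xf6".decode("utf-8")
    lone_byte_is_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_utf8 = False
print(lone_byte_is_utf8)  # False
```

So seeing '\xc3\xb6' with length 2 is what you would expect whenever the character has been encoded as UTF-8 and you are looking at the bytes, not the text.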