[Tutor] questions on encoding

Albert-Jan Roskam Wed, 20 Jul 2011 11:52:51 -0700

Hi,

I am looking for test data with accented and multibyte characters. I have found 
a good resource that I could use to cobble something together 
(http://www.inter-locale.com/whitepaper/learn/learn-to-test.html), but I was 
hoping somebody knows of a ready-made resource.


I also have some questions about encoding. In the code below, is there a 
difference between unicode(s, "utf-8") and s.decode("utf-8")?
s = "§ÇÇ¼ÍÍ"
x = unicode(s, "utf-8")
y = s.decode("utf-8")
x == y # returns True

Also, is it, at least theoretically, possible to mix different encodings in 
byte strings? I'd say no, unless there are multiple BOMs or so. Not that I'd 
like to try this, but it'd improve my understanding of this sort of obscure 
topic.
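
To make the question concrete, here is a minimal sketch of what such mixing
might look like. The two sample strings and the choice of UTF-8 and Latin-1 are
assumptions made purely for illustration; the sketch only shows how the codecs
react and is not meant as a recommendation.

```python
# Hypothetical sample text; any accented strings would do.
text_a = u"caf\u00e9"
text_b = u"na\u00efve"

# A byte string is just a sequence of bytes, so the output of two
# different codecs can be concatenated without any complaint.
mixed = text_a.encode("utf-8") + text_b.encode("latin-1")

# Decoding the whole byte string as UTF-8 trips over the Latin-1 part.
try:
    mixed.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# Latin-1 assigns a character to every byte value, so this decode cannot
# fail, but the UTF-8 part comes out garbled.
as_latin1 = mixed.decode("latin-1")

print(decodes_as_utf8)
print(repr(as_latin1))
```

In this sketch the bytes themselves carry no marker of which encoding produced
them, so a single decode call has to treat the whole string one way or the
other.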


Cheers!!

Albert-Jan



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
