[Tutor] Is there a package to "un-mangle" characters?

Albert-Jan Roskam Thu, 21 Nov 2013 12:29:19 -0800

Hi,

Today I had a CSV file in UTF-8 encoding in which some of the accented
characters were mangled. The data were scraped from a website, and it turned
out that at least some of the data were already mangled on the website itself.
I think bits of the text were actually cp1252 (or cp850), even though the
webpage was in UTF-8. Is there any package that helps to correct such issues?
(I tried looking for one, but it doesn't help that there is such a thing as
name mangling! ;-)
This comes pretty close, though: https://gist.github.com/litchfield/1282752
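
To make the problem concrete, here is a minimal sketch of the kind of repair I mean. It assumes the mangled bits are UTF-8 bytes that were wrongly decoded as cp1252 and then saved as UTF-8 again; that is only my guess, and cp850 would need a different codec name. The function names unmangle and unmangle_mixed are made up for illustration and are not taken from the gist.

```python
import re


def unmangle(token, wrong="cp1252", right="utf-8"):
    """Undo one round of 'UTF-8 bytes decoded as cp1252' mangling.

    If the round trip fails, the token was probably fine already,
    so it is returned unchanged.
    """
    try:
        return token.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return token


def unmangle_mixed(text):
    """Repair a string in which only some words are mangled.

    Works token by token, so correctly encoded words survive.
    re.ASCII keeps the no-break space (U+00A0) inside a token,
    because it can be the second half of a mangled character.
    """
    return re.sub(r"\S+", lambda m: unmangle(m.group(0)), text, flags=re.ASCII)


# "caf\u00c3\u00a9" is what "caf\u00e9" looks like after the mix-up;
# "na\u00efve" is already correct and should be left alone.
fixed = unmangle_mixed("caf\u00c3\u00a9 na\u00efve")
```

This only undoes a single layer of mangling, and a correctly encoded word that happens to survive the round trip would be changed by mistake, so the output still needs checking by eye.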


Thanks in advance!

Regards,

Albert-Jan



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All right, but apart from the sanitation, the medicine, education, wine, public
order, irrigation, roads, a fresh water system, and public health, what have
the Romans ever done for us?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
