On Thu, Nov 21, 2013 at 12:04:19PM -0800, Albert-Jan Roskam wrote:
> Hi,
> 
> Today I had a csv file in utf-8 encoding, but part of the accented 
> characters were mangled. The data were scraped from a website and it 
> turned out that at least some of the data were mangled on the website 
> already. Bits of the text were actually cp1252 (or cp850), I think, 
> even though the webpage was in utf-8. Is there any package that helps 
> to correct such issues?

Python has superpowers :-)

http://blog.luminoso.com/2012/08/20/fix-unicode-mistakes-with-python/
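
For the common case where UTF-8 bytes were wrongly decoded as cp1252, the
repair is a round trip: encode the mangled text back to the wrong codec,
then decode the resulting bytes as UTF-8. Here is a rough sketch, assuming
Python 3 and assuming cp1252 really is the culprit. The name fix_mojibake
is made up for illustration and is not taken from the blog post.

```python
def fix_mojibake(text, wrong_encoding="cp1252"):
    """Try to undo UTF-8 text that was mistakenly decoded as wrong_encoding.

    Hypothetical helper for illustration. If the round trip fails, the
    text was probably not mangled this way, so it is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them the way they
        # should have been decoded in the first place.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

Since only part of your data is mangled, you would apply this field by
field rather than to the whole file at once: text that is already correct
usually fails the round trip and is left alone. It won't catch everything
(cp1252 leaves a few bytes undefined, and cp850 damage needs a different
wrong_encoding), which is where a dedicated package such as ftfy, with its
fix_text function, should do a better job.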



-- 
Steven
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
