On Thu, Nov 21, 2013 at 12:04:19PM -0800, Albert-Jan Roskam wrote:
> Hi,
>
> Today I had a csv file in utf-8 encoding, but part of the accented
> characters were mangled. The data were scraped from a website and it
> turned out that at least some of the data were mangled on the website
> already. Bits of the text were actually cp1252 (or cp850), I think,
> even though the webpage was in utf-8. Is there any package that helps
> to correct such issues?

Python has superpowers :-)

http://blog.luminoso.com/2012/08/20/fix-unicode-mistakes-with-python/

--
Steven
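
For anyone wondering what such a repair can look like, here is a minimal sketch using only the standard library. It is not the code from the linked post. It assumes the mangling is the common kind where UTF-8 bytes were decoded as cp1252 (so "é" shows up as "Ã©"), and the helper names fix_mojibake and fix_mixed_text are made up for illustration. Because only part of the data is mangled, the second helper repairs the text one space-separated token at a time and leaves anything that does not round-trip cleanly alone.

```python
def fix_mojibake(text, wrong_encoding="cp1252"):
    """Undo one round of UTF-8 bytes having been decoded as wrong_encoding.

    Hypothetical helper: assumes the damage is exactly that kind of
    mis-decoding. Text that does not round-trip is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a clean round trip, so the text was probably fine already.
        return text


def fix_mixed_text(text, wrong_encoding="cp1252"):
    """Repair a string in which only some tokens are mangled.

    Splits on the plain space character only, so a no-break space that
    is part of a mangled sequence stays attached to its token.
    """
    return " ".join(fix_mojibake(tok, wrong_encoding) for tok in text.split(" "))


if __name__ == "__main__":
    # Build a mangled sample: UTF-8 bytes wrongly decoded as cp1252.
    mangled = "café".encode("utf-8").decode("cp1252")
    print(fix_mojibake(mangled))                 # café
    print(fix_mixed_text(mangled + " crème"))    # café crème
```

This only handles a single layer of mis-decoding and one assumed wrong encoding; text that was mangled twice, or with a different code page such as cp850, would need the matching encoding passed in or a more thorough tool.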