Hi, I have data that is either floats or UTF-8 encoded byte strings, and I need to convert both to unicode strings. I am probably missing something simple, but in the code below, under "float", why does [B] raise an error while [A] does not?
# Python 2.7.3 (default, Feb 27 2014, 19:39:10) [GCC 4.7.2] on linux2
>>> help(unicode)
Help on class unicode in module __builtin__:

class unicode(basestring)
 |  unicode(string [, encoding[, errors]]) -> object
 |
 |  Create a new Unicode object from the given encoded string.
 |  encoding defaults to the **current default string encoding**.
 |  errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.
# ...
>>> import sys
>>> sys.getdefaultencoding()
'ascii'

# float: cannot explicitly give encoding, even if it's the default
>>> value = 1.0
>>> unicode(value)  # [A]
u'1.0'
>>> unicode(value, sys.getdefaultencoding())  # [B]
Traceback (most recent call last):
  File "<pyshell#22>", line 1, in <module>
    unicode(value, sys.getdefaultencoding())
TypeError: coercing to Unicode: need string or buffer, float found
>>> unicode(value, "utf-8")
# (... also TypeError)

# byte string: must explicitly give encoding (which makes perfect sense)
>>> value = '\xc3\xa9'
>>> unicode(value)
Traceback (most recent call last):
  File "<pyshell#31>", line 1, in <module>
    unicode(value)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> unicode(value, "utf-8")
u'\xe9'

Thank you!

Regards,
Albert-Jan
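
A note on the session above, offered as a sketch rather than a definitive answer. The Python 2 documentation describes unicode(obj) with no encoding argument as behaving like str(obj) but returning a unicode string, whereas passing an encoding asks it to decode obj, which only works for a byte string or buffer. That would be consistent with [A] succeeding and [B] raising TypeError. Under that reading, one way to handle both kinds of value is to decode only the byte strings and convert everything else without an encoding. The names to_unicode and text_type are hypothetical, and UTF-8 is assumed as the input encoding because the question says so.

```python
import sys

# Assumption: on Python 3 the built-in `unicode` no longer exists and `str`
# plays its role; this alias only lets the sketch run on either version.
text_type = unicode if sys.version_info[0] == 2 else str


def to_unicode(value, encoding="utf-8"):
    """Return `value` as a unicode string (hypothetical helper)."""
    if isinstance(value, text_type):
        # Already unicode: nothing to do.
        return value
    if isinstance(value, bytes):
        # Byte strings must be decoded with an explicit encoding.
        # (On Python 2, `bytes` is an alias for `str`.)
        return value.decode(encoding)
    # Floats and other objects: convert without an encoding argument,
    # as in case [A] of the session.
    return text_type(value)


float_text = to_unicode(1.0)          # takes the "no encoding" path
byte_text = to_unicode(b'\xc3\xa9')   # takes the decode path
```

With this approach the encoding is never passed for a float, so the TypeError from case [B] does not arise, and byte strings are never decoded with the 'ascii' default.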