On Thu, Jun 18, 2009 at 4:37 PM, Serdar Tumgoren <zstumgo...@gmail.com> wrote:
> On the above link, the section on "Encoding Unicode Byte Streams" has
> the following example:
>
> >>> u = u"abc\u2013"
> >>> print u
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in
> position 3: ordinal not in range(128)
> >>> print u.encode("utf-8")
> abc–
>
> But when I try the same example on my Windows XP machine (with Python
> 2.5.4), I can't get the same results. Instead, it spits out the below
> (hopefully it renders properly and we don't have encoding issues!!!):
>
> $ python
> Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)] on
> win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = u"abc\u2013"
> >>> print x
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Program Files\Python25\lib\encodings\cp437.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2013' in position
> 3: character maps to <undefined>
> >>> x.encode("utf-8")
> 'abc\xe2\x80\x93'
> >>> print x.encode("utf-8")
> abcΓÇô

The example is written assuming the console encoding is utf-8. Yours
seems to be cp437, so the three UTF-8 bytes of the dash are displayed
as three separate cp437 characters. Try this:

C:\Project\Mango> py
Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.

In [1]: import sys

In [2]: sys.stdout.encoding
Out[2]: 'cp437'

But there is another problem - \u2013 is an en dash, which does not
appear in cp437, so even giving the correct encoding doesn't work. Try
this, which uses a character that cp437 does contain:

In [6]: x = u"abc\u2591"

In [7]: print x.encode('cp437')
------> print(x.encode('cp437'))
abc░

> In a related test, I was unable to change the default character encoding
> for the python interpreter from ascii to utf-8. In all cases (cygwin,
> Wing IDE, windows command line), the interpreter reported that my
> "sys" module does not contain the "setdefaultencoding" method (even
> though this should be part of the module from versions 2.x and above).

sys.setdefaultencoding is deleted by site.py on Python startup. You
have to set the default encoding from within a sitecustomize.py module.
But it's usually better to get a correct understanding of what is going
on and to leave the default encoding alone.

Kent
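
P.S. To make the cp437 behaviour above a bit more concrete, here is a minimal sketch, not a definitive recipe. The helper names (console_encoding, can_encode, shown_as) are made up for illustration and are not part of any library. It only uses escapes and ASCII output, so it should run the same whatever your console encoding is.

```python
import sys

# All three helper names below are hypothetical, chosen for this sketch.


def console_encoding(default="ascii"):
    """Return the encoding of sys.stdout, or `default` when it is unknown."""
    # sys.stdout.encoding may be None or missing, e.g. when output is redirected.
    return getattr(sys.stdout, "encoding", None) or default


def can_encode(text, encoding):
    """Report whether every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def shown_as(data, encoding):
    """Decode raw bytes the way a console using `encoding` would display them."""
    return data.decode(encoding)


en_dash = u"abc\u2013"      # EN DASH: has no slot in cp437
light_shade = u"abc\u2591"  # LIGHT SHADE: byte 0xB0 in cp437

print(console_encoding())                # whatever your console reports
print(can_encode(en_dash, "cp437"))      # False
print(can_encode(light_shade, "cp437"))  # True

# With errors="replace" the unmappable dash becomes '?' instead of raising.
print(en_dash.encode("cp437", "replace") == b"abc?")  # True

# UTF-8 bytes read back as cp437 give the three odd characters from the post.
mojibake = shown_as(en_dash.encode("utf-8"), "cp437")
print(mojibake == u"abc\u0393\u00c7\u00f4")  # True
```

Checking can_encode() against sys.stdout.encoding before printing is one way to avoid the traceback; passing "replace" to encode() is another, at the cost of losing the character.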