>kent37 at tds.net wrote:
>>mjekl at iol.pt wrote:
>> Hi,
>>
>> My interpreter is set via sitecustomize.py to use utf-8 as default encoding.
>>
>> I'm reading fields from a dbf table to a firebird db with encoding set to
>> win1252.
>> I guess its original encoding is cp850, but am not sure, and have been
>> addressing exceptions one by one with lines of:
>>
>> r = r.replace(u'offending_code', u'ok_code')
>>
>Why don't you just convert from cp850 to cp1252 directly? Python
>supports both encodings, it's as simple as
>some_string.decode('cp850').encode('cp1252')
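A minimal sketch of the direct conversion suggested above, assuming the raw field bytes really are cp850 (the original poster was not sure of that). It is written for Python 3, where bytes and text are separate types; the helper name `cp850_to_cp1252` and the sample values are hypothetical. It also shows one possible reason why encoding to 'cp1252' can fail where 'utf-8' does not: cp850 contains characters, such as box-drawing lines, that cp1252 cannot represent, while utf-8 can encode any character.

```python
def cp850_to_cp1252(raw, errors="strict"):
    """Transcode cp850-encoded bytes to cp1252-encoded bytes."""
    text = raw.decode("cp850")             # bytes -> text
    return text.encode("cp1252", errors)   # text -> bytes


# Stand-in for a raw field value: "Coração" encoded as cp850.
sample = "Cora\xe7\xe3o".encode("cp850")
converted = cp850_to_cp1252(sample)

# Byte 0xC4 is a box-drawing character (U+2500) in cp850.
# cp1252 has no such character, so a strict encode raises.
box = b"\xc4"
try:
    cp850_to_cp1252(box)
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With errors="replace" the unmappable character becomes "?".
lenient = cp850_to_cp1252(box, errors="replace")

# utf-8 can encode every character, so this never raises.
as_utf8 = box.decode("cp850").encode("utf-8")
```

Passing `errors="replace"` (or `"ignore"`) trades silent data loss for not raising, so it is probably only appropriate once the unmappable characters are known to be unimportant.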
In the meantime, somewhat accidentally (after reading some stuff), I followed a similar approach.

[...] ORIGINAL POST DELETED HERE [...]

>My guess is a coding error on your part, otherwise something would have
>changed...can you show some context in import_pcfcli.py?

I also expect it's my error ;-( ;-)
The following snippet of my present code isn't giving me any problems, although I'm not really sure why it works. Also, I had some problems encoding to 'cp1252' but not to 'utf-8'.
Does anyone have a pointer to a nice resource that can help me understand this decode / encode biz better?

    try:
        # TODO: Check the str.translate() method
        r = recordSet.Fields(fieldsDict[f]).Value.strip()
        # NOTE: decode() returns a new object; the result is discarded here
        r.decode('cp850')
        r = r.replace(u'\x8f', u'')
        r = r.replace(u'\u20ac', u'\xc7')
        r = r.replace(u'\xa6', u'\xaa')
        r = r.replace(u'\u2122', u'\xd5')
        r = r.replace(u'\u017d', u'\xc3')
        r = r.replace(u'\xa7', u'\xba')
        # The following line does not work with 'cp1252' !?
        rec.append(r.encode('utf-8'))  # kinterbasdb makes conversions by itself ;-)
    except (UnicodeDecodeError, UnicodeEncodeError):
        # Parentheses are needed to catch both exception types
        print f
        return None

Txs,
Miguel
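Regarding the TODO about translate(): the chain of replace() calls in the snippet above could probably be collapsed into a single lookup table. The sketch below is written for Python 3, where str.translate() accepts a dict keyed by code point (Python 2's unicode.translate() accepts the same kind of dict). The names `FIXUPS` and `fix_text` are hypothetical; the mapping itself is copied from the replace() calls.

```python
# Code point -> replacement code point (None deletes the character).
FIXUPS = {
    0x008F: None,    # u'\x8f'   -> removed
    0x20AC: 0x00C7,  # u'\u20ac' -> u'\xc7'
    0x00A6: 0x00AA,  # u'\xa6'   -> u'\xaa'
    0x2122: 0x00D5,  # u'\u2122' -> u'\xd5'
    0x017D: 0x00C3,  # u'\u017d' -> u'\xc3'
    0x00A7: 0x00BA,  # u'\xa7'   -> u'\xba'
}


def fix_text(text):
    """Apply all character fix-ups in a single pass over the text."""
    return text.translate(FIXUPS)


def fix_text_with_replace(text):
    """The same fix-ups written as chained replace() calls, for comparison."""
    for src, dst in FIXUPS.items():
        text = text.replace(chr(src), "" if dst is None else chr(dst))
    return text


sample = "A\u20acAO \x8f1\xa7 \u2122ES"
fixed = fix_text(sample)
```

The two approaches agree here because no replacement character is itself a key in the table; if that ever changed, the chained replace() version would depend on the order of the calls, while translate() would not.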