Amaury Forgeot d'Arc added the comment:
The string '\xe7\x8e\xb0' is the UTF-8 encoded form of u'现' (= u'\u73b0').
But your Windows system uses the cp936 code page to encode file names.
'\xe7\x8e\xb0' is invalid in this code page: the first two bytes form one
double-byte character, but the last byte ('\xb0') is an incomplete multibyte
sequence, and it is dropped by Windows when converting to a Unicode file name.
The automatic conversion functions in Windows work similarly to this Python
code (note the 'ignore' parameter):
>>> '\xe7\x8e\xb0'.decode('cp936', 'ignore').encode('cp936')
'\xe7\x8e'
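
For readers on Python 3, the same round trip can be sketched with a bytes
literal. This is only an illustration of the behaviour described above, not
what Windows literally runs; the variable names are made up for the example,
and it assumes Python's cp936 codec (an alias for GBK) is a fair stand-in for
the Windows code page.

```python
# The three bytes from the report: UTF-8 for U+73B0.
raw = b'\xe7\x8e\xb0'

# Read as UTF-8, the bytes are one complete character.
as_utf8 = raw.decode('utf-8')

# Read as cp936 in strict mode, the trailing lead byte has no partner,
# so decoding fails.
try:
    raw.decode('cp936')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With 'ignore', the incomplete trailing byte is silently dropped,
# which mirrors the lossy conversion described above.
round_trip = raw.decode('cp936', 'ignore').encode('cp936')

print(repr(as_utf8), strict_failed, round_trip)
```

The first two bytes survive because they happen to form a valid cp936
character, so the resulting name is a different (and shorter) one.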
'\xe7\x8e\xb0' is an invalid file name on your platform. You should either:
- encode byte-string file names as cp936 in your application, or
- much better, use unicode file names everywhere:
>>> os.path.abspath('\xe7\x8e\xb0'.decode('utf-8'))
will return the expected result.
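
Both options can be sketched in Python 3 syntax as follows. The variable names
are hypothetical and the sketch assumes the name arrives as UTF-8 bytes, as in
the report; the point is to decode once, at the boundary, and to pass text to
os.path from then on.

```python
import os

# Assumption: the file name reaches the application as UTF-8 bytes.
raw_name = b'\xe7\x8e\xb0'

# Preferred option: decode to text first, then use text file names everywhere.
text_name = raw_name.decode('utf-8')
full_path = os.path.abspath(text_name)

# Other option: if a byte string is really needed on a cp936 system,
# encode the text with that code page instead of UTF-8.
cp936_name = text_name.encode('cp936')

print(repr(full_path), cp936_name)
```

With a text file name the result no longer depends on the system code page,
so the final path component stays u'\u73b0'.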
Python 3 will emit a warning when os.path.abspath() is called with a bytes
string.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue17320>
_______________________________________