Hi,
 
I am trying to write a file with a 'foreign' unicode name (I am aware that this 
is a highly western-o-centric way of putting it). On Linux, I can encode the 
name to UTF-8 and the file name is displayed correctly. On Windows XP, the 
characters apparently cannot be represented in the file system encoding, which 
is called 'mbcs'. How can I write file names that are always encoded correctly 
on any platform? Or is this a shortcoming of Windows?
 
# Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32
import sys

 
def _encodeFileName(fn):
    """Helper function to encode unicode file names into system file names.
    http://effbot.org/pyref/sys.getfilesystemencoding.htm"""
    isWindows = sys.platform.startswith("win")
    isUnicode = isinstance(fn, unicode)
    if isUnicode:  # and not isWindows
        # 'mbcs' on Windows, 'utf-8' on Linux
        encoding = sys.getfilesystemencoding()
        encoding = "utf-8" if not encoding else encoding
        return fn.encode(encoding)
    return fn
 
fn = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40' + '.txt'   # Telugu language
with open(_encodeFileName(fn), "wb") as w:
    # the characters of the FILE NAME cannot be represented in the
    # encoding (squares/tofu)
    w.write("yaay!\n")
    print "written: ", w.name
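
One thing that might be worth trying, as a minimal sketch: skip the manual 
encoding and hand the unicode object itself to open(). As far as I understand 
(PEP 277), Python on NT-based Windows then uses the wide-character file API, so 
'mbcs' never gets involved, while on Linux the name is encoded with the file 
system encoding. The helper name write_named and the temporary directory are 
assumptions for illustration only, and I have not verified this on the XP 
machine. There is deliberately no print of the name, since the console encoding 
may not be able to show it.

```python
import os
import tempfile


def write_named(directory, name, data):
    """Write bytes to directory/name, where name is a unicode string.

    Hypothetical helper for illustration: the name is NOT encoded by hand;
    the unicode object is passed straight to open().
    """
    path = os.path.join(directory, name)
    with open(path, "wb") as w:
        w.write(data)
    return path


fn = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40' + u'.txt'   # Telugu language
target_dir = tempfile.mkdtemp()   # assumption: any writable directory will do
written = write_named(target_dir, fn, b"yaay!\n")
```

If the name is stored correctly this way, squares in the file listing might 
then only point to a missing font rather than to a wrong file name.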
 
Thank you very much in advance!

Regards,
Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 