Martin v. Löwis wrote: >> Since the serialization of the Unicode string is likely to use UTF-8, >> and the string for such a file will include half surrogates, the >> application may raise an exception when encoding the names for a >> configuration file. These encoding exceptions will be as rare as the >> unusual names (which the careful I18N aware developer has probably >> eradicated from his system), and thus will appear late. > > There are trade-offs to any solution; if there was a solution without > trade-offs, it would be implemented already. > > The Python UTF-8 codec will happily encode half-surrogates; people argue > that it is a bug that it does so, however, it would help in this > specific case.
Can we use this encoding scheme for writing into files as well? We've turned the filename with undecodable bytes into a string with half surrogates. Putting that string into a file has to turn them into bytes at some level. Can we use the python-escape error handler to achieve that somehow? -Toshio
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com