Wolfgang Thaller <[EMAIL PROTECTED]> writes:
> In what way is ISO-2022 non-reversible? Is it possible that a ISO-2022
> file name that is converted to Unicode cannot be converted back any
> more (assuming you know for sure that it was ISO-2022 in the first
> place)?

I am no expert on ISO-2022, so the following may contain errors; please correct me if it is wrong.

ISO-2022 -> Unicode is always possible. Unicode -> ISO-2022 should also always be possible, but it is a relation, not a function: there are many (infinitely many?) ways of encoding a particular Unicode string in ISO-2022. So converting a file name to Unicode and back need not reproduce the original bytes.

ISO-2022 works by providing escape sequences to switch between different character sets, and one can use these escapes in almost any way one wishes. ISO-2022 also distinguishes between the same character in Japanese, Chinese and Korean, which Unicode does not do. See here for more info on the topic:
http://www.ecma-international.org/publications/files/ecma-st/ECMA-035.pdf

Also, trusting the system locale for everything is problematic and makes things quite unbearable for I18N. E.g. on my desktop 95% of things run with iso-8859-1, 3% of things use utf-8 and a few apps use EUC-JP...

Treating filenames as opaque blobs causes the least problems. If a program wishes to display them in a graphical environment then they have to be converted to a string, but very many apps never display the filenames...

- Einar Karttunen
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
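
To illustrate the "relation, not a function" point, here is a minimal sketch in Python (not Haskell, only because its standard library ships an `iso2022_jp` codec). It assumes that ISO-2022-JP is representative of ISO-2022 in general and that the codec accepts a redundant "switch to ASCII" escape at the start of the input; the file name is made up for the example.

```python
# Sketch: several distinct ISO-2022-JP byte strings decode to the same
# Unicode string, so decoding and re-encoding cannot recover the original
# bytes in general.

ESC_ASCII = b"\x1b(B"  # "designate ASCII to G0"; ASCII is already the initial state

name = "メモ.txt"  # hypothetical file name, chosen only for illustration
canonical = name.encode("iso2022_jp")  # the byte form Python's encoder picks

# Prepending redundant escapes gives different bytes with the same meaning.
# The range could be made arbitrarily large.
candidates = [ESC_ASCII * n + canonical for n in range(4)]

# Every candidate decodes to the same Unicode string...
decoded = {raw.decode("iso2022_jp") for raw in candidates}

# ...and re-encoding always yields the canonical form, whatever we started with.
reencoded = {raw.decode("iso2022_jp").encode("iso2022_jp") for raw in candidates}

print(len(set(candidates)), "distinct byte strings")
print(len(decoded), "distinct decoded strings")
print(len(reencoded), "distinct re-encoded byte strings")
```

A program that keeps the name as the original bytes never hits this problem; one that stores only the decoded string may be unable to reopen any file whose name was not already in the encoder's preferred form.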
