On Feb  2 21:12, Dennis Heimbigner wrote:
> I am using 64bit.
> And it has nothing to do misreading characters.
>
> The ^X is described in this document:
> https://www.cygwin.com/cygwin-ug-net/using-specialnames.html,
>
> There you will see this text:
>
> "If you don't want or can't use UTF-8 as character set for
> whatever reason, you will nevertheless be able to access the
> file. How does that work? When Cygwin converts the filename from
> UTF-16 to your character set, it recognizes characters which
> can't be converted. If that occurs, Cygwin replaces the
> non-convertible character with a special character sequence. The
> sequence starts with an ASCII CAN character (hex code 0x18,
> equivalent Control-X), followed by the UTF-8 representation of
> the character. The result is a filename containing some ugly
> looking characters. While it doesn't look nice, it is nice,
> because Cygwin knows how to convert this filename back to
> UTF-16. The filename will be converted using your usual
> character set. However, when Cygwin recognizes an ASCII CAN
> character, it skips over the ASCII CAN and handles the following
> bytes as a UTF-8 character. Thus, the filename is symmetrically
> converted back to UTF-16 and you can access the file."
>
> There is no obvious good reason to continue this convention.
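
For illustration, the scheme described in the quoted documentation can be
sketched roughly as below.  This is not Cygwin's actual code, only a minimal
Python model of the idea; the function names to_charset() and from_charset()
are made up for this example, and it assumes a single-byte, ASCII-compatible
target charset such as ISO-8859-1.

```python
CAN = 0x18  # ASCII CAN, i.e. Control-X


def to_charset(name: str, charset: str) -> bytes:
    """Convert a filename to `charset`; escape what cannot be converted.

    Hypothetical helper: each non-convertible character becomes
    CAN followed by that character's UTF-8 bytes.
    """
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            out.append(CAN)
            out += ch.encode("utf-8")
    return bytes(out)


def from_charset(data: bytes, charset: str) -> str:
    """Reverse to_charset(): bytes after a CAN are read as one UTF-8 char.

    Assumes `charset` is single-byte, so every unescaped byte is
    exactly one character.
    """
    chars = []
    i = 0
    while i < len(data):
        if data[i] == CAN and i + 1 < len(data):
            lead = data[i + 1]
            # The UTF-8 lead byte tells how long the sequence is.
            if lead >= 0xF0:
                n = 4
            elif lead >= 0xE0:
                n = 3
            elif lead >= 0xC0:
                n = 2
            else:
                n = 1
            chars.append(data[i + 1:i + 1 + n].decode("utf-8"))
            i += 1 + n
        else:
            chars.append(data[i:i + 1].decode(charset))
            i += 1
    return "".join(chars)
```

With ISO-8859-1 as the charset, a name containing a Euro sign (U+20AC, not
part of ISO-8859-1) comes out with a 0x18 byte followed by the three UTF-8
bytes of that character, and from_charset() turns it back into the original
name.  With a UTF-8 locale nothing is ever non-convertible, so no CAN shows
up at all.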

You're probably using a non-UTF-8 locale, e.g., LANG=en_US, which uses
ISO-8859-1 as its charset.  See the output of `locale -av' to learn which
charset your locale uses.

Either way, converting UTF-16 filenames to a non-UTF charset is not
lossless, because some characters have no representation in the target
charset.  That's what the ASCII CAN sequence is for.  If you want to avoid
it, use a UTF-8 locale, e.g., en_US.UTF-8.


Corinna