> > > > $ echo $LC_ALL
> > > > en_US
> > >
> > > Hang on, where did that come from?

It was in my environment. My apologies for being dense.

> > I unset LC_ALL and...
>
> Where?

I unset LC_ALL in bash, which was the wrong place.

> > Now ls foo<tab> adds the actual accented character to
> > the command line, but when I press return I get:
> >
> > ls: cannot access foo<a gray box>: No such file or directory

And of course this works now. Sorry for the trouble.

> > I still get the right answer from test -f, when using
> > the shell builtin. /usr/bin/test tells me the file
> > doesn't exist.
>
> .. and that.

As does this, as long as I use the same encoding I used to originally
create the file, which is totally fine.

> > > The \x18 scheme is only used for codepoints that can
> > > not be represented in the selected character set, yet
> > > U+00E9 can be represented in CP1252. By definition, any
> > > Unicode codepoint can be represented in UTF-8, so the
> > > \x18 scheme is never used when that is selected.
> > >
> > > To enable C-style backslash interpretation, you need
> > > to use $'...' quoting.
> >
> > I now see the bash man page explains this. Must have
> > missed it the first time. The above paragraphs with
> > some examples (where \x18 is needed and where it isn't)
> > added to
> > http://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-unusual
> > would have gotten me farther before posting.
>
> But what I said is explained there already:

I suppose, but the point about \x18 not working with a character set
that represents the desired codepoint wasn't clear. Nor was the bash
syntax for using \x in general. It's in the bash man page and not
Cygwin-specific, but an example showing the gory details would have
helped me at least.

> > And finally here are the steps that illustrate what's going on.
> >
> > $ touch $'\x18'; echo $?
> > 0
> >
> > ls shows a file named up-arrow (0x18):
>
> What do you mean by up-arrow? I'm getting a question mark, because
> that's what ls prints for non-printable characters by default.
> You can choose various quoting styles using the --quoting-style
> option.

I mean the up-arrow that ls prints with --show-control-chars. Another
important omission on my part. Doh!

> Yep, but that's a bash vs ls issue rather than a Cygwin
> one. You'd get the same on Linux. But if you use control
> characters in filenames, you better know what you're doing
> anyway. Some argue that it shouldn't be allowed in the
> first place, e.g.
> http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html

Thanks for the link. I don't typically use control characters in
filenames. Just an example.

> > $ mkshortcut -n shortcut$'\xC3\xA9' plain; echo $?
> > $ readshortcut shortcut$'\xE9'
>
> I'm afraid these aren't yet Unicode-ready, i.e. they still use Windows
> "ANSI" APIs.

Guess it's time to roll up my sleeves and write a patch.

-DB
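
For the archives, here is roughly the kind of example I had in mind for
the $'...' point quoted above. It is only a sketch, assuming bash (for
$'...') and od from coreutils; the helper function and variable names
are made up for illustration. It dumps the bytes bash actually hands to
a command, which shows that \xHH escapes are expanded inside $'...' but
left alone inside plain single quotes:

```shell
#!/bin/bash
# Assumes bash (for $'...') and od; names here are illustrative only.

# Print the bytes of the first argument as one plain hex string.
hexdump_arg() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Inside $'...', bash expands \xHH escapes to raw bytes.
hex_utf8=$(hexdump_arg $'caf\xC3\xA9')   # U+00E9 encoded as UTF-8
hex_ctrl=$(hexdump_arg $'\x18')          # the 0x18 control character

# Inside plain '...', the backslashes stay literal.
hex_literal=$(hexdump_arg 'caf\xC3\xA9')

echo "utf8:    $hex_utf8"
echo "ctrl:    $hex_ctrl"
echo "literal: $hex_literal"
```

The first line of output should end in c3a9 and the second should be
just 18, while the third spells out the backslash, x and hex digits as
ordinary ASCII bytes. Whether a filename built from those bytes then
displays correctly still depends on the character set selected through
the locale, as discussed above.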