On Mon, May 02, 2011 at 08:41:23AM -0400, Greg Wooledge wrote:
> On Sun, May 01, 2011 at 09:17:49PM -0500, Jonathan Nieder wrote:
> > Hi,
> >
> > ri...@inf.ufpr.br wrote:
> >
> > > When running "echo [A-Z]*" , it shows all files/dirs of current
> > > directory, not only those starting with capital letters. I tried
> > > different locales such as: POSIX, C, en_US, pt_BR
> > >
> > > Repeat-By:
> > > $ mkdir a && cd a
> > > $ touch a b c; mkdir D E F
> > > $ echo [A-Z]*
> > > b c D E F
> > > $ echo [a-z]*
> > > a b c D E F
> >
> > See http://bugs.debian.org/301717 (“fnmatch("[a-z]", ...) matches
> > capital letters in most locales”) for some details.
>
> See also http://mywiki.wooledge.org/locale

Thanks for the explanations; now I understand what is happening.

> > I'm puzzled by your comment on trying different locales, though:
> > I tried
> >
> >     mkdir a && cd a
> >     touch a b c; mkdir D E F
> >     echo [A-Z]*
> >
> > and got output
> >
> >     b c D E F
> >
> > as expected. Then I tried
> >
> >     LANG=C
> >     export LANG
> >     echo [A-Z]*
> >
> > and got output
> >
> >     D E F
> >
> > Does your experience differ? I'm using 4.1.5(1)-release fwiw.
>
> Presumably, "ribas" did not correctly set the locale variables during
> his or her testing.

Indeed, I did not export the variable; I just ran "LANG=C echo [A-Z]*".
Exporting works.

> > > No Fix yet, looking on the source code.
>
> There's nothing to fix. This is in the realm of a new feature request.
>
> > In the long run, a good fix might be to teach fnmatch a new
> > FNM_STRICTCASE flag and optionally use it.
>
> If by "strict case" you mean "force POSIX locale" or "force US-ASCII
> ordering", then the option ought to be called something less confusing.
>
> > The hardest part would
> > seem to be making tables so the system can know what "this range,
> > using the same case" means.
>
> It already knows this, because it's what the POSIX (C) locale does.
>
> > A separate aspect is documentation. I imagine Chet wouldn't mind
> > a patch to bash.1 and bash.info to explain this pitfall under
> > "Pattern Matching" or even under "BUGS" (aka LIMITATIONS).
>
> This is not a bug, so it does not belong in BUGS.
>
> The first place I found in the man page that makes mention of this is
> the Pathname Expansion section. This, I agree, should be changed.
> Perhaps this would be an acceptable wording:
>
>
> --- doc/bash.1.orig Mon May 2 08:31:26 2011
> +++ doc/bash.1 Mon May 2 08:35:51 2011
> @@ -3121,8 +3121,8 @@
>  If one of these characters appears, then the word is
>  regarded as a
>  .IR pattern ,
> -and replaced with an alphabetically sorted list of
> -file names matching the pattern.
> +and replaced with a list of file names matching the pattern,
> +sorted alphabetically by the current locale's collating sequence.
>  If no matching file names are found,
>  and the shell option
>  .B nullglob
>
>
> Under Pattern Matching, there is already an explanation of how it uses the
> LC_COLLATE variable, the current locale, etc. It's all there. In fact,
> since Pattern Matching is a subsection of Pathname Expansion, one could
> argue that my patch is redundant, but since the pathname expansion stuff
> appears first, someone may stop reading before encountering the more
> verbose description, so IMHO it doesn't hurt to correct the introduction.

-- 
Bruno Ribas - ri...@inf.ufpr.br
http://www.inf.ufpr.br/ribas
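
For anyone who trips over the same thing, here is a minimal sketch of the
export point from this thread. It assumes a POSIX-like shell with mktemp
available; the helper name upper_matches and the scratch directory are
made up for illustration. A prefix assignment such as "LANG=C echo [A-Z]*"
only puts LANG into echo's environment, while the glob is expanded by the
shell itself under its own locale. The sketch sets and exports the locale
in the shell before the glob is expanded. It uses LC_ALL rather than LANG
(an assumption on my part) because LC_ALL overrides any LC_COLLATE or LANG
already set in the environment.

```shell
# Hypothetical helper: recreate the test case from the report in a scratch
# directory and expand [A-Z]* with the shell itself running in the C locale.
upper_matches() (
    # The ( ... ) body runs in a subshell, so the locale change and the cd
    # do not leak into the calling shell.
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch a b c
    mkdir D E F

    # Set the locale in the shell before the glob is expanded.
    # LC_ALL is used instead of LANG so that inherited LC_* values
    # cannot override it (assumption: C locale ordering is what is wanted).
    LC_ALL=C
    export LC_ALL

    echo [A-Z]*    # prints: D E F

    # Remove the scratch directory again.
    cd / && rm -rf "$dir"
)

upper_matches
```

In a locale such as en_US the same pattern may also match lowercase names,
as in the original report, so the result without the LC_ALL lines depends
on the system's locale data and shell version.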