On Mon, May 2, 2011 at 9:23 AM, Bruno Cesar Ribas <ri...@inf.ufpr.br> wrote:
> On Mon, May 02, 2011 at 08:41:23AM -0400, Greg Wooledge wrote:
> > On Sun, May 01, 2011 at 09:17:49PM -0500, Jonathan Nieder wrote:
> > > Hi,
> > >
> > > ri...@inf.ufpr.br wrote:
> > >
> > > > When running "echo [A-Z]*" , it shows all files/dirs of current
> > > > directory, not only those starting with capital letters. I tried
> > > > different locales such as: POSIX, C, en_US, pt_BR
> > > >
> > > > Repeat-By:
> > > > $ mkdir a && cd a
> > > > $ touch a b c; mkdir D E F
> > > > $ echo [A-Z]*
> > > > b c D E F
> > > > $ echo [a-z]*
> > > > a b c D E F
> > >
> > > See http://bugs.debian.org/301717 (“fnmatch("[a-z]", ...) matches
> > > capital letters in most locales”) for some details.
> >
> > See also http://mywiki.wooledge.org/locale
>
> Thanks for the explanations now I understand what is happening.
>
> >
> > > I'm puzzled by your comment on trying different locales, though:
> > > I tried
> > >
> > >     mkdir a && cd a
> > >     touch a b c; mkdir D E F
> > >     echo [A-Z]*
> > >
> > > and got output
> > >
> > >     b c D E F
> > >
> > > as expected. Then I tried
> > >
> > >     LANG=C
> > >     export LANG
> > >     echo [A-Z]*
> > >
> > > and got output
> > >
> > >     D E F
> > >
> > > Does your experience differ? I'm using 4.1.5(1)-release fwiw.
> >
> > Presumably, "ribas" did not correctly set the locale variables during
> > his or her testing.
>
> Indeed, I did not export the variable just ran like LANG=C echo [A-Z]*,
> exporting works.
>
> > >
> > > > No Fix yet, looking on the source code.
> >
> > There's nothing to fix. This is in the realm of a new feature request.
> >
> > > In the long run, a good fix might be to teach fnmatch a new
> > > FNM_STRICTCASE flag and optionally use it.
> >
> > If by "strict case" you mean "force POSIX locale" or "force US-ASCII
> > ordering", then the option ought to be called something less confusing.
> >
> > > The hardest part would
> > > seem to be making tables so the system can know what "this range,
> > > using the same case" means.
> >
> > It already knows this, because it's what the POSIX (C) locale does.
> >
> > > A separate aspect is documentation. I imagine Chet wouldn't mind
> > > a patch to bash.1 and bash.info to explain this pitfall under
> > > "Pattern Matching" or even under "BUGS" (aka LIMITATIONS).
> >
> > This is not a bug, so it does not belong in BUGS.
> >
> > The first place I found in the man page that makes mention of this is
> > the Pathname Expansion section. This, I agree, should be changed.
> > Perhaps this would an acceptable wording:
> >
> >
> > --- doc/bash.1.orig Mon May 2 08:31:26 2011
> > +++ doc/bash.1 Mon May 2 08:35:51 2011
> > @@ -3121,8 +3121,8 @@
> >  If one of these characters appears, then the word is
> >  regarded as a
> >  .IR pattern ,
> > -and replaced with an alphabetically sorted list of
> > -file names matching the pattern.
> > +and replaced with a list of file names matching the pattern,
> > +sorted alphabetically by the current locale's collating sequence.
> >  If no matching file names are found,
> >  and the shell option
> >  .B nullglob
> >
> >
> > Under Pattern Matching, there is already an explanation of how it uses the
> > LC_COLLATE variable, the current locale, etc. It's all there. In fact,
> > since Pattern Matching is a subsection of Pathname Expansion, one could
> > argue that my patch is redundant, but since the pathname expansion stuff
> > appears first, someone may stop reading before encountering the more
> > verbose description, so IMHO it doesn't hurt to correct the introduction.
>
> --
> Bruno Ribas - ri...@inf.ufpr.br
> http://www.inf.ufpr.br/ribas
>

Alternatively, just use the character classes [[:upper:]], [[:lower:]], etc.
They match on character class rather than on collation order, so
[[:upper:]]* never picks up lowercase names whatever the locale.
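For what it's worth, here is a rough sketch of the difference, reusing the files from the original report. It assumes bash and a scratch directory from mktemp; the variable names (demo_dir, range_c, class_any, prefix_form) are mine and do not come from the thread.

```shell
# Scratch directory with the same files as in the report (name is arbitrary).
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch a b c
mkdir D E F

# Range expression with the C locale set *before* the glob is expanded.
# The assignment is its own command, run in a subshell so it does not leak.
range_c=$(LC_ALL=C; export LC_ALL; echo [A-Z]*)

# Character class: uppercase letters only, independent of collation order.
class_any=$(echo [[:upper:]]*)

# Prefix form from the thread: the glob is expanded before the temporary
# assignment applies, so the result follows the surrounding locale.
prefix_form=$(LANG=C echo [A-Z]*)

echo "range, C locale:    $range_c"      # D E F
echo "[[:upper:]] class:  $class_any"    # D E F
echo "LANG=C prefix form: $prefix_form"  # depends on the ambient locale

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

The prefix form is the one that tripped up the original test: the assignment only applies to the environment of echo, and by then the shell has already expanded the pattern.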