Re: documentation bug re character range expressions

2011-06-03 Thread Marcel (Felix) Giannelia
Is it really a programmer mistake, though, to assume that [A-Z] is only capital letters? A through Z are a contiguous range in every representation system except EBCDIC, and they are even contiguous in modern Unicode. In the world of programming, characters are numbers, and programmers know this

Re: documentation bug re character range expressions

2011-06-03 Thread Aharon Robbins
This is a thorny issue that plagues all POSIX-compliant utilities, not just Bash. (POSIX locales are just a blight.) For gawk 4.0, I have said "to heck with it" and changed gawk so that ranges act like they are in the C locale (unless --posix is used). I and some other people are campaigning to

Re: documentation bug re character range expressions

2011-06-03 Thread Greg Wooledge
On Fri, Jun 03, 2011 at 12:06:32AM -0700, Marcel (Felix) Giannelia wrote:
> Is it really a programmer mistake, though, to assume that [A-Z] is only
> capital letters?

Yes, it is. You should be using [[:upper:]], or you should be setting LC_COLLATE=C if you insist on using [A-Z].
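
A minimal sketch of the first option named in this reply, assuming a POSIX-style shell. The function name starts_upper is hypothetical and does not come from the thread; it only illustrates matching with a character class instead of a range.

```shell
# Hypothetical helper (not from the thread): succeeds when the first
# character of $1 is an uppercase letter in the current locale.
starts_upper() {
    case $1 in
        [[:upper:]]*) return 0 ;;
        *)            return 1 ;;
    esac
}

starts_upper "Hello" && echo "Hello: starts with an uppercase letter"
starts_upper "hello" || echo "hello: does not"
```

The other option in the reply, setting LC_COLLATE=C, keeps [A-Z] but ties the range to the C locale's ordering. Note that LC_ALL, when set, overrides LC_COLLATE.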

Re: documentation bug re character range expressions

2011-06-03 Thread Greg Wooledge
Oh, look, there's more!

On Fri, Jun 03, 2011 at 12:06:32AM -0700, Marcel (Felix) Giannelia wrote:
> [[:alpha:]] is too difficult to type to make it useful for the kind of
> quick pattern-matching that character ranges are used for on the
> interactive shell. Try it. Open-bracket, colon is an awk

Re: Bash source repository

2011-06-03 Thread Bradley M. Kuhn
Chet Ramey wrote at 15:59 (EDT) on Thursday:
> I think there should be a master branch, and a branch that includes
> posted patches other than those that have been "officially released."
> Then other branches as needed to accommodate developers.

I think that could work fine; I'm happy to do my be

Re: documentation bug re character range expressions

2011-06-03 Thread Marcel (Felix) Giannelia
On 2011-06-03 05:09, Greg Wooledge wrote: Oh, look, there's more! [...] See? Both tr(1) and ls(1) do it too! Right; forgot about ls (because "alias ls='LC_COLLATE=C ls'" has been in my .bashrc for so long that I completely forgot it was there :) ), and didn't think to try tr -- but tr appe

Re: documentation bug re character range expressions

2011-06-03 Thread Marcel (Felix) Giannelia
On 2011-06-03 05:00, Greg Wooledge wrote: On Fri, Jun 03, 2011 at 12:06:32AM -0700, Marcel (Felix) Giannelia wrote: Is it really a programmer mistake, though, to assume that [A-Z] is only capital letters? Yes, it is. You should be using [[:upper:]], or you should be setting LC_COLLATE=C if yo

Re: documentation bug re character range expressions

2011-06-03 Thread Eric Blake
On 06/03/2011 10:15 AM, Marcel (Felix) Giannelia wrote:
> Alright -- assuming that for the moment, how does one specify
> [ABCDEFGHIJKL] using [[:upper:]]? This is something that I haven't seen
> documented, and I'm genuinely curious.

[ABCDEFGHIJKL]

If you ever want a subset of [[:upper:]], the _

Re: documentation bug re character range expressions

2011-06-03 Thread Greg Wooledge
On Fri, Jun 03, 2011 at 09:15:55AM -0700, Marcel (Felix) Giannelia wrote:
> Alright -- assuming that for the moment, how does one specify
> [ABCDEFGHIJKL] using [[:upper:]]? This is something that I haven't seen
> documented, and I'm genuinely curious.

You can't. Either write out [ABCDEFGHIJKL]
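
The "write it out" approach from this reply might look like the sketch below, assuming a POSIX-style shell. The helper name is_a_to_l is hypothetical; the point is only that an explicit list matches the same characters regardless of the locale's collation order.

```shell
# Hypothetical helper (not from the thread): succeeds when $1 is exactly one
# of the twelve letters A through L. The letters are listed explicitly, so
# no range is involved and collation order cannot change the match.
is_a_to_l() {
    case $1 in
        [ABCDEFGHIJKL]) return 0 ;;
        *)              return 1 ;;
    esac
}

is_a_to_l "C" && echo "C is in the set"
is_a_to_l "M" || echo "M is not in the set"
```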

Re: documentation bug re character range expressions

2011-06-03 Thread Greg Wooledge
On Fri, Jun 03, 2011 at 09:12:07AM -0700, Marcel (Felix) Giannelia wrote:
> And yours looks broken -- how does
> echo Hello World | tr A-Z a-z
> result in a bunch of non-ASCII characters?

I explain it in a bit on http://mywiki.wooledge.org/locale

In a bit more depth: in ASCII, the characters A-Z
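
As a rough illustration of how the quoted tr command can be made independent of the locale's ordering, here are two sketches. The function names to_lower and to_lower_c are hypothetical; the first uses character classes, the second keeps the range but pins the locale to C for that single command.

```shell
# Hypothetical helper: lowercase stdin using character classes, so the
# mapping does not depend on how the locale orders a range.
to_lower() {
    tr '[:upper:]' '[:lower:]'
}

# Alternative sketch: keep the A-Z range, but run tr in the C locale,
# where A-Z and a-z are the 26 ASCII letters in code-point order.
to_lower_c() {
    LC_ALL=C tr 'A-Z' 'a-z'
}

echo 'Hello World' | to_lower      # prints: hello world
echo 'Hello World' | to_lower_c    # prints: hello world
```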

Re: Bash source repository

2011-06-03 Thread Michael Witten
On Fri, Jun 3, 2011 at 15:27, Bradley M. Kuhn wrote:
> Sorry, I was imprecise in my wording in my email yesterday.  By "use
> separate branches for individual developers", I meant that "branches
> would be created for those developers who wanted to develop publicly in
> a Git repository".

Such de

bug in 'set -n' processing

2011-06-03 Thread Eric Blake
Bash has a bug: ${+} is syntactically invalid, as evidenced by the error message when running the script, yet using 'set -n' was not able to flag it as an error.

$ echo $BASH_VERSION
4.2.8(1)-release
$ bash -c 'echo ${+}'; echo $?
bash: ${+}: bad substitution
1
$ bash -cn '${+}'; echo $?
0
$ ksh -
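
For anyone who wants to repeat this kind of check on their own bash, a small sketch follows. The wrapper name parses_ok is hypothetical and not part of the report; it assumes bash is on PATH and only wraps the no-execute mode (-n) shown above. The result for ${+} depends on the bash version, so none is stated here.

```shell
# Hypothetical wrapper (not from the report): ask bash to parse a command
# string without executing it, and return bash's exit status.
parses_ok() {
    bash -n -c "$1" 2>/dev/null
}

# Single quotes keep ${+} literal so the inner bash sees it unexpanded.
if parses_ok 'echo ${+}'; then
    echo "no-execute mode raised no complaint"
else
    echo "no-execute mode flagged a problem"
fi
```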

Re: documentation bug re character range expressions

2011-06-03 Thread Marcel (Felix) Giannelia
On Fri, June 3, 2011 10:03, Greg Wooledge wrote:
> On Fri, Jun 03, 2011 at 09:12:07AM -0700, Marcel (Felix) Giannelia wrote:
> > [...]
>
> In HP-UX's en_US.iso88591 locale, the characters are in a COMPLETELY
> different order. You can't easily figure out what that order is, because
> it's not docu

Re: documentation bug re character range expressions

2011-06-03 Thread Eric Blake
On 06/03/2011 11:36 AM, Marcel (Felix) Giannelia wrote:
> It sounds to me like what you're saying is, the *only* uses of bracket
> range expressions guaranteed to be "portable" are things like [[:upper:]]
> and [[:lower:]]. But I put "portable" in quotation marks just then,
> because to my mind the