[PATCH] confusing/obsolete handling of test -t operator (and doc warnings against using -o/-a)
Hello,

    test -t X

always returns false and doesn't report an error about that invalid
number (beside the point here, but in ksh/zsh, that X is treated as an
arithmetic expression and evaluates to 0 if $X is not set). While:

    test -t X -a Y

returns a "too many arguments" error.

    test X -a -t -a Y

returns false (without error and regardless of whether any fd is a
tty) while

    test X -a Y -a -t

returns true. For other unary operators that gives:

    $ bash -c 'test X -a -x -a Y'
    bash: line 1: test: too many arguments

No big deal, as in all those cases the behaviour is unspecified by
POSIX (non-numeric argument to -t, or more than 4 arguments, -a
deprecated).

It seems to be explained by what looks like a remnant from the time
when [ -t ] was short for [ -t 1 ], in the code of unary_operator() in
test.c:

>  /* the only tricky case is `-t', which may or may not take an argument. */
>  if (op[1] == 't')
>    {
>      advance (0);
>      if (pos < argc)
>        {
>          if (legal_number (argv[pos], &r))
>            {
>              advance (0);
>              return (unary_test (op, argv[pos - 1], 0));
>            }
>          else
>            return (FALSE);

Maybe the intention was to do an isatty(1) here instead of always
returning false, but that and the fact that advance() is not called
only confuses things.

>        }
>      else
>        return (unary_test (op, "1", 0));

That part is never reached AFAICT as unary_operator() is never called
with pos == argc.

I believe that whole code can go, as -t is now always a unary operator,
and it would be more useful to report an error when the operand is not
a number.

I also noticed that the fact that -a/-o were deprecated (by POSIX at
least) and made for unreliable test expressions was not noted in the
manual. So I suggest the patch below:

diff --git a/doc/bashref.texi b/doc/bashref.texi
index 85e729d5..00fbab69 100644
--- a/doc/bashref.texi
+++ b/doc/bashref.texi
@@ -4215,14 +4215,14 @@ Operator precedence is used when there are five or more arguments.
 @item ! @var{expr}
 True if @var{expr} is false.
 
-@item ( @var{expr} )
+@item ( @var{expr} ) (DEPRECATED)
 Returns the value of @var{expr}.
 This may be used to override the normal precedence of operators.
 
-@item @var{expr1} -a @var{expr2}
+@item @var{expr1} -a @var{expr2} (DEPRECATED)
 True if both @var{expr1} and @var{expr2} are true.
 
-@item @var{expr1} -o @var{expr2}
+@item @var{expr1} -o @var{expr2} (DEPRECATED)
 True if either @var{expr1} or @var{expr2} is true.
 @end table
 
@@ -4283,11 +4283,26 @@ Otherwise, the expression is parsed and evaluated according to
 precedence using the rules listed above.
 @end enumerate
 
-@item 5 or more arguments
+@item 5 or more arguments (DEPRECATED)
 The expression is parsed and evaluated according to precedence
 using the rules listed above.
 @end table
 
+In the 4 or 5 arguments case, the use of @samp{(}, @samp{)}, binary
+@samp{-a}, binary @samp{-o} make for unreliable test expressions. For
+instance @code{test "$x" -a ! "$y"} becomes a test for whether a
+@samp{!} file exists if @code{$x} is @samp{(} and @code{$y} is
+@samp{)} and @code{[ -f "$file" -a ! -L "$file" ]} fails with a
+syntax error for a file called @samp{==}. Which explains why those
+are deprecated as they have been in the POSIX specification of the
+@code{test} utility since 2008.
+
+Each invocation of @code{[} / @code{test} should perform a single test
+and several invocations may be chained with the @code{&&} or @code{||}
+shell operators to achieve the same result as the @code{-a} and
+@code{-o} operators reliably as in @code{test "$x" && test ! "$y"} or
+@code{[ -f "$file" ] && [ ! -L "$file" ]} in the examples above.
+
 When used with @code{test} or @samp{[}, the @samp{<} and @samp{>}
 operators sort lexicographically using ASCII ordering.
 
diff --git a/test.c b/test.c
index 2b12197a..e16337a5 100644
--- a/test.c
+++ b/test.c
@@ -476,24 +476,6 @@ unary_operator (void)
   if (test_unop (op) == 0)
     return (FALSE);
 
-  /* the only tricky case is `-t', which may or may not take an argument. */
-  if (op[1] == 't')
-    {
-      advance (0);
-      if (pos < argc)
-        {
-          if (legal_number (argv[pos], &r))
-            {
-              advance (0);
-              return (unary_test (op, argv[pos - 1], 0));
-            }
-          else
-            return (FALSE);
-        }
-      else
-        return (unary_test (op, "1", 0));
-    }
-
   /* All of the unary operators take an argument, so we first call
      unary_advance (), which checks to make sure that there is an
      argument, and then advances pos right past it.  This means that
@@ -603,7 +585,7 @@ unary_test (char *op, char *arg, int flags)
 
     case 't':          /* File fd is a terminal? */
       if (legal_number (arg, &r) == 0)
-        return (FALSE);
+        integer_expected_error (arg);
       return ((r == (int)r) && isatty ((int)r));
 
     case 'n':          /* True if arg has some length. */
Re: [PATCH] confusing/obsolete handling of test -t operator (and doc warnings against using -o/-a)
2023-07-07 15:52:28 -0400, Chet Ramey:
[...]
> Historical versions of test made the argument to -t optional here. I can
> continue to support that in default mode for backwards compatibility, but
> it will be an error in posix mode.
[...]

I think you may have overlooked the bottom part of my email (possibly
because it was hidden by your MUA as it included quoted text) that
included comments on the code and a patch.

bash hasn't supported [ -t ] as an alias for [ -t 1 ] since 2.02 and
possibly earlier AFAICT, since it started supporting the POSIX rules
where [ any-non-empty-single-argument ] returns true, and having [ -t ]
check whether stdout is a terminal is not allowed.

The problem here is that some code to support that hadn't been removed
at the time the POSIX rules were implemented. The patch I suggested
just removes that code.

ksh93 does support [ -t ] when the -t is literal:

$ ksh93 -c '[ -t ]' > /dev/null || echo stdout is not a terminal
stdout is not a terminal
$ ksh93 -c '[ "-t" ]' > /dev/null || echo stdout is not a terminal
stdout is not a terminal
$ var=-t ksh93 -c '[ "$var" ]' > /dev/null && echo '$var is non-empty'
$var is non-empty

But there's no point going there, since that breaks POSIX compliance
for no good reason: [ -t ] as an alias for [ -t 1 ] hasn't been
supported for decades, so scripts that were doing [ -t ] would long
since have been fixed to use [ -t 1 ].

[...]
> > I also noticed that the fact that -a/-o were deprecated (by POSIX at
> > least) and made for unreliable test expressions was not noted in the
> > manual. So I suggest the patch below:
>
> I added some language about this, noting that POSIX has deprecated them
> and recommending scripts not use them. Thanks for the suggestion.
[...]

Note that "(" and ")" are also obsoleted by POSIX, and as a result so
is any usage of test with 5 or more arguments (hence why I flagged them
as (DEPRECATED) in the doc patch I was suggesting).

-- Stephane
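[Editorial note: to make the one-argument rule discussed above concrete, a small demo of the POSIX behaviour; nothing bash-specific, and under these rules [ -t ] is just a non-empty-string test.]

```shell
# With a single argument, test/[ returns true iff that argument is a
# non-empty string; "-t" is not treated as an operator here.
[ -t ] > /dev/null && echo '[ -t ] is true even with stdout redirected'
var=-t
[ "$var" ] && echo 'any non-empty single argument tests true'
[ "" ] || echo 'the empty string tests false'
```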
Re: RFC: changing printf(1) behavior on %b
2023-08-31 10:35:59 -0500, Eric Blake via austin-group-l at The Open Group:
> In today's Austin Group call, we discussed the fact that printf(1) has
> mandated behavior for %b (escape sequence processing similar to XSI
> echo) that will eventually conflict with C2x's desire to introduce %b
> to printf(3) (to produce 0b000... binary literals).
[...]

Is C2x's %b already set in stone?

ksh93's printf (and I'd expect ast's standalone printf) has
%[flags][width][.precision[.base]]d to output a number in an arbitrary
base, which IMO seems like a better approach than introducing a new
specifier for every base.

$ printf '%..2d\n' 63
111111
$ printf '0b%.8.2d\n' 63
0b00111111
$ printf '%#.8.2d\n' 63
2#00111111

The one thing it can't do though is left-space-padding of 0b.

printf %b is used in countless scripts, especially the more
correct/portable ones that use it to work around the portability fiasco
that is echo's escape sequence expansion. I can't imagine it going
away.

Hard to imagine the C folks overlooked it; I'd expect printf %b to be
known by any shell scripter.

-- Stephane
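[Editorial note: bash has nothing equivalent to that ksh93 base-output feature. As an aside, a hypothetical pure-POSIX-sh fallback sketch; the tobase name and implementation are mine, not an existing utility, for bases 2 to 36.]

```shell
tobase() { # usage: tobase BASE NUMBER  (prints NUMBER in base BASE)
  b=$1 n=$2 out=
  digits=0123456789abcdefghijklmnopqrstuvwxyz
  if [ "$n" -eq 0 ]; then echo 0; return; fi
  while [ "$n" -gt 0 ]; do
    d=$((n % b)) rest=$digits i=0
    # pick digit number $d: strip $d characters, keep the next one
    while [ "$i" -lt "$d" ]; do rest=${rest#?}; i=$((i + 1)); done
    out=${rest%"${rest#?}"}$out
    n=$((n / b))
  done
  printf '%s\n' "$out"
}
tobase 2 63    # prints 111111
tobase 16 255  # prints ff
```

(Variables are not localised, as plain sh has no local; in bash one could use ${digits:n%b:1} and local instead.)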
Re: RFC: changing printf(1) behavior on %b
2023-09-01 07:13:36 +0100, Stephane Chazelas via austin-group-l at The Open Group:
> 2023-08-31 10:35:59 -0500, Eric Blake via austin-group-l at The Open Group:
> > In today's Austin Group call, we discussed the fact that printf(1) has
> > mandated behavior for %b (escape sequence processing similar to XSI
> > echo) that will eventually conflict with C2x's desire to introduce %b
> > to printf(3) (to produce 0b000... binary literals).
> [...]
>
> Is C2x's %b already set in stone?
>
> ksh93's printf (and I'd expect ast's standalone printf) has
> %[flags][width][.precision[.base]]d to output a number in an
> arbitrary base, which IMO seems like a better approach than
> introducing a new specifier for every base.
[...]

For completeness, several shells also support expanding integers in
arbitrary bases, like ksh's

typeset -i2 binary=123

already there in ksh85, possibly earlier, and also available in pdksh
and derivatives and zsh.

Originally, with the base number not specified, the output base was
derived from the first assignment, like:

typeset -i var; var='2#111'

would get you a $var that expands in binary. Looks like that was
discontinued in ksh93, but it's still there in mksh or zsh.

And there's also:

$ echo $(( [#2] 16 )) $(( [##2] 16 ))
2#10000 10000

in zsh (note that you don't get 0b10000 upon $(( [#2] 16 )) after
set -o cbases).

If bash added:

printf -v var %..2d 16

à la ksh93, that would bridge that gap.

How to output/expand numbers in bases other than 8, 10, 16 is a
recurring question for bash, with people generally surprised that it
can *input* numbers in any base, but not *output* in any base.

See
https://unix.stackexchange.com/questions/415077/how-to-add-two-hexadecimal-numbers-in-a-bash-script/415107#415107
https://unix.stackexchange.com/questions/616215/bash-arithmetic-outputs-result-in-decimal
https://unix.stackexchange.com/questions/749988/arbitrary-base-conversion-from-base-10-using-only-builtins-in-bash
to list only a few.

-- Stephane
Re: RFC: changing printf(1) behavior on %b
2023-09-01 09:44:08 +0300, Oğuz via austin-group-l at The Open Group:
> On Fri, Sep 1, 2023 at 7:41 AM Phi Debian wrote:
> > My vote is for posix_printf %B mapping to libc_printf %b
>
> In the shell we already have bc for base conversion. Does POSIX really
> have to support C2x %b in the first place?

Yes, though note:

- that implies forking a process and loading an external executable and
  its libraries
- bc is not always available. It's not installed by default on Debian
  for instance.
- for bases over 16, it uses some unusual representation that can't be
  used anywhere.

A summary of some options for some common POSIX-like shells at
https://unix.stackexchange.com/questions/191205/bash-base-conversion-from-decimal-to-hex/191209#191209

-- Stephane
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-08-31 15:02:22 -0500, Eric Blake via austin-group-l at The Open Group:
[...]
> The current POSIX says that %b was added so that on a non-XSI
> system, you could do:
>
> my_echo() {
>   printf %b\\n "$*"
> }

That is dependent on the current value of $IFS. You'd need:

xsi_echo() (
  IFS=' '
  printf '%b\n' "$*"
)

Or the other alternatives listed at
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo/65819#65819

[...]
> Bash already has shopt -s xpg_echo

Note that in bash, you need both

shopt -s xpg_echo
set -o posix

to get a XSI echo. Without the latter, options are still recognised.
You can get a XSI echo without those options with:

xsi_echo() {
  local IFS=' ' -
  set +o posix
  echo -e "$*\n\c"
}

The addition of those \n\c (noop) avoids arguments being treated as
options if they start with -.

[...]
> The Austin Group also felt that standardizing bash's behavior of %q/%Q
> for outputting quoted text, while too late for Issue 8, has a good
> chance of success, even though C says %q is reserved for
> standardization by C. Our reasoning there is that lots of libc over
> the years have used %qi as a synonym for %lli, and C would be foolish
> to burn %q for anything that does not match those semantics at the C
> language level; which means it will likely never be claimed by C and
> thus free for use by shell in the way that bash has already done.
[...]

Note that %q is from ksh93, not bash, is not portable across
implementations, and with most, including bash's, gives an output that
is not safe for reinput in arbitrary locales (as it uses $'...' in some
cases). I'm not sure it's a good idea to add it to the standard, or at
least it should come with fat warnings about the risk in using it.

See also:
https://unix.stackexchange.com/questions/379181/escape-a-variable-for-use-as-content-of-another-script/600214#600214

-- Stephane
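[Editorial note: as an illustration of why that xsi_echo sets IFS, "$*" joins the positional parameters with the first character of $IFS; the join_demo name is mine.]

```shell
join_demo() ( # subshell body, so the IFS change cannot leak out
  IFS=-
  printf '%s\n' "$*"
)
join_demo a b c   # prints a-b-c, not "a b c"
```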
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-09-01 07:15:14 -0500, Eric Blake:
[...]
> > Note that in bash, you need both
> >
> > shopt -s xpg_echo
> > set -o posix
> >
> > To get a XSI echo. Without the latter, options are still
> > recognised. You can get a XSI echo without those options with:
> >
> > xsi_echo() {
> >   local IFS=' ' -
> >   set +o posix
> >   echo -e "$*\n\c"
> > }
> >
> > The addition of those \n\c (noop) avoids arguments being treated as
> > options if they start with -.
>
> As an extension, Bash (and Coreutils) happen to honor \c always, and
> not just for %b. But POSIX only requires \c handling for %b.
>
> And while Issue 8 has taken steps to allow implementations to support
> 'echo -e', it is still not standardized behavior; so your xsi_echo()
> is bash-specific (which is not necessarily a problem, as long as you
> are aware it is not portable).
[...]

Yes, none of local (from ash I believe), the posix option (several
shells have an option called posix, all used to improve POSIX
conformance; bash may have been the first), nor -e (from Research Unix
v8) is standard; that part was about bash specifically (as the thread
is also posted on gnu.bash.bug).

BTW, that xsi_echo is not strictly equivalent to a XSI echo in the case
where the last character of the last argument is an unescaped
backslash, or a character whose encoding ends in the same byte as the
encoding of backslash.

-- Stephane
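[Editorial note: the \n\c pair discussed above is a no-op for the visible output because \c stops everything that follows, including the newline echo would normally add; a quick check with bash's echo -e.]

```shell
# \c suppresses the rest of the output and the trailing newline;
# the \n just before it puts that newline back.
bash -c 'echo -e "foo\n\c"; echo END'
# prints "foo" then "END", each on its own line
```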
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-09-01 07:54:02 -0500, Eric Blake via austin-group-l at The Open Group:
[...]
> > Well in all case %b can not change semantic in the bash script, since it is
> > there for so long, even if it depart from python, perl, libc, it is
> > unfortunate but that's the way it is, nobody want a semantic change, and on
> > next routers update, see the all internet falling appart :-)
>
> How many scripts in the wild actually use %b, though? And if there
> are such scripts, anything we can do to make it easy to do a drop-in
> replacement that still preserves the old behavior (such as changing %b
> to %#s) is going to be easier to audit than the only other
> currently-portable alternative of actually analyzing the string to see
> if it uses any octal or \c escapes that have to be re-written to
> portably function as a printf format argument.
[...]

FWIW, a "printf %b" github shell code search returns ~ 29k entries
(https://github.com/search?q=printf+%25b+language%3AShell&type=code&l=Shell)

That likely returns only a small subset of the code that uses printf
with %b inside the format, and probably a few false positives, but it
gives many examples of how printf %b is used in practice.

printf %b is also what all serious literature about shell scripting has
been recommending in place of the unportable echo -e (or XSI echo, or
print without -r). That includes the POSIX standard, which has been
recommending printf instead of the non-portable echo for 30 years. So
that change will also invalidate all those.

It will take a while before %#s is supported widely enough that %b can
be safely replaced with %#s.

-- Stephane
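[Editorial note: for reference, the %b vs %s difference those scripts rely on; this is standard printf(1) behaviour.]

```shell
printf '%s\n' 'a\tb'   # %s is literal: prints a\tb
printf '%b\n' 'a\tb'   # %b expands the escape: prints "a", a TAB, "b"
```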
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-09-01 23:28:50 +0200, Steffen Nurpmeso via austin-group-l at The Open Group:
[...]
> |FWIW, a "printf %b" github shell code search returns ~ 29k
> |entries
> |(https://github.com/search?q=printf+%25b+language%3AShell&type=code&l=Shell)
> |
> |That likely returns only a small subset of the code that uses
> |printf with %b inside the format and probably a few false
> |positives, but that gives many examples of how printf %b is used
> |in practice.
>
> Actually this returns a huge amount of false positives where
> printf(1) and %b are not on the same line, let alone the same
> command, if you just scroll down a bit it starts like neovim match
[...]

You're right, I only looked at the first few results and saw that
already gave interesting ones.

Apparently, we can also search with regexps, and searching for
printf.*%b
(https://github.com/search?q=%2Fprintf.*%25b%2F+language%3AShell&type=code)
is probably a lot more accurate. It returns ~ 19k. (Still FWIW; that's
still just a sample of random code on the internet.)

[...]
> Furthermore it shows a huge amount of false use cases like
>
> printf >&2 "%b\n" "The following warnings and non-fatal errors were
> encountered during the installation process:"
[...]

Yes, I also see a lot of echo -e stuff that should have been echo -E
stuff (or echo alone in those (many) implementations that don't expand
by default), or use of the more reliable printf with %s (not %b).

> It seems people think you need this to get colours mostly, which
> then, it has to be said, is also practically mislead. (To the
> best of *my* knowledge that is.)
[...]

Incidentally, ANSI terminal colour escape sequences are somewhat
connecting those two %b's, as they are RGB (well, BGR) in binary (white
is 7 = 0b111, red 0b001, green 0b010, blue 0b100), with:

R=0 G=1 B=1
printf '%bcyan%b\n' "\033[3$(( 2#$B$G$R ))m" '\033[m'

(with Korn-like shells; also $(( 0b$B$G$R )) in zsh, though zsh has
builtin colour output support including RGB-based).
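[Editorial note: the arithmetic in that snippet assembles the SGR colour number from the three bits; spelled out below, using the bash/ksh 2# base prefix.]

```shell
# cyan = green + blue: bits B=1 G=1 R=0, read as the binary number 110
bash -c 'R=0 G=1 B=1; echo "$(( 2#$B$G$R ))"'   # prints 6, the ANSI number for cyan
```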
Speaking of stackexchange, on the June data dump of
unix.stackexchange.com:

stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf.*%b'
494

(FWIW)

Compared with %d (though that will have entries for printf(3) as well):

stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf.*%d'
3444

-- Stephane
Re: cd with multiple arguments?
2010-12-13, 12:04(+00), Marc Herbert:
[...]
>> True, bash does not come with a vast library of ready-to-use functions.
>
> Neither is there any such "reference" library available externally. Or
> is there?

Yes, they're called "commands" or "programs" or "software". People tend
to forget that, before all, a shell is a command line interpreter.

If you find yourself writing complex shell functions that don't make
use of external commands, then you're heading the wrong direction with
your shell, or you are heading in one direction with the wrong medium
(i.e. you need a programming language, not a shell).

-- Stephane
Re: backslash required for [\/]
2010-12-25, 20:30(+08), jida...@jidanni.org:
[...]
> While we are here, here's another interesting phenomenon:
> $ touch aa
> $ echo a[ab]
> aa
> $ echo a[ab/]
> a[ab/]
> $ echo a[ab\/]
> a[ab/]
> $ mkdir ab #doesn't help...

That's the same as in every other shell and as specified by POSIX. So
any other behavior would be a bug.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_03

-- Stephane
Re: read builtin does not return success code (0) in spite of it successfully read line
2010-12-24, 13:20(+05), Коренберг Марк:
[...]
> mma...@mmarkk-work:/tmp$ echo aaa > /tmp/test1
> mma...@mmarkk-work:/tmp$ read x < /tmp/test1 && echo success
> success
> mma...@mmarkk-work:/tmp$ echo $x
> aaa
>
> BUGGY CASE:
> mma...@mmarkk-work:/tmp$ echo -n bbb > /tmp/test2
> mma...@mmarkk-work:/tmp$ read y < /tmp/test2 && echo success
> mma...@mmarkk-work:/tmp$ echo $y
> bbb
>
> So, if EOF detected but something is read successfully, please do not
> report error!
[...]

Bash's behavior is the same as every other shell's, is as documented
and as specified by POSIX: if read can't read a full line (because of a
read error or EOF found before the line terminator), it should return a
non-zero exit status and fill the variable with the characters read
thus far.

-- Stephane
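[Editorial note: the reported behaviour can be reproduced non-interactively; scratch file via mktemp, nothing bash-specific.]

```shell
tmp=$(mktemp)
printf 'bbb' > "$tmp"      # no trailing newline, as in the report
if read y < "$tmp"; then status=0; else status=1; fi
echo "status=$status y=$y" # prints: status=1 y=bbb
rm -f "$tmp"
```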
Re: read not terminated with EOF/^D when -n or -N is used
2010-12-26, 18:19(-05), Isaac Good:
[...]
> read <^D> -> the EOF terminates read and it exits with a false
> exit status.
> read -n4 <^D> -> the ^D is read by read and stored in REPLY. read
> does not terminate.

It's not bash that handles the ^D, it's the terminal. When in canonical
mode, upon pressing ^D, the terminal emits "EOF" (a read(2) returns).

Though it's not documented, when using "-n", read puts the terminal in
raw mode. In that mode, the ^D is not special (it is sent as is, so
read by bash's read).

In zsh (where -n is -k), that behavior (putting the terminal in raw
mode) can be disabled by adding a "-u" flag, as in -u0, to specify the
fd to read from. It doesn't look like bash has a similar feature, so
you'd have to do:

read -n4 < <(cat)

for instance, so that read sees its stdin as a pipe and not a terminal.
Note that when in canonical mode, the terminal only sends characters to
the application (here cat) upon pressing Return (or the EOF character),
so read won't return as soon as you enter 4 characters.

-- Stephane
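[Editorial note: that read -n counts characters rather than lines can be seen with a non-terminal stdin, where no raw-mode question arises; -n is bash-specific.]

```shell
# read -n4 returns as soon as 4 characters are available, even though
# no newline was ever seen on this pipe:
echo abcdefgh | bash -c 'read -n4 v; printf "%s\n" "$v"'   # prints abcd
```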
Re: $? in the right side of a pipe
2010-12-27, 13:14(+01), :
> Hello
>
> More information on that topic:
>
> false ; false | echo $?    still prints 0.
> false ; (false) | echo $?  prints 1
>
> So.. ? $? in the right side of a pipe is randomly the exit
> status of the left side depending on the way you write it?
> Doesn't sound sane.

That's not the exit status of the left side. It couldn't be, as both
sides of the pipe are started at the same time and run concurrently.

> Doesn't that break POSIX anyway?

I suppose it does. I can't see any reason why the above shouldn't print
1 in any case.

> I think it should be fixed.

Me too. Note that $PIPESTATUS doesn't seem to be affected by that bug.

$ bash -c '(exit 2) | (exit 3) ; false | echo $?, "${PIPESTATUS[@]}"'
0, 2 3

-- Stephane
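[Editorial note: spelling out the PIPESTATUS point; PIPESTATUS is a bash-specific array.]

```shell
# $? gives the rightmost element's status; PIPESTATUS has them all:
bash -c '(exit 2) | (exit 3); echo "last=$? pipe=${PIPESTATUS[@]}"'
# prints: last=3 pipe=2 3
```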
Re: read builtin does not return success code (0) in spite of it successfully read line
2010-12-27, 09:43(-07), Eric Blake:
[...]
> On 12/26/2010 01:29 PM, Stephane CHAZELAS wrote:
>> Bash behavior is the same as every other shell, is as documented
>> and as specified by POSIX.
>
> POSIX requires that the input to read be a text file. Since you aren't
> passing a text file, the behavior is undefined. POSIX does NOT require
> bash to return failure in this case, but neither does it require bash to
> return success. You should not rely on the behavior of read when the
> input does not end in a newline.
[...]

From http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html

SUSv4> EXIT STATUS
SUSv4>
SUSv4> The following exit values shall be returned:
SUSv4>
SUSv4>  0
SUSv4>     Successful completion.
SUSv4>  >0
SUSv4>     End-of-file was detected or an error occurred.
SUSv4>

So I think you can expect a non-zero status here.

-- Stephane
Re: read builtin does not return success code (0) in spite of it successfully read line
2010-12-27, 11:53(-07), Eric Blake:
[...]
>> So I think you can expect a non-zero status here.
>
> All you can expect is that you have undefined results. Undefined
> results include zero status.
[...]

Yes, sorry, you're right. Bad reasoning of mine. The >0 on EOF is
obviously on end of text file, that is, when there's nothing left to be
read after the last newline character. On a terminal, you'd express
that EOF by entering the eof character on an empty buffer.

Having said that, all the implementations return a non-zero exit status
on an unfinished line, and it also sounds to me like the best option
(as it allows to differentiate a terminated line from an unfinished one
and can also be seen as an error condition), so it would be unwise IMO
for bash to change its behavior to return 0 in this case.

-- Stephane
Re: Bug in shell: buffer overflow.
2010-12-31, 11:33(-07), Bob Proulx:
[...]
> Your expressions above are overflowing the value of your system's
> maximum integer size. You can read the system's maximum integer size
> using getconf.
>
> $ getconf INT_MAX
> 2147483647
[...]

POSIX requires that arithmetic expansion use at least signed longs, so
getconf INT_MAX wouldn't necessarily be correct.

-- Stephane
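[Editorial note: a quick check of that point; the shown result assumes a 64-bit build, where bash's arithmetic is wider than INT_MAX.]

```shell
# No wraparound at INT_MAX on a 64-bit build:
bash -c 'echo $(( 2147483647 + 1 ))'   # prints 2147483648
```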
Re: Inconsitent treatment of * (globbing)
2011-01-31, 11:44(-05), Greg Wooledge:
> On Sat, Jan 29, 2011 at 12:58:56PM +0100, Pascal Wittmann wrote:
>> But if I use the expression ".*" in bash, I would expect from the
>> behaviour of "*" that ".*" doesn't include the file "..". But the
>> fact is that it does in bash.
>
> This is done for compatibility with all the other shells.
[...]

All the other shells... but zsh and pdksh-based ones (mksh, posh...);
and shells like ksh93 can be globally modified by setting the FIGNORE
environment variable (a bug and potential security vulnerability if you
ask me).

I agree with the OP that .* expanding to "." and ".." is not useful,
but I suspect it is required by POSIX.

-- Stephane
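[Editorial note: the behaviour in question, observed in a scratch directory; LC_ALL=C pins the glob sort order.]

```shell
dir=$(mktemp -d)
( cd "$dir" && touch .hidden && LC_ALL=C bash -c 'echo .*' )
# prints: . .. .hidden
rm -rf "$dir"
```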
Re: Document what's allowed in function names
2011-04-11, 09:51(-04), Chet Ramey:
>> Machine Type: i686-pc-linux-gnu
>>
>> Bash Version: 4.2
>> Patch Level: 8
>> Release Status: release
>>
>> Description:
>> man bash is currently lacking information on what is allowed for
>> function names. It implies name with name () compound-command
>> [redirection] and at the start of the manual there is:
>>
>> name  A word consisting only of alphanumeric characters and
>> underscores, and beginning with an alphabetic character or an
>> underscore. Also referred to as an identifier.
>>
>> In reality the rules for function names are much more loose. For
>> example hyphen and forward slash are allowed. But there are still
>> some restrictions:
>> bash: `aa\'foo': not a valid identifier
>
> It was a mistake to allow such characters in function names (`unset'
> doesn't work to unset them without forcing -f, for instance). We're
> stuck with them for backwards compatibility, but I don't have to
> encourage their use.
[...]

Why would you put any restriction on the allowed name of a function? At
calling time, it's the same namespace as an argv[0], so it could be any
string.

In zsh, any string is allowed, including the empty string (and even the
NUL character, as that is allowed in arguments to functions or built-in
commands).

$ ''() echo foo
$ ""
foo
$ /bin/ls() echo bar
$ /bin/ls
bar

-- Stephane
Re: Document what's allowed in function names
2011-05-24 17:23:20 -0400, Chet Ramey:
[...]
> > Why would you put any restriction on the allowed name of a function?
[...]
> Because Posix does, and because unset without -f has to enforce the
> variable name restrictions. (Though the language has been relaxed in
> the latest standard.)
[...]

POSIX puts restrictions on the names of functions used by applications,
but I can't see anywhere that it restricts what the shell allows.

'foo bar'() {
  baz
}

being not valid POSIX syntax, I can't imagine POSIX forcing a shell to
do this (define a "foo bar" function) or that (return an error message)
upon reading it. I may be wrong though; is there any specific section
of the standard you had in mind?

Note that all of pdksh, mksh, zsh (at least) allow

foo+bar() { w; }

for instance; ksh93 allows

foo.get() { w; }

I also find it unfortunate that bash doesn't allow

foo() bar

given that every other Bourne-like shell (including the Bourne shell)
supports it (again, it's not POSIX syntax, so bash doesn't have to
implement it).

-- Stephane
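[Editorial note: as the original report mentioned, bash itself accepts some non-identifier characters in function names in default (non-posix) mode; for instance a hyphen.]

```shell
# A hyphen is not valid in a POSIX name, yet bash accepts it here:
bash -c 'foo-bar() { echo hyphen-ok; }; foo-bar'   # prints hyphen-ok
```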
usenet to email gateway
Hiya,

Not sure who to report that to, but it looks like the usenet to
mailing-list gateway, at least for gnu.bash.bug, is not working
properly. That is, messages posted to the newsgroup are not sent to the
corresponding mailing list.

Compare for instance
http://thread.gmane.org/gmane.comp.shells.bash.bugs/16950 (mailing
list) with
http://groups.google.com/group/gnu.bash.bug/browse_thread/thread/6358da68e2c8d4e3
(usenet).

All my messages (like
http://groups.google.com/group/gnu.bash.bug/msg/a458eb453deb9fce?dmode=source)
are missing on gmane.

Best regards,
Stephane
Re: usenet to email gateway
2011-08-10 23:05:26 +0000, Karl Berry:
> not sure who to report that to
>
> I've asked the FSF sysadmins. I can see from the mailing list
> configuration that the gateway is intended to be operational, but don't
> know how to debug what is happening from there.
[...]

Thanks Karl.

My last successful post to gnu.bash.bug was on 2011-04-12 15:30:20 GMT
http://thread.gmane.org/gmane.comp.shells.bash.bugs/16603
and Chet's answer suggests it made its way to the mailing list only
months later.

I'll repost my messages to the mailing list and try to cancel them on
usenet. In the meantime, it might be worth pointing out that posting to
the gnu.bash.bug newsgroup doesn't work (well, it works, but only
usenet users see the messages, not those subscribed to the mailing list
only).

-- Stephane
Re: Bash source repository
2011-05-30, 06:34(+00), Michael Witten:
[...]
> Ben, it is generally a good idea to maintain the `Cc' list unless
> explicitly asked.

Beware that the bash mailing list also has a usenet interface
(gnu.bash.bug) which has no notion of recipients, let alone Cc.

-- Stephane
Re: bug: return doesn't accept negative numbers
2011-08-08, 13:55(-07), Linda Walsh:
[...]
> and both 'exit' and 'return' should return error "ERANGE" if "--posix"
> is set, and -1 is given. Invalid option doesn't make as much sense in
> this situation; if it was -k or -m, sure... but in this case, it's a
> fact that --posix artificially limits exit values apart from what is
> allowed in most prog langs (which accept negative, but still return
> results &0xff), so for Posix, it's a matter of disallowing a 'normal
> range', vs. it being an invalid option
[...]

POSIX doesn't prevent a shell from accepting -1 (or from doing anything
like ejecting a cd, outputting an error or turning red upon "return
-1"). It just says an *application* should not use "return -1", that
is, if one wants to write a portable script, she shouldn't use
"return -1".

Many POSIX shells accept "return -1":

$ ksh93 -c 'f() return -1; f; echo $?'
255
$ pdksh -c 'f() return -1; f; echo $?'
-1
$ zsh -c 'f() return -1; f; echo $?'
-1
$ posh -c 'f() return -1; f; echo $?'
return: invalid option -- '1'
1
$ posh -c 'f() return -- -1; f; echo $?'
-1
$ mksh -c 'f() return -1; f; echo $?'
mksh: return: -1: unknown option
1
$ mksh -c 'f() return -- -1; f; echo $?'
-1

But as you can see the result varies, so one shouldn't use "return -1"
if one wants to be portable across POSIX shells.

Also note:

$ zsh -c 'f() return -1; f; echo $?'
-1
$ zsh -c 'f() return -1; (f); echo $?'
255

That is, even in shells that support arbitrary numbers for return, as
soon as they are cast to exit statuses, they are &255ed.

-- Stephane
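[Editorial note: the &255 casting applies to exit statuses generally, not just return; with the installed bash (the -1 line assumes a bash recent enough to accept a negative operand to exit).]

```shell
bash -c 'exit 257'; echo "$?"   # prints 1 (257 & 0xff)
bash -c 'exit -1';  echo "$?"   # 255 with bash versions that accept -1
```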
Re: feature request: option to start script not from the very beginning of file
2011-08-02, 17:05(+04), Dmitry Bolshakov:
[...]
> perl has "-x" switch which makes it skip leading file contents until
> the
> #!/bin/perl
> line
>
> imho it would be good to have the same feature in bash

What about:

#! /bin/bash -
:||:<<\#__END__

whatever you like here

#__END__

your script starts here

-- Stephane
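[Editorial note: a self-contained check of the trick above, with the script written to a mktemp file. The first ":" succeeds, so the second ":" with the here-document is never run, yet the parser still consumes everything up to the #__END__ line.]

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#! /bin/bash -
:||:<<\#__END__

anything at all here is skipped

#__END__
echo script-ran
EOF
bash "$tmp"   # prints script-ran
rm -f "$tmp"
```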
Re: equivalent of Linux readlink -f in pure bash?
2011-08-9, 11:44(+10), Jon Seymour:
> Has anyone ever come across an equivalent to Linux's readlink -f that
> is implemented purely in bash?
>
> (I need readlink's function on AIX where it doesn't seem to be available).
[...]

What about:

readlink_f() (
  link=$1 max_iterations=40
  while [ "$max_iterations" -gt 0 ]; do
    max_iterations=$(($max_iterations - 1))
    dir=$(dirname -- "$link") || exit
    base=$(basename -- "$link") || exit
    dir=$(cd -P -- "$dir" && pwd -P) || exit
    link=${dir%/}/$base
    if [ ! -L "$link" ]; then
      printf '%s\n' "$link"
      exit
    fi
    link=$(ls -ld -- "$link") || exit
    link=${link#* -> }
  done
  printf >&2 'Loop detected\n'
  exit 1
)

-- Stephane
Re: [OT] Re: feature request: option to start script not from the very beginning of file
2011-08-02, 23:41(+02), mhenn:
> Am 02.08.2011 15:55, schrieb Stephane CHAZELAS:
[...]
>> What about:
>>
>> #! /bin/bash -
>> :||:<<\#__END__
>>
>> whatever you like here
>>
>> #__END__
[...]
> Why did you use :||:<< ...
> and not just :<<... ?
>
> when testing it, it doesn't make any difference:
[...]

Without ":||", it does create a temporary file, fill it, and set it as
standard input to the ":" command. With ":||:", that does not happen,
because the second ":" is not run since the first one succeeded.

$ strace bash -c $':<<\\E\nE' |& grep tmp
stat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=12288, ...}) = 0
stat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=12288, ...}) = 0
access("/tmp", W_OK) = 0
statfs("/tmp", {...}) = 0
open("/tmp/sh-thd-1312368098", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC, 0600) = 3
open("/tmp/sh-thd-1312368098", O_RDONLY) = 4
unlink("/tmp/sh-thd-1312368098") = 0
$ strace bash -c $':||:<<\\E\nE' |& grep tmp
$

-- Stephane
Re: equivalent of Linux readlink -f in pure bash?
2011-08-9, 09:24(+00), Stephane CHAZELAS:
> 2011-08-9, 11:44(+10), Jon Seymour:
>> Has anyone ever come across an equivalent to Linux's readlink -f that
>> is implemented purely in bash?
>>
>> (I need readlink's function on AIX where it doesn't seem to be available).
> [...]
>
> What about:
>
> readlink_f() (
>   link=$1 max_iterations=40
>   while [ "$max_iterations" -gt 0 ]; do
>     max_iterations=$(($max_iterations - 1))
>     dir=$(dirname -- "$link") || exit
>     base=$(basename -- "$link") || exit
>     dir=$(cd -P -- "$dir" && pwd -P) || exit
>     link=${dir%/}/$base
>     if [ ! -L "$link" ]; then
>       printf '%s\n' "$link"
>       exit
>     fi
>     link=$(ls -ld -- "$link") || exit
>     link=${link#* -> }
>   done
>   printf >&2 'Loop detected\n'
>   exit 1
> )

Sorry, it's wrong if there are relative paths in symlinks (or
trailing newlines).

fixed_cmd_subst() {
  eval '
    '"$1"'=$('"$2"'; ret=$?; echo .; exit "$ret")
    set -- "$1" "$?"
    '"$1"'=${'"$1"'%??}
  '
  return "$2"
}

readlink_f() (
  link=$1 max_iterations=40
  while [ "$max_iterations" -gt 0 ]; do
    max_iterations=$(($max_iterations - 1))
    fixed_cmd_subst dir 'dirname -- "$link"' || exit
    fixed_cmd_subst base 'basename -- "$link"' || exit
    cd -P -- "$dir" || exit
    link=${PWD%/}/$base
    if [ ! -L "$link" ]; then
      printf '%s\n' "$link"
      exit
    fi
    fixed_cmd_subst link 'ls -ld -- "$link"' || exit
    link=${link#* -> }
  done
  printf >&2 'Loop detected\n'
  exit 1
)

--
Stephane
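The core of fixed_cmd_subst above is the "append a dot" trick: $(...)
strips ALL trailing newlines, so a sentinel character is appended and
then removed together with the utility's own terminating newline. A
minimal standalone sketch (the helper name below is illustrative, not
from the thread):

```shell
# Sketch of the dot trick used by fixed_cmd_subst: preserve trailing
# newlines that a plain command substitution would discard.
subst_keep_newlines() {
  # $1: command to run; result is left in $REPLY
  REPLY=$(eval "$1"; echo .)   # echo's own newline is stripped; "." survives
  REPLY=${REPLY%??}            # drop the utility's final newline and the "."
}

# A utility printing a name that itself ends in a newline outputs
# "dir\n\n"; plain $(...) collapses that to "dir", the trick keeps "dir\n".
subst_keep_newlines 'printf "%s\n\n" dir'
```

Note the eval: the command text runs in the function's context, so it can
reference the caller's variables, as in the original post.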
Re: equivalent of Linux readlink -f in pure bash?
2011-08-09, 11:29(+02), Bernd Eggink:
> On 09.08.2011 03:44, Jon Seymour wrote:
>> Has anyone ever come across an equivalent to Linux's readlink -f that
>> is implemented purely in bash?
>
> You can find my version here:
>
> http://sudrala.de/en_d/shell-getlink.html
>
> As it contains some corrections from Greg Wooledge, it should handle
> even pathological situations. ;)
[...]

> function getlink # ([-l] path)

Why use the ksh syntax instead of the standard one?

> {
>     # Path of the file a symbolic link is pointing to.
>     # -l: follow link chain, print last target
>     # no option: print 1st target unchanged
>
>     typeset dir file last link opt oldPWD=$PWD ret=0
>
>     (( OPTIND = 1 ))
>
>     while getopts "l" opt
>     do
>         case $opt in
>             (l) last=1
>                 ;;
>         esac
>     done
>
>     shift $(( OPTIND - 1 ))
>     file=$1
>
>     if [[ $last ]]    # last link
>     then
>         while true
>         do
>             dir=$(dirname "$file")

What if $file starts with "-"?

>             [[ ! -d $dir ]] &&
>             {
>                 ret=1
>                 break
>             }
>
>             # remove slashes at end
>
>             while [[ $file == */ ]]
>             do
>                 file=${file%/}
>             done

What if $file is "/"?

>             file=${file##*/}    # file name
>             command cd -P "$dir"

What if that command fails?

>             [[ ! -h $file ]] && break

What if there are symlinks in the path components?

>             link=$(command ls -l -- "$file"; printf x)

Good point about command substitution discarding trailing
newlines. I forgot that in the solution I gave.

>             link=${link%$'\nx'}
>             remove="$file -> "
>             file=${link#*"$remove"}
>         done
>
>         printf "%s\n" "$PWD/$file"
>         command cd $oldPWD

What if $oldPWD has blanks or wildcards?

>     elif [[ ! -h $file ]]    # 1st link
>     then
>         printf "%s\n" "$file"
>     else
>         link=$(ls -l "$file")
>         printf "%s\n" "${link##*-> }"

What about trailing newlines here, and links with "-> " in their
path?

>     fi
>
>     return $ret

What if $IFS contains 0 or 1?

> }
>
> declare -fc getlink

What's the -c about?

--
Stephane
Re: equivalent of Linux readlink -f in pure bash?
2011-08-10, 12:00(+02), Bernd Eggink:
[...]
> function f
> {
>     local OPTIND=1
>
>     echo "\$1=$1"
> }
>
> while getopts "abcdefg" opt
> do
>     echo "opt=$opt"
>     f $opt
> done
>
> Calling the script like this works fine:
>     script -a -b -c
>
> But calling it like this leads to an endless loop:
>     script -abc
[...]
> However, this also loops endlessly. The reason is most likely that bash
> maintains an additional internal variable holding the index of the
> current character, relative to the current word. While this variable is
> not directly accessible by the user, it is set to 0 whenever OPTIND is
> assigned a value.
[...]

That would be a bug in bash in my opinion. If OPTIND is marked
local to the function, it shouldn't affect the behavior of
parent contexts.

Note that that bug is also in ksh93, pdksh, mksh and posh
(though slightly different in that one), but not in ash nor zsh.

Note that if you set OPTIND local, you probably want to do the
same for OPTARG (and maybe OPTERR).

--
Stephane
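The pattern under discussion can be sketched as follows; the function
and option names are illustrative only. Note the thread's caveat: in
bash (and ksh93, pdksh, mksh, posh), assigning OPTIND also resets the
internal per-word character index, so this can still disturb a caller
that is itself in the middle of parsing clustered options like -abc.

```shell
# Sketch: give a function its own getopts state so repeated calls
# (or calls from other functions) start parsing from scratch.
parse() {
  local OPTIND=1 OPTARG opt verbose=0
  while getopts "v" opt; do
    case $opt in
      (v) verbose=1 ;;
    esac
  done
  shift "$((OPTIND - 1))"
  printf '%s %s\n' "$verbose" "${1-}"
}
```

Without the `local OPTIND=1`, a second call to parse would resume from
the previous call's leftover OPTIND and silently skip options.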
Re: equivalent of Linux readlink -f in pure bash?
2011-08-09, 09:50(-04), Steven W. Orr:
[...]
> *) To remove the trailing slashes, instead of
>
>     while [[ $file == */ ]]
>     do
>         file=${file%/}
>     done
>
>     file=${file##*/}    # file name
>
> just say
>     file="${file%${file##*[!/]}}"
[...]

Rather:

file=${file%"${file##*[!/]}"}

(with the inner expansion quoted so its result is not taken as a
pattern). Same problem with "/" being changed to "" though.

--
Stephane
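The single-expansion approach above, together with the "/" edge case
the reply points out, can be sketched like this (the wrapper function
name is illustrative):

```shell
# Sketch: strip all trailing slashes in one parameter expansion.
# ${file##*[!/]} expands to the longest run of trailing slashes
# (everything after the last non-slash character); quoting it stops
# that result from being treated as a pattern when it is removed.
strip_trailing_slashes() {
  file=$1
  file=${file%"${file##*[!/]}"}
  printf '%s\n' "$file"
}
```

As noted in the thread, a path consisting only of slashes contains no
non-slash character, so the inner expansion yields the whole string and
the result collapses to the empty string.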
Re: How to match regex in bash? (any character)
2011-09-29, 13:52(-08), Roger:
[...]
> Since you're saying the regex description is found within either regex(3) or
> regex(7), couldn't there be a brief note within the Bash Manual Page be
> something to the effect:
[...]

No, it's not. I suppose bash could say:

  See your system's regex(3) implementation documentation for the
  description of the extended regular expression syntax on your
  system. That syntax should be compatible with one version or
  the other of the POSIX Extended Regular Expression syntax,
  whose specification (for the latest version as of this writing)
  can be found at:
  http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04

regex(3) points to the API (regex.h); how the system documents
the regexps covered by that API is beyond bash's knowledge.

--
Stephane
gnu.bash.bug usenet interface not working again
It worked for a while after I reported the issue and then it stopped again. That's very frustrating. -- Stephane
Re: initialisation bash variables
2011-08-15, 17:15(+02), Francky Leyn:
> Hello,
>
> if you have a variable, say VAR,
> and you don't assign it a value,
> and afterwards you test it,
> what is the value of $VAR then?
>
> random, or an empty string?
[...]

Upon startup, the shell makes one shell variable per environment
variable whose name is compatible with shell variable names. So
for instance, if bash receives "VAR=foo" in its environment,
$VAR will expand to foo. If it's passed "1=bar", $1 will not be
affected, and it's the same for a few special variables of the
shell. If passed "A+B=C" or "=D" for instance, those obviously
won't be mapped to shell variables.

Some shells do discard variables from the environment that can't
be mapped to shell variables. That's not the case of bash.

--
Stephane
Re: initialisation bash variables
2011-08-16, 22:24(+02), Francky Leyn:
[...]
> VAR=FALSE
> # some command line processing, that can set VAR to "TRUE"
> if [ $VAR = TRUE ]; then
>   ...
> fi
>
> Must I effectively write that VAR=FALSE?
> Or will the script work fine without?

Yes, you must write it, because bash may inherit a VAR variable
from the environment like I said (especially when you consider
that all-uppercase variables are by convention reserved for
environment variables).

> Also, can't I write the test as
>
> if [ $VAR ]; then
>   ...
> fi
[...]

No. That syntax is wrong. Valid syntaxes are:

if [ "$VAR" != "" ]
if [ -n "$VAR" ]
if [ "$VAR" ]

Or if you want to be extremely portable:

if [ "" != "$VAR" ]

or

if [ "x$VAR" != x ]

Personally, I prefer:

var=false
if ... var=true ...
if "$var"; then
  ...
fi

--
Stephane
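The "command as boolean" idiom suggested at the end of that post can be
sketched as below; the function and option names are illustrative, not
from the thread:

```shell
# Sketch: store the command names "true"/"false" in the variable and
# run it directly in the if, instead of string-comparing "TRUE"/"FALSE".
flag_demo() {
  var=false
  # some command line processing that can set var to true:
  [ "${1-}" = -x ] && var=true
  if "$var"; then
    echo enabled
  else
    echo disabled
  fi
}
```

This sidesteps the unset/empty-variable problem entirely: $var always
holds one of the two commands, so no quoting accident can change the
test's meaning.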
Re: initialisation bash variables
2011-08-17, 08:32(+02), Francky Leyn:
> On 8/16/2011 10:53 PM, Stephane CHAZELAS wrote:
>> 2011-08-16, 22:24(+02), Francky Leyn:
>> [...]
>>> VAR=FALSE
>>> # some command line processing, that can set VAR to "TRUE"
>>> if [ $VAR = TRUE ]; then
>>>   ...
>>> fi
>>>
>>> Must I effectively write that VAR=FALSE?
>>> Or will the script work fine without?
>>
>> Yes, you must write it, because bash may inherit a VAR variable
>> from the environment like I said (especially when you consider
>> that all uppercase variables are by convention reserved for
>> environment variables).
>
> 1) So it's a bad idea to use uppercase variables in a script?

Yes, unless you want to export them to the environment of
commands you start in that script.

> 2) If VAR coincides with an environment variable, and in the
> script I change its value, is this then propagated to outside
> the script? Is the environment variable affected?

The environment is a list of strings (by convention of the
format var=value) passed upon executing a command, in a fashion
exactly similar to the list of arguments to the command.

In other words, when you execute a command:

cmd arg1 arg2

you pass both a list of arguments ("cmd", "arg1", and "arg2")
and a list of environment variables ("PATH=/bin...",
"VAR=foo"...). The difference is that the list of arguments is
explicit on the shell command line, while the list of
environment variables comes from the remembered list of
environment variables that the shell (or any application that
uses the C library and *environ, putenv(3), setenv(3)) maintains
internally (and initialised from the environment it received
when it was executed). The shell does, though, allow the syntax:

VAR=value cmd arg1 arg2

to specify environment variables on the command line.

In no circumstance are variable definitions in one process going
to affect the environment of other processes (an exception to
that is the "fish" shell).

--
Stephane
Re: initialisation bash variables
2011-08-18, 04:10(+02), Patrick:
> On 17.08.2011 20:53, Stephane CHAZELAS wrote:
>> 2011-08-17, 08:32(+02), Francky Leyn:
>>> On 8/16/2011 10:53 PM, Stephane CHAZELAS wrote:
>>> 2) If VAR coincides with an environment variable, and in the
>>> script I change its value, is this then propagated to outside
>>> the script? Is the environment variable affected?
>>
>> The environment is a list of strings (by convention of the format
>> [...]
>>
>> In no circumstance are variable definitions in one process going
>> to affect the environment of other processes (an exception to
>> that is the "fish" shell)
>
> Could it be that the two of you are not talking about the same thing?
>
> Just for clarity: environment variables (henceforth "the environment")
> of a process are (is) inherited by its children.

Everything is inherited by children. However, upon executing a
command using the execve(2) system call, all the memory of the
process is reinitialised. What the new command gets passed along
the execve(2) system call is a list of arguments (argv) and a
list of environment variables (envp).

And as I said, by convention (and helped in that way by the C
library function wrappers around the execve(2) system call
(execv, execl, system...) that take care of propagating the
environment), that envp is built by the application from the
envp it received when it was executed (in that process or its
parents).

So yes, generally, the environment is inherited by commands
executed in children processes, but also by commands executed by
the current process.

> Therefore, what *does* happen, is that if Stephane, as in 2), changes
> VAR in script, the change gets propagated to the scripts *child* processes.

I think it brings confusion to speak of processes here.
Everything is propagated upon a fork() (the system call that
creates a child process); a fork creates an exact copy of the
current process. The environment is something that concerns
command execution.
As a side note though, that behavior didn't occur in the Bourne
shell. In the Bourne shell, you had to explicitly export a
variable (even if the shell itself received it in its
environment) for it to be exported to the commands executed by
the shell.

$ VAR=foo sh -c 'VAR=bar; env' | grep VAR
VAR=foo
$ VAR=foo sh -c 'VAR=bar; export VAR; env' | grep VAR
VAR=bar

[...]
> But what does of course not happen, is that the change would get
> propagated to the *parent* process. Or any other process.

Environment changes are propagated to children just like the
rest of the memory, and generally to commands executed by the
current process or any of those children.

> (What is the "fish" shell ???)

The friendly interactive shell.
http://en.wikipedia.org/wiki/Friendly_interactive_shell

And see http://fishshell.com/user_doc/index.html#variables for
the documentation on the scope of its variables.

--
Stephane
Re: Syntax Question...
2011-08-14, 02:43(+00), Michael Witten:
[...]
>> Please read BashFAQ/006: http://mywiki.wooledge.org/BashFAQ/006
>
> "no force in the universe can put NUL bytes into shell strings usefully"
>
> Ain't that the goddamn Truth!

No, zsh supports NUL bytes in its strings happily. It's even in
the default $IFS. Where NUL bytes can't go is in arguments to
commands, environment variables, filenames... But I can't see
why a shell variable couldn't contain NUL bytes; it's even a
good thing for those very reasons, as you can use that character
to safely separate filenames, arguments, env vars... See for
instance the -0 option of many GNU utilities.

--
Stephane
Re: initialisation bash variables
2011-08-18, 12:44(+02), Patrick:
[...]
>> $ VAR=foo sh -c 'VAR=bar; env' | grep VAR
>> VAR=foo
>> $ VAR=foo sh -c 'VAR=bar; export VAR; env' | grep VAR
>> VAR=bar
> Interesting! I do not have the Bourne shell installed. Bash tries to
> mimic it when called as "sh", but it does not produce the "correct"
> result for your first example. Not that I would mind about that though.
> Busybox btw. also yields "VAR=bar". Phew... Dash as well. There is no
> "sh" package in the ubuntu repos. Google also has no quick answer. What
> kind of "sh" are you actually using if I may ask ?

Yes, that was the behavior of the Bourne shell. It was changed
by the Korn shell, every other Bourne-like shell followed, and
the new behavior is now specified by POSIX.

The Bourne shell is a shell written by Steve Bourne in the late
seventies and is the ancestor of all of today's "Bourne-like
shells" (ash, dash, ksh88, ksh93, pdksh, posh, mksh, bash,
zsh...). There have been many variants of the Bourne shell with
modifications added by the various Unix vendors. It's still
found for backward compatibility in some commercial Unices.

Nowadays, "sh" refers to one implementation or another of a
shell that is able to interpret a POSIX script as specified
(just to avoid saying a "POSIX shell", which would be too much
of a shortcut). The Bourne shell is not one of them (for the
reason above and many others).

The code of the Bourne shell was released as open source
/recently/ as part of OpenSolaris, so you can now find ports of
it to Linux (see heirloom-sh for instance). You can also run the
Bourne shell from Unix V7 (the OS where it was first released in
1979) in a PDP11 emulator, and you'll notice a few differences
between the two.

See http://www.in-ulm.de/~mascheck/bourne/ for a reference on
the Bourne shell.

--
Stephane
Re: Syntax Question...
2011-08-17, 08:24(-04), Greg Wooledge:
> On Tue, Aug 16, 2011 at 03:41:19PM -0700, Linda Walsh wrote:
>> Ken Irving wrote:
>> > Maybe this?
>> > today_snaps=( ${snap_prefix} )
>
>> but as you mention, that will put them into an array -- sorry,
>> "imprecise terminology"; a list for me is some number of objects in a
>> string separated by some separator.
>
> This is an extremely bad idea. Legacy Bourne shell code from the
> 1980s kind of bad -- from the horrible days before we *had* arrays
> in shells. How are you going to handle filenames with spaces in them?
> With newlines in them? With commas in them? With colons in them? Tabs?
> DEL characters? Those are all valid in filenames. Any delimiter you
> can *put* in a shell string is also a valid character in a filename (or
> at least in a pathname, which eliminates the possibility of using slash).

In this code:

today_snaps=( ${snap_prefix} )

with the default value of IFS in bash and globbing not disabled,
the problematic characters are SPC, TAB, NL, *, ?, [ and
potentially more if you have extended globbing enabled.

If $snap_prefix is meant to be space delimited, then you can
make it a bit safer by doing:

IFS=" "
set -f
today_snaps=( $snap_prefix )

NL is a good delimiter because it's rare in filenames (but
newlines are allowed in them, so if the data is foreign and
security is a concern, it's not an option), and you can also
pass the list to line-based (text) utilities:

var='a1
a2
b2'
IFS='
'
set -f
set -- $var

Or:

a_vars=$(printf '%s\n' "$var" | grep '^a')

--
Stephane
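The controlled-splitting pattern described in that post can be packaged
in a subshell so that the IFS change and the -f flag don't leak into the
caller; the function name below is illustrative, not from the thread:

```shell
# Sketch: split a space-delimited string safely by fixing IFS and
# disabling filename generation before the unquoted expansion.
split_fields() (
  IFS=' '    # split on spaces only (not the default SPC/TAB/NL)
  set -f     # no globbing on the unquoted $1 below
  set -- $1
  printf '%s\n' "$#"
)
```

Because of `set -f`, a field like `*` stays a literal asterisk instead
of expanding to the files in the current directory.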
[OT] Re: accents
2011-08-25, 12:19(-07), Linda Walsh:
[...]
> Greg Wooledge wrote:
>> On Wed, Aug 24, 2011 at 06:51:32PM -0700, Linda Walsh wrote:
>>
>>> BTW, Thomas -- what is the Character that comes after 'De' in your
>>> name? I read it as hex '0xc282c2' which doesn't seem to be valid unicode.
>>
>> RFC 2822 (section 2.2) says that Header Fields in an email must be
>> composed of US-ASCII characters, so there's no telling what sort of
>> problems the multi-byte character in his From: header may be triggering
>> as it passes through various mail transfer agents.
>
> Well, on one level, I would agree,
> But on another, RFC 2822 is obviously messed up, since domain names can
> contain UTF-8 characters..
>
> So...um...how does that work?
[...]

See RFC 5335.

--
Stephane
Re: Using TMOUT as default for read bultin is unwise
2011-09-14, 09:46(+01), Wheatley, Martin R:
[...]
> Description:
> The contents of the environment variable TMOUT are used as the
> default timeout for the read builtin command when reading from
> a TTY or pipe AND as a shell command input idle time.
>
> This can lead to random failures of shell scripts
[...]
> I think TMOUT should not be overloaded, and its use as a default
> value for the read builtin - especially for sub-shell pipelines - is
> dangerous and should be discontinued; otherwise all bash scripts that
> use the read builtin need to be modified to include TMOUT=0.

That's not the only problematic variable. See also
http://groups.google.com/group/comp.unix.shell/browse_thread/thread/cf7d5147dd829cf9/ef5b5b49a676b99d#ef5b5b49a676b99d

And here is what Geoff Clare from the Austin Group (the body
behind POSIX) had to say when I raised it some time ago:
http://groups.google.com/group/comp.unix.shell/browse_thread/thread/60c3e67919c36d0a/25ab970d275ecdb7#25ab970d275ecdb7

In short: if one sets a TMOUT environment variable, the easy fix
is to tell them: DON'T!

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-16, 17:17(-07), William Park:
> 145557834293068928043467566190278008218249525830565939618481
> is awfully big number! :-)

3**2**62 is 3**(2**62), that is 3**4611686018427387904, not
(3**2)**62. That's not a number you can represent with 64 bits,
nor with any reasonable number of bits, and certainly not a
number that bash arithmetic expansion can handle, not even in
floating mode.

With zsh:

$ echo $((exp((2**62)*log(3))))
inf.
$ echo 'e((2^62)*l(3))' | bc -l
Runtime warning (func=e, adr=123): scale too large, set to 2147483647
Fatal error: Out of memory for malloc.

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-17, 13:06(+00), Stephane CHAZELAS:
> 2011-09-16, 17:17(-07), William Park:
>> 145557834293068928043467566190278008218249525830565939618481
>> is awfully big number! :-)
>
> 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you
> can represent with 64bits, nor any reasonable number of bits,
> not (3**2)**62.
[...]

Sorry, my bad: 3**2**62 is indeed (3**2)**62 in bash and in zsh,
contrary to most other places (ksh93, bc, python, gawk, perl,
ruby...).

--
Stephane
Re: How to match regex in bash? (any character)
2011-10-1, 14:39(-08), rogerx@gmail.com:
[...]
> I took some time to examine the three regex references:
>
> 1) http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04
> Written more like a technical specification of regex. Great if you're
> going to be modifying the regex code. Difficult to follow if you're new,
> looking for info.

One thing to bear in mind is that bash calls a system library to
perform the regexp matching (except that [*]), so it can't
really document how it's gonna work, because it just can't know;
it may differ from system to system. The only thing that is more
or less guaranteed is that all those various implementations
should comply with that specification.

Above is the specification of the POSIX extended regular
expression, so a bash script writer should refer to that
document if he wants to write a script for all the systems where
bash might be used.

> 2) regex(7)
> Although it looks good, upon further examination, I start to see run-on
> sentences. It's more like a reference, which is what a man file should
> be.
> At the bottom, "AUTHOR - This page was taken from Henry Spencer's regex
> package"

On the few systems where that man page is available, it may or
may not document the extended regular expressions that are used
when calling the regex(3) API (on my system, it doesn't). Those
regular expressions may or may not have extensions over the
POSIX ones, and that document may or may not point out which
ones are extensions and which ones are not, so a script writer
may be able to refer to that document if he wants his script to
work on that particular system (except that [*]).

> 3) grep(1)
> Section "REGULAR EXPRESSIONS". At about half the size of regex(7), the
> section clearly explains regex and seems to be easily understandable for a
> person new to regex.

That's another utility that may or may not use the same API, in
the same way as bash or not.
You get no warranty whatsoever that the regexps covered there
will be the same as bash's.

[*] Actually, bash does some (undocumented) preprocessing on the
regexps, so even the regex(3) reference is misleading here. For
instance, on my system the regex(3) extended REs support \1 for
backreference and \b for word boundary, but when calling
[[ aa =~ (.)\1 ]], bash changes it to [[ aa =~ (.)1 ]] (note
that (.)\1 is not a portable regex as the behavior is
unspecified), so bash won't behave as regex(3) documents on my
system.

Also (and that could be considered a bug), "[\a]" is meant to
match either "\" or "a", but in bash, because of that
preprocessing, it doesn't:

$ bash -c '[[ "\\" =~ [\a] ]]' || echo no
no
$ bash -c '[[ "\\" =~ [\^] ]]' && echo yes
yes

Once that bug is fixed, bash should probably refer to POSIX EREs
(since its preprocessing would disable any extension introduced
by system libraries) rather than regex(3), as that would be more
accurate.

The situation with zsh:

- it uses the same API as bash (unless the RE_MATCH_PCRE option
  is set, in which case it uses PCRE regexps)
- it doesn't do the same preprocessing as bash because...
- it doesn't implement that confusing business inherited from
  ksh whereby quoted RE characters are taken literally.

So, in zsh:

- [[ aa =~ '(.)\1' ]] works as documented in regex(3) on my
  system (but may work differently on other systems as the
  behavior is unspecified as per POSIX).
- [[ '\' =~ '[\a]' ]] works as POSIX specifies.
- after "setopt RE_MATCH_PCRE", one gets a more portable
  behavior as there is only one PCRE library (though different
  versions).

The situation with ksh93:

- Not POSIX either, but a bit more consistent:

  $ ksh -c '[[ "\\" =~ [\a] ]]' || echo no
  no
  $ ksh -c '[[ "\\" =~ [\^] ]]' || echo no
  no

- it implements its own regexps with its own many extensions,
  which therefore can be and are documented in its man page but
  are not common to any other regex implementation (though they
  are mostly a superset of the POSIX EREs).

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-19, 09:27(-04), Chet Ramey:
> On 9/16/11 4:39 PM, Nicolas ARGYROU wrote:
>
>> Bash Version: 4.0
>> Patch Level: 33
>> Release Status: release
>>
>> Description:
>> The algorithm used to calculate x to the power of y: x**y
>> takes O(y) time which is way too long on systems using 64 bits.
>> Calculating for example $((3**2**62)) freezes the shell at
>> argument parsing time.
>>
>> Repeat-By:
>> bash -c 'echo $((3**2**62))'
>>
>> Fix:
>> This fix uses an algorithm that takes O(log(y)) time, which is way
>> faster. But it is still about 30 times slower with random numbers
>> than a single multiplication, on 64 bits systems. The fix is written
>> as a C++ template working on any unsigned integer type, and doesn't
>> need any external resource:
>
> Thanks for the report. This looks like an independent reimplementation of
> the "exponentiation by squaring" method. I did a little looking around,
> and it's the best algorithm out there. I used a slightly different but
> equivalent implementation.
[...]

FYI, ksh93 uses pow(3). So does zsh, but only in floating point
mode. Probably better and more efficient than reinventing the
wheel.

--
Stephane
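For reference, the "exponentiation by squaring" method mentioned by Chet
can be sketched in shell arithmetic itself (an illustration of the
O(log y) idea only; bash's actual fix is in C, and integer overflow is
ignored here):

```shell
# Sketch: compute base**exp in O(log exp) multiplications by squaring
# the base and halving the exponent at each step.
ipow() {
  local base=$1 exp=$2 result=1
  while [ "$exp" -gt 0 ]; do
    if [ $((exp % 2)) -eq 1 ]; then
      result=$((result * base))   # fold in the current power of base
    fi
    base=$((base * base))         # square: base^(2^k) for the next bit
    exp=$((exp / 2))
  done
  printf '%s\n' "$result"
}
```

Each loop iteration consumes one bit of the exponent, which is where the
logarithmic bound comes from.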
Re: List of background processes in a command group, in a pipeline, executed sequentially under certain conditions.
2011-10-01, 06:54(-05), Dan Douglas:
[...]
> f() {
>     local -i x y
>     while read -rN1 "x[y++]"; do
>         printf '%d ' "${1}" >&2    # keep track of which job this is.
>     done
>     printf "${#x[@]} "    # Print the total number of reads by each job.

if you add an echo >&2 "[done $1]" here.

> }
>
> g() { # Used in ex 6
>     f 1 <${1} &
>     f 2 <${1}
> }
>
> # This works as I expect, f is backgrounded and two readers of one pipe
> # each get about half the input:
> exincr # 1
>
> read -ra x < <({ f 1 & f 2; } < <(zeros))
> printf '%b\n' "\n${x[@]}\n"
>
> # Equivalent to above, except with piped output. Now f is not
> # backgrounded. One reader consumes all the input:
> exincr # 2
>
> { f 1 & f 2; } < <(zeros) | {

You'll notice that f 1 terminates straight away. And if you do
an strace, you'll notice that bash does a
dup2(open("/dev/null"), 0), that is, it redirects "f 1"'s stdin
to /dev/null.

~$ bash -c '{ cat; } < c | cat'
test
~$ bash -c '{ cat & } < c | cat'
~$ bash -c '{ lsof -ac lsof -d0; } < c | cat'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5005 chazelas 0r REG 253,25 58785638 /home/chazelas/c
~$ bash -c '{ lsof -ac lsof -d0 & } < c | cat'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5010 chazelas 0r CHR 1,3 0t0 973 /dev/null

That behavior is required by POSIX and occurs in ash and pdksh
and its derivatives as well:

POSIX> command1 & [command2 & ... ]
POSIX>
POSIX> The standard input for an asynchronous list, before any
POSIX> explicit redirections are performed, shall be considered to
POSIX> be assigned to a file that has the same properties as
POSIX> /dev/null. If it is an interactive shell, this need not
POSIX> happen. In all cases, explicit redirection of standard input
POSIX> shall override this activity.

However, I don't know why bash does it only in the "pipe" case.
~$ ash -c '{ lsof -ac lsof -d0 & } < c'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5188 chazelas 0r CHR 1,3 0t0 973 /dev/null
~$ bash -c '{ lsof -ac lsof -d0 & } < c'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5191 chazelas 0r REG 253,25 58785638 /home/chazelas/c

To work around this, the <&0 trick seems to work:

~$ bash -c '{ lsof -ac lsof -d0 <&0 & } < c | cat'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5247 chazelas 0r REG 253,25 58785638 /home/chazelas/c

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-17, 13:39(+00), Stephane CHAZELAS:
> 2011-09-17, 13:06(+00), Stephane CHAZELAS:
>> 2011-09-16, 17:17(-07), William Park:
>>> 145557834293068928043467566190278008218249525830565939618481
>>> is awfully big number! :-)
>>
>> 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you
>> can represent with 64bits, nor any reasonable number of bits,
>> not (3**2)**62.
> [...]
>
> Sorry, my bad,
>
> 3**2**62 is indeed (3**2)**62 in bash and in zsh contrary to
> most other places (ksh93, bc, python, gawk, perl, ruby...).

Sorry again, I was right in the first place: 3**2**62 is
3**(2**62) in bash and zsh, like in other shells. I think I need
more sleep...

--
Stephane
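The associativity question settled in that exchange is easy to check
with small exponents:

```shell
# ** in bash arithmetic is right-associative, so 2**3**2 parses as
# 2**(3**2) = 2**9 = 512, not (2**3)**2 = 64.
echo "$((2**3**2))"   # 512
```

That right associativity is why 3**2**62 means 3**(2**62), the
astronomically large value discussed in the thread, rather than the
representable (3**2)**62.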
Re: How to match regex in bash? (any character)
2011-10-02, 21:51(-04), Chet Ramey:
> On 10/2/11 3:43 PM, Stephane CHAZELAS wrote:
>
>> [*] actually, bash does some (undocumented) preprocessing on the
>> regexps, so even the regex(3) reference is misleading here.
>
> Not really. The words are documented to undergo quote removal, so
> they undergo quote removal. That turns \1 into 1, for instance.
[...]

The problem and confusion here come from the fact that "\" is
overloaded and used by two different pieces of software (bash and
the system regex). It is used:

- by bash for quoting
- by regex(3) to escape regexp characters in some circumstances
  (for instance when not inside [...], but it may vary per
  implementation (think of the (?{...} type extensions))
- by some regex(3) implementations to introduce new regexp
  operators (\w, \b, \<...)

BTW, another bug:

$ bash -c '[[ "\\" =~ ["."] ]]' && echo yes
yes

And what one could consider a bug:

~$ bash -c 'chars="a]"; [[ "a" =~ ["$chars"] ]]' && echo yes
~$ bash -c 'chars="a]"; [[ "a]" =~ ["$chars"] ]]' && echo yes
yes

I was wrong in saying that the bash documentation should refer to
POSIX regexps because it disables extensions. It only disables
extensions introduced by "\", not the ones introduced by
sequences that would otherwise be invalid in POSIX EREs like
"(?", {{, **... It should still refer to POSIX regexps, though,
as those are the only ones guaranteed to work: any extension
provided by the system's regex(3) API may not work with bash.

--
Stephane
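The basic quoting effect both posts rely on can be demonstrated
directly (a minimal sketch; the echoed labels are illustrative):

```shell
# In bash's [[ =~ ]], quoting a regexp character makes bash escape it
# before handing the word to the system regex library, so it matches
# literally instead of acting as an operator.
bash -c '[[ axb =~ a.b ]]'   && echo "unquoted dot: RE operator"
bash -c '[[ axb =~ a"."b ]]' || echo "quoted dot: literal character"
```

It is the interaction of that escaping with bracket expressions (where
"\" has no escaping role in POSIX EREs) that produces the bugs shown
above.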
Re: How to match regex in bash? (any character)
2011-10-03, 13:48(+02), Andreas Schwab:
> Stephane CHAZELAS writes:
>
>> The problem and confusion here comes from the fact that "\" is
>> overloaded and used by two different pieces of software (bash
>> and the system regex).
>
> That's nothing new. The backslash is widely used as a quote character
> in several languages, which requires two levels of quoting if one of
> these languages is embedded in another one.
[...]

Yes, but in this case, bash (contrary to zsh) doesn't do two
levels of quoting. Bash quoting is meant to escape the RE
operators, and that's where the problem comes from: for it to
work fully, bash would need to implement the full RE parsing to
know where to put backslashes when characters are quoted.

Bash turns:

  "."        to \.         before calling the regex(3) API
  '[.]'      to \[\.\]     (fine)
  ['.']      to [\.]       (not fine)
  ['a]']     to [a\]]      (not fine)
  (.)\1      to (.)1       (fine or not fine depending on how you
                           want to look at it)
  (?i:test}  to (?i:test)  (assuming regex(3) is implemented with
                           PCREs: fine or not fine depending on
                           how you want to look at it)

In zsh, it's simpler, as quoting just quotes shell characters;
it doesn't try to escape regexp operators.

--
Stephane
Re: Syntax Question...
2011-10-3, 08:36(-04), Greg Wooledge:
> On Thu, Aug 18, 2011 at 11:10:17AM +0100, Stephane CHAZELAS wrote:
>>
>> Please read BashFAQ/006: http://mywiki.wooledge.org/BashFAQ/006
>>
> "no force in the universe can put NUL bytes into shell strings usefully"
>
>> No, zsh supports NUL bytes in its strings happily. It's even in
>> the default $IFS. Where NUL bytes can't go is in arguments
>> to commands, environment variables, filenames... But I can't
>> see why a shell variable couldn't contain NUL bytes; it's even a
>> good thing for both those reasons as you can use that character
>> to safely separate filenames, arguments, env vars... See for
>> instance the -0 option of many GNU utilities.
>
> zsh doesn't count, because it's not compatible with any other shell.

If zsh is not compatible with any other shell, then bash isn't
either. What do you mean?

> Since this is a bash mailing list (or newsgroup), we're concerned with
> the behavior of bash, and bash can't store NUL bytes in strings. The
> reason for this is simple enough -- bash uses native C strings for
> its variables.
[...]

I was replying to the "no force in the universe can put NUL
bytes into shell strings usefully" statement, pointing out that
at least one shell did (and is still POSIX compliant in doing
so), and that it was very useful.

bash's code is not fixed in stone; it could be changed to allow
NUL bytes. It could even reuse zsh's code, since zsh is provided
under a BSD-style license.

--
Stephane
[OT] Re: Syntax Question...
2011-10-3, 09:12(-04), Greg Wooledge:
> On Mon, Oct 03, 2011 at 12:56:43PM +0000, Stephane CHAZELAS wrote:
>> If zsh is not compatible with any other shell, then bash isn't
>> either. What do you mean?
>
> http://zsh.sourceforge.net/FAQ/zshfaq02.html#l10
>
> "because of all the options it is not safe to assume a general zsh run
> by a user will behave as if sh or ksh compatible"

Not really the best place to discuss zsh but as you insist... :)

> Actually, this is a trend I wasn't previously aware of:
>
> "invoking zsh as sh or ksh (or if either is a symbolic link to zsh) sets
> appropriate options and improves compatibility (from within zsh itself,
> calling ARGV0=sh zsh will also work)"

Strange that you would say that. zsh is about the same age as bash
and has always been a few steps ahead of bash in terms of interactive
functionality. It probably dethroned tcsh as the best interactive
shell around the mid-90s (I started using it around 97). bash and
ksh93 both borrowed features from zsh. It is more compatible with ksh
than bash is (it implements many of the ksh builtins like typeset and
print, and supports floating point arithmetic and recursive globbing
(actually, ksh borrowed that from zsh)).

> "The classic difference is word splitting, discussed in question 3.1;
> this catches out very many beginning zsh users. As explained there, this
> is actually a bug in every other shell."
>
> Um, no. Word splitting is well defined, and changing how your shell
> handles it and declaring it "a bug in every other shell" is unacceptable.
[...]

The bug is the "implicit word splitting and filename generation done
upon variable expansion in list contexts".
You may not want to call it a bug, but I don't think anyone here
would refuse to at least call it a "very unfortunate misfeature
responsible for 80% of shell script bugs, 90% of exploitable shell
script vulnerabilities, and a lot of confusion among beginners".

While that feature made sense in the Bourne shell, which didn't have
arrays, it's just a burden in all modern shells. That is why several
times a week you'll see comments on comp.unix.shell such as: "don't
leave a variable unquoted unless you have a very good reason to (and
know what you're doing (and have taken appropriate action to limit
its side-effects))".

I'm not saying zsh is perfect, nor would I recommend it to interpret
POSIX scripts. There are misfeatures of Bourne/ksh it didn't fix,
like the removal of arguments that expand to an empty string, or
command substitution removing *every* trailing newline character, and
some would argue it suffers a bit from feature creep.

I don't write zsh scripts, not any more than I write bash or ksh
scripts. I don't care what shell interprets the scripts I write (the
main reason why I follow gnu.bash.bug is for the discussions about
portability), though I would rather have them interpreted by the
leaner and faster ash or pdksh derivatives. But zsh's cleaner and
safer syntax, amazing completion system, and globbing are amongst the
main reasons why I wouldn't use any other shell interactively.

If you want a shell with a very clean syntax, you can have a look at
the plan9 "rc" shell (which has been ported to Unix).

-- Stephane
Re: Error in manual for >&word redirection
2011-10-12, 14:39(-06), Eric Blake:
> On 10/12/2011 02:07 PM, Greg Wooledge wrote:
>> Even using a space is not sufficient to force a valid file descriptor
>> number to be treated as a filename:
>>
>> imadev:~$ foo>& 1
>> stdout
>> stderr
>> imadev:~$ ls -l 1
>> 1 not found
>
> If you want 'word' treated as a filename, then express it as a filename.
> It's still possible to write to a file named '1':
>
> foo >&./1

Or

foo >&! 1
or
foo &> 1
or
foo > 1 2>&1

-- Stephane
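The contrast between the two forms is easy to demonstrate; a small sketch (the `msg` function and the temporary directory are illustrative, standing in for `foo`):

```shell
#!/bin/bash
# With ">& 1" the "1" is still taken as file descriptor 1, even with
# the space. With "> 1 2>&1" the "1" after ">" is a plain filename.
cd "$(mktemp -d)" || exit

msg() { echo out; echo err >&2; }

msg > 1 2>&1     # writes both streams to a file literally named "1"
cat 1            # shows: out, err
```

The `>&./1` trick from the quoted message works for the same reason: `./1` cannot be parsed as a file descriptor number, so it must be a filename.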
Re: How to automatically load alias from .bashrc in a bash script?
2011-10-27, 11:17(-05), Peng Yu:
> I want to use some alias in a bash script. But I have to include the
> following line in the script in order to use the alias defined in
> ~/.bashrc. Is there a way to automatically load the alias from .bashrc
> so that I don't have to explicitly include these two lines?
>
> shopt -s expand_aliases
> . ~/.bashrc

You mean you'd like every bash script you run on your system to
automatically source your ~/.bashrc. That sounds like a very unwise
thing to do to me, but that could be done with:

export BASH_ENV="$HOME/.bashrc"

-- Stephane
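The BASH_ENV mechanism can be sketched as follows (the startup-file contents here are made up; a variable is used rather than an alias, which would additionally need `shopt -s expand_aliases` in the sourced file):

```shell
# BASH_ENV names a file that every non-interactive bash sources at
# startup, before running the script or -c string.
envfile=$(mktemp)
echo 'greeting="hello from BASH_ENV"' > "$envfile"

BASH_ENV=$envfile bash -c 'echo "$greeting"'   # prints: hello from BASH_ENV

rm -f "$envfile"
```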
Re: {varname} for redirection does not work with arrays
2011-11-2, 12:01(-06), unkn...@vmw-les.eng.vmware.com:
[...]
> Description:
> If {varname} is an assoc. array in a redirection the exec will fail.
> The [] should not be confused with pathname expansion, just like ${}.
>
> Repeat-By:
> $ declare -A array
> $ exec {array[key]}
Re: What is the best to pass an array with specially characters as command line arguments?
2011-11-6, 21:02(-06), Peng Yu:
[...]
> #!/usr/bin/env bash
>
> verbatim_string=" a b c ( a'b | "
>
> args="`echo \"$verbatim_string\" | sed -f quoteverb.sed`"
>
> cmd="echo $args"
> eval "$cmd"
>
> ~$ ./main.sh
> a b c ( a'b |

Either:

#! /bin/sh
verbatim_string=" a b c ( a'b | "
cmd='(set -f; IFS=" "; echo $verbatim_string)'
eval "$cmd"

($verbatim_string expanded and split at the time $cmd is evaluated).

Or:

#! /bin/bash
verbatim_string=" a b c ( a'b | "
set -f; IFS=' '
cmd="echo $(printf '%q ' $verbatim_string)"
eval "$cmd"

($verbatim_string expanded, split and quoted initially).

(echo being a poor choice of a command for testing as it concatenates
its arguments and possibly handles "\" characters specially under
some circumstances).

By the way, does bash have any way of introducing local scope for
options, similar to zsh's

(){
  setopt localoptions noglob shwordsplit
  local IFS=' '
  args=($string)
}

Or to set the options to defaults in a local scope to be able to
write code that works whatever the context, like zsh's:

f() {
  emulate -L zsh # ensure default zsh behavior even if caller
                 # has changed some settings
  ...
}

I suppose one can add

default_settings=$(set +o)

at the start of the script, and then

f() {
  local restore_settings
  restore_settings=$(set +o)
  eval "$default_settings"
  ...
  eval "$restore_settings"
}

Or is there any more canonical way to do it?

-- Stephane
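The second approach (split once, then requote each word with `printf %q`) can be sketched end to end; `printf '%s\n'` replaces `echo` so the resulting words show up unambiguously, one per line:

```shell
#!/bin/bash
# Split $verbatim_string on spaces once, quote each resulting word
# with printf %q, and build a command string that is safe to eval.
verbatim_string=" a b c ( a'b | "

set -f            # no pathname expansion during the unquoted expansion
IFS=' '           # split on spaces only
cmd="printf '%s\n' $(printf '%q ' $verbatim_string)"

eval "$cmd"       # one word per line: a, b, c, (, a'b, |
```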
Re: feature request: printf %(%s)T
2011-11-23, 12:00(-05), Chet Ramey:
> On 11/22/11 4:53 PM, Greg Wooledge wrote:
>> This is a feature request, rather than a bug. Bash 4.2's printf command
>> has a lovely %(datefmt)T feature that allows it to print out formatted
>> timestamps using the underlying operating system's strftime(3) routine.
>> It even allows bash to print the current time, or the time the current
>> shell was invoked.
>
> I wonder if a better way to handle this is to require the %s expansion
> at configure time and use the strftime replacement in lib/sh if the C
> library's strftime doesn't implement it. What systems, if you know, do
> not handle %s?
[...]

Or just have a special variable for that like zsh's $EPOCHSECONDS.

Note that GNU strftime has more extensions than just %s. See also
http://stchaz.free.fr/wide_strftime for a POSIX shell implementation
of strftime (limited to the GMT timezone and the POSIX locale though).

-- Stephane
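For reference, the bash 4.2+ feature under discussion looks like this (`-1` means "now", `-2` means "when the shell started"); whether `%s` works depends on the system strftime(3), as Chet notes:

```shell
# printf %(datefmt)T formats a time with strftime(3) (bash >= 4.2).
printf 'today is %(%Y-%m-%d)T\n' -1

# %s (seconds since the epoch) is a common extension, not guaranteed
# everywhere -- the point of this subthread:
printf 'epoch: %(%s)T\n' -1
```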
Re: bash silently ignores additional arguments for cd
2011-12-07, 12:02(-07), Eric Blake:
[...]
> Or even nicer, behave like zsh, and treat
> cd xxx yyy
> like this (written assuming extglob syntax)
> cd xxx+($' \t')yyy
[...]

Not sure what you're referring to, but

cd xxx yyy

in zsh means

cd ${PWD/xxx/yyy}

-- Stephane
Re: bash silently ignores additional arguments for cd
2011-12-8, 10:13(+00), Stephane CHAZELAS:
> 2011-12-07, 12:02(-07), Eric Blake:
> [...]
>> Or even nicer, behave like zsh, and treat
>> cd xxx yyy
>> like this (written assuming extglob syntax)
>> cd xxx+($' \t')yyy
> [...]
>
> Not sure what you're referring to, but
>
> cd xxx yyy
>
> in zsh means
>
> cd ${PWD/xxx/yyy}

And same in AT&T ksh, pdksh and their derivatives (except posh).

-- Stephane
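Bash itself has no two-argument `cd`, but the zsh/ksh behaviour can be spelled out with the substitution written explicitly (the directory names below are made up for illustration):

```shell
# Emulate zsh/ksh "cd old new" in bash: substitute old->new in $PWD.
mkdir -p /tmp/proj_v1/src /tmp/proj_v2/src   # illustrative paths

cd /tmp/proj_v1/src
cd "${PWD/v1/v2}"     # what "cd v1 v2" does in zsh and AT&T ksh
pwd                   # /tmp/proj_v2/src
```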
Re: extglob pattern: @(/root) vs. @(root)
2011-12-9, 16:16(+08), Clark J. Wang:
> See following:
>
> # shopt extglob
> extglob        on
> # echo $BASH_VERSION
> 4.2.20(1)-release
> # ls -d /root
> /root
> # pwd
> /
> # echo @(root)
> root
> # echo @(/root)
> @(/root)    <-- ???
> # echo @(/root*)
> @(/root*)   <-- ???
> #
>
> I'm confused why @(/root) and @(/root*) do not work here.

Globbing operators (*, ?, [/], @(..)) don't match "/". "/" has to be
inserted literally. See the doc:

    When a pattern is used for filename expansion, the character `.'
    at the start of a filename or immediately following a slash must
    be matched explicitly, unless the shell option `dotglob' is set.
    When matching a file name, the slash character must always be
    matched explicitly. In other cases, the `.' character is not
    treated specially.

-- Stephane
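The rule is easy to reproduce in a scratch directory (the temporary directory is illustrative): the slash must sit outside any glob group.

```shell
# "/" can never be matched by a glob operator; it has to appear
# literally in the pattern, outside @(...) groups.
cd "$(mktemp -d)" || exit
mkdir root

bash -O extglob -c 'echo @(./root)'   # no match: prints @(./root)
bash -O extglob -c 'echo ./@(root)'   # matches:  prints ./root
```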
help-bash mailing list (Was: lseek with bash)
2011-12-9, 10:27(-05), Greg Wooledge:
[...]
> This probably belongs on help-bash rather than bug-bash, although it's a
> grey area. I'm Cc-ing both.
[...]

First time I hear about a "help-bash" mailing list. Is that new?

I saw no announcement in bash bug (I could have missed it though),
nor does it seem to be referenced on the bash homepage at gnu.org.
And there doesn't seem to be a corresponding usenet newsgroup like
for the other gnu.org mailing lists. And it doesn't seem it's been
added to gmane (yet?).

Was it announced anywhere?

-- Stephane
Re: return values of bash scripts
2011-12-20, 14:18(-08), kc123: [...] > Could someone please help me figure out why: > GNU bash, version 3.2.25(1)-release-(x86_64-redhat-linux-gnu) [...] Since you're using RedHat Linux, try ps -o comm= -C crond -- Stephane
Re: let's set the positional parameters with a simple 'read'
2012-01-04, 01:46(+08), jida...@jidanni.org:
>> "GW" == Greg Wooledge writes:
> GW> Why not just use a named array?
> GW> $ read -a myarray
> But does that let me get at my favorite array, the positional parameters?

FWIW, in zsh:

~$ read -A argv
a b c
~$ echo $1
a
~$ read 1
x
~$ echo $1
x

See also "vared argv" and "vared 1" to edit them with the zsh line
editor (zle; the bash equivalent would be readline as invoked by
read -e).

-- Stephane
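In bash, `read` can't target the positional parameters directly the way zsh's `read -A argv` does, but the quoted suggestion (`read -a`) combined with `set --` gets the same result; a small sketch:

```shell
# Read a line into a named array, then copy that array into the
# positional parameters with "set --".
read -r -a words <<< 'a b c'
set -- "${words[@]}"

echo "$1"    # a
echo "$#"    # 3
```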
Re: I think I may have found a possible dos attack vector within bash.
2012-03-20 16:47:51 +, Eamonn Smyth:
> Without sounding alarmist, I can break my machine using bash. I also have a
> fix. I shall be officially releasing the c code this weekend at the
> hackathon london.
[...]

A DOS vector often found in bash *scripts* is when a script takes
user input in a variable and that variable is left unquoted upon
expansion in list contexts. For instance this CGI script which looks
harmless at first glance:

#! /bin/bash -
q=$QUERYSTRING
printf '%s\r\n' "Content-type: text/plain" ""
if echo $q | grep ...; then
  echo "Invalid input"
fi

Try with values of $QUERYSTRING like

/*/*/*/../../../*/*/*/../../../*/*/*/../../../*/*/*

(beware it might cause your computer to crash)

-- Stephane
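The fix is simply to quote the expansion, which suppresses both word splitting and pathname expansion; a minimal sketch (the pattern is a harmless stand-in for the hostile query string above):

```shell
# Attacker-controlled input (illustrative):
q='/*/*/*'

# Unquoted, $q would undergo pathname expansion and could generate an
# enormous argument list. Quoted, it is passed as one literal string:
printf '%s\n' "$q"    # prints: /*/*/*
```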
Re: bash does filename expansion when assigning to array member in compound form
2012-08-18 10:26:22 -0500, Dan Douglas:
> This is a feature that all shells with this style of compound assignment
> have in common. If no explicit subscripts are given, the text between the
> parentheses is processed exactly as though it were arguments to a command
> including brace expansion, word-splitting, and pathname expansion (and
> consequently, quoting is just as important). This is an important feature
> because it allows storing the results of a glob in an array easily.
>
> If a subscript is given explicitly, then the right-hand side of the
> assignment is treated exactly as an ordinary scalar assignment would be,
> including all analogous behaviors for `+=' and the integer attribute.
>
> $ set -x; a=( [1]=* )
> + a=([1]=*)
[...]

Nope:

~/1$ touch '[1]=x'
~/1$ bash -c 'a=( [1]=* ); echo "${a[@]}"'
[1]=x
~/1$ bash -c 'a=( [1]=asd ); echo "${a[@]}"'
asd

That's a bug though. Just do a=("*") or a=('*') or a=(\*)

-- Stephane
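The safe forms suggested at the end all behave identically; a quick sketch of quoted versus unquoted elements (the scratch directory and file name are illustrative):

```shell
cd "$(mktemp -d)" || exit
touch afile            # a file that an unquoted "*" would match

a=("*")                # quoted: the element is a literal asterisk
b=(*)                  # unquoted: the glob expands to file names

echo "${a[0]}"         # *
echo "${b[0]}"         # afile
```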
Re: Bash 4.1 doesn't behave as I think it should: arrays and the environment
2012-08-17 17:18:12 -0400, Greg Wooledge:
> On Fri, Aug 17, 2012 at 03:19:56PM +0800, John Summerfield wrote:
>> In two cases I wish to pass an array in the environment, like so:
>> 14:28 john@Boomer$ STUFF[1]=one STUFFX=stuffx env | grep ^ST
>> STUFFX=stuffx
>> STUFF[1]=one
>
> You are creating an environment variable called "STUFF[1]". This is
> an invalid environment variable name, but env(1) receives it and
> dutifully prints it out. However, Bash recognizes that it is an
> invalid name, and does not turn it into a usable shell variable.
[...]

Well, if it were an invalid environment variable, why would bash pass
it to env? STUFF[1] is a valid environment variable name, but not a
valid shell variable name.

> Some older versions of Bash would also strip such invalid variables
> from the environment before invoking child processes. Bash 4.2
> leaves them in the environment because of complaints about build
> systems that were using them for some purpose. I don't know whether
> Bash 4.1 is one that stripped them or preserved them. In either
> case, you should not be writing programs that rely on invalid variable
> names.
[...]

It's different here. It's the shell that *sets* the STUFF[1] env var
passed to the env command, it's not about inheriting it.
Note that no other shell would put "STUFF[1]=one" in env's environ there:

$ zsh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
STUFFX=stuffx
$ ksh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
STUFFX=stuffx
$ pdksh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
STUFFX=stuffx
$ ash -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
ash: 1: STUFF[1]=one: not found
$ bourne-sh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
bourne-sh: STUFF[1]=one: not found
~$ ksh -c 'STUFF[0]=one STUFFX=stuffx env' | grep STUFF
STUFF=one
STUFFX=stuffx
~$ pdksh -c 'STUFF[0]=one STUFFX=stuffx env' | grep STUFF
STUFF=one
STUFFX=stuffx
~$ bash -c 'STUFF[0]=one STUFFX=stuffx env' | grep STUFF
STUFF[0]=one
STUFFX=stuffx

(in ksh, $V is short for ${V[0]}).

-- Stephane
Re: bash does filename expansion when assigning to array member in compound form
2012-08-20 19:44:51 +0200, Roman Rakus:
[...]
> And how would you achieve to fill an array with all file names
> containing `[1]=' for example.
[...]

Another interesting question is how to fill the array with all the
file names that start with a digit followed by "=".

$ touch {3..5}=foo
$ ls
3=foo  4=foo  5=foo
$ bash -c 'a=([0-9]=*); typeset -p a'
bash: [0-9]=*: bad array subscript
declare -a a='()'
$ bash -c 'shopt -s extglob; a=(@([0-9])=*); typeset -p a'
bash: -c: line 0: syntax error near unexpected token `('
bash: -c: line 0: `shopt -s extglob; a=(@([0-9])=*); typeset -p a'
$ bash -c 'shopt -s extglob
a=(@([0-9])=*); typeset -p a'
declare -a a='([0]="3=foo" [1]="4=foo" [2]="5=foo")'

> Definitely it's good, if you want to be sure, to always quote all
> characters which mean pathname expansion - `*', `?' and `['.
[...]

Yes, the problem here is that "[" is overloaded in a conflicting
manner as a globbing operator and as that poorly designed special
type of array assignment. Quoting it will prevent both; it becomes
more tricky if you want only one or the other.

Note that in bash that also means we need to quote variables in
there, even if IFS is set to "".

$ bash -c 'a="*"; b=([1]=$a); typeset -p b'
declare -a b='([0]="[1]=bar")'

-- Stephane
expansion of \uXXXX in non utf8 locales
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall
uname output: Linux sc 3.2.0-4-amd64 #1 SMP Debian 3.2.32-1 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 36 (same with 39)
Release Status: release

Description:
There seems to be a bug with the expansion of \uXXXX where XXXX > 0xFF
in non-UTF8 locales.

Repeat-By:

$ LC_CTYPE=C bash -c "echo -n $'\u0100'" | od -tx1
0000000 df df df df df df df df df df df df df df df df
0000020 df df df df df df df df df 19
0000032
$ LC_CTYPE=C bash -c "echo -n $'\u0100 '" | od -tx1
malloc: ../bash/parse.y:4515: assertion botched
free: underflow detected; mh_nbytes out of range
Aborting...0000000
zsh: abort     LC_CTYPE=C bash -c "echo -n $'\u0100 '" |
zsh: done      od -tx1
moving fd affect outer scope
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall
uname output: Linux sc 3.7-trunk-amd64 #1 SMP Debian 3.7.3-1~experimental.1 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 37
Release Status: release

Moving a file descriptor for a builtin command or a compound command
leaves the fd closed afterwards:

$ bash -c ': <&1-; echo test'
bash: line 0: echo: write error: Bad file descriptor

expected output: "test".

ksh93 had a similar bug but it was fixed in 1994:

94-06-30  A bug which caused 2<&1- when applied to a shell built-in
          to leave standard input closed has been fixed.

(don't know if it's a typo above (s/input/output/); if not, it might
be a different bug)

In any case, ksh93u+ doesn't exhibit the same behavior as bash:

$ ksh -c ': <&1-; echo test'
test

The longer (standard) form works OK:

$ bash -c ': <&1 1<&-; echo test'
test

-- Stephane
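The "move" operator `<&N-` is defined as a duplicate followed by a close, which is why the report ends with the equivalent two-step form; a sketch of the working long form:

```shell
# "<&1-" should behave like "<&1 1<&-": duplicate fd 1 onto fd 0,
# then close fd 1 -- scoped to the one command only. The long form
# does not leak the closed fd to the following command:
bash -c ': <&1 1<&-; echo test'    # prints: test
```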
0xFF and read -N
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall
uname output: Linux sc 3.10-2-amd64 #1 SMP Debian 3.10.5-1 (2013-08-07) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 45
Release Status: release

When called with the -N option, "read" seems to consider the byte
0xFF as meaning EOF (well, not exactly since it doesn't set the exit
status to non-zero):

OK$ printf '\376' | LC_ALL=C bash -c 'IFS= read -rN1 a; printf %s "$a"' | od -tx1
0000000 fe
0000001
NOK$ printf '\377' | LC_ALL=C bash -c 'IFS= read -rN1 a; printf %s "$a"' | od -tx1
0000000
NOK$ printf 'xx\377xx' | LC_ALL=C bash -c 'IFS= read -rN5 a; printf %s "$a"' | od -tc
0000000   x   x
0000002

The problem doesn't occur with -n instead.

-- Stephane
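The healthy case from the report can be rechecked on any build (the 0xFE byte below is the control; on the affected bash 4.2 builds, repeating this with `\377` produced empty output):

```shell
# Pass a single 0xFE byte through "read -N1" and dump what came out.
printf '\376' |
  LC_ALL=C bash -c 'IFS= read -rN1 a; printf %s "$a"' |
  od -An -tx1        # shows: fe
```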
Re: Following symlinks in globstar
2014-01-21 10:19:10 -0500, Chet Ramey:
[...]
>> I am not so worried about the method used to "fix" globstar -- whether
>> we keep backwards compatibility or not -- I am more concerned that we
>> have at least *some* way of disabling what many people view as
>> undesirable behaviour.
>
> It may or may not be undesirable -- we have no data other than your
> original message indicating that it is. There have been no proposals,
> no code contributions, and nothing other than anecdotes to demonstrate
> that "many people" view it as a problem.
>
> I interpret this lack of discussion as an indication of priority. I do
> not consider this a priority, and have not to this point, and I have only
> looked at it briefly.
[...]

I agree with Chris that the bash behaviour here is not desirable.
There's a number of articles on usenet or unix.stackexchange.com
(many by me, sorry:
http://unix.stackexchange.com/search?q=bash+globstar+symlink) that
say that the bash behaviour is broken in that regard and that
globstar from other shells should be used instead.

That symlink-following behaviour plagued GNU grep for years until it
was fixed in 2.12 (breaking backward compatibility), where -r now
does not follow symlinks anymore (but -R still does).

The **/ feature comes from zsh (1990), was first copied by ksh93 in
2003 (with differences), by fish in 2005, bash in 2009 and tcsh in
2010. bash's behaviour in many respects is closer to ksh's than
zsh's. fish's is the only other implementation that follows symlinks;
zsh's, ksh's and tcsh's don't. zsh and tcsh both support ***/* to
follow symlinks.

Utilities that recurse (chmod -R, chown -R, find, cp -r...) do not
follow symlinks (some have options to allow it). When they do, at
least they try to detect loops; bash doesn't (fish seems to have some
sort of provision for it?).

IMO, it's a dangerous misfeature, prone to introducing security
vulnerabilities.
I believe most people using globstar are unaware of the problem, and
changing the behaviour to not follow symlinks would be a lot more
likely to fix the scripts that use globstar than to break them.

In over 15 years of using zsh, I've been using ** daily but I can
probably count on the fingers of my hands the times I've used ***
(certainly a lot fewer than the times I've been bitten by grep -r
following symlinks).

Now granted, */* does follow symlinks in every shell. But there's no
recursion there, so a few of the issues don't apply. Where bash could
improve on zsh here is to provide a way to avoid following symlinks
in both **/* and */*/*.

Cheers,
Stephane
Re: let's establish BASH_MINIMUM_TIME_BETWEEN_INTERACTIVE_COMMAND
2014-01-30 12:45:58 +0200, Pierre Gaston:
> On Thu, Jan 30, 2014 at 12:37 PM, Dan Jacobson wrote:
>
>> Thanks fellows but now bash has become very slow to the touch that way.
>
> Maybe try something like: PROMPT_COMMAND='read -t0 && sleep 10'

Or

PROMPT_COMMAND="perl -MPOSIX -e 'tcflush 0,0'"

which should stop you from entering anything other than at the prompt
(it flushes the terminal input buffer before displaying the prompt).

-- Stephane
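The `read -t0` in the first suggestion works as a poll: with a timeout of zero, read reports success only when input is already pending, without waiting. A minimal sketch (the here-string supplies pre-buffered input):

```shell
# With -t 0, read returns 0 when data is already available on stdin.
if read -t 0 <<< 'queued input'; then
  echo "input pending"
fi
```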
Re: Following symlinks in globstar
2014-02-04 09:23:21 -0500, Chet Ramey: > On 1/25/14 6:11 PM, Stephane Chazelas wrote: > > 2014-01-21 10:19:10 -0500, Chet Ramey: > > [...] > >>> I am not so worried about the method used to "fix" globstar -- whether > >>> we keep backwards compatibility or not -- I am more concerned that we > >>> have at least *some* way of disabling what many people view as > >>> undesirable behaviour. > [...] > > I agree with Chris that the bash behaviour here is not > > desirable. There's a number of articles on usenet or > > unix.stackexchange.com (many by me sorry, > > http://unix.stackexchange.com/search?q=bash+globstar+symlink) > > that say that the bash behaviour is broken in that regard and > > that globstar from other shells should be used instead. > > I put in a tentative change; we will see how it works. [...] That looks a lot better indeed. Thanks Chet. regards, Stephane
regression: extglobs expand hidden files in 4.3
With bash-4.3 as found in Debian's 4.3-2 package:

$ bash -cO extglob 'echo *(.)'
. ..
$ bash -cO extglob 'echo *(foo)*'
. .. a
$ bash -cO extglob 'echo @(|foo)*'
. .. a

It looks like the regression was introduced by 4.3, as 4.2 doesn't
exhibit the same problem.

This one's OK:

$ bash -cO extglob 'echo ?(foo)*'
a

regards,
Stephane
Re: ctrl-c does not send INT to all processes under foreground job
2014-03-19 04:07:06 +, Ryan Ruan:
[...]
> It is said that "ctrl-C sends INT to ALL processes under foreground
> job", but i found a weird phenomenon.
> I have not read the source code yet, but it does not seem to conform
> to what the specification says.
>
> Test code is like:
>
> 1  trap "echo hello world $1" 2
> 2  sleep 1
> 3  /bin/bash $0 $(( $1 + 1 ))
> 4  echo "$1 go to sleep"
> 5  sleep 1000
> 6  echo "$1 exit"
>
> When I run ./test.sh on the console, the process (/bin/bash test.sh)
> is stuck at line 2.
> Then I input Ctrl-C; the result is that the code is interrupted at
> line 2 and goes to execute line 3, thus generating a new process
> (/bin/bash test.sh 1).
> At the same time, the first process (/bin/bash test.sh) is stuck at
> line 3 waiting for process '/bin/bash test.sh 1' to finish.
> At this time, I input Ctrl-C again, and generate process '/bin/bash
> test.sh 2', and i think process '/bin/bash test.sh 1' SHOULD BE SENT
> signal INT now.
> So I kill process '/bin/bash test.sh 2' by doing 'kill -9 '.
> What amazes me is that process '/bin/bash test.sh 1' did not trap INT
> this time, because "hello world 1" is not printed out. So it seems the
> process did not receive INT before.
> How can this be? Is it a bug?
[...]

I can't reproduce with any of bashs 3.2.48, 4.0, 4.1, 4.2, 4.3 on
Linux 3.12 amd64, which all exhibit the same behavior as any other
shell I tried.

What system are you on? Have you tried other shells? Do you possibly
have a SHELLOPTS, BASHOPTS or BASH_ENV in your environment?

What should happen is that on the first ^C:

zsh
└─bash ./a
  └─sleep 1

zsh being my interactive shell has put the process group running the
script in the foreground, so all processes in that group should
receive the SIGINT. bash blocks the SIGINT to be able to handle it
*after* sleep returns.
sleep dies because of the SIGINT, then bash receives and handles the
SIGINT so we see "hello world", and it goes on to run the next
command:

zsh
└─bash ./a
  └─bash ./a 1
    └─sleep 1

Now, three processes are in that process group. ^C is sent to all of
them but the bash processes block it. Upon termination of the sleep,
"hello world 1" is printed, but the first bash doesn't print its
"hello world" because the second bash has not returned. And we now
have:

zsh
└─bash ./a
  └─bash ./a 1
    └─bash ./a 2
      └─sleep 1

If you kill "bash ./a 2", then "bash ./a 1" runs its echo "go to
sleep 1" and, after finishing, we should see the first bash handling
its second SIGINT at last and display "hello world" again before
going to sleep.

-- Stephane.
Re: ctrl-c does not send INT to all processes under foreground job
2014-03-19 09:51:06 -0400, Chet Ramey: [...] > This is another instance of the question "what does bash do with a SIGINT > it receives while waiting for a foreground process?" The answer is that it > allows the child to decide what to do and responds accordingly: if the > child exits due to being killed by the SIGINT, bash acts as if it received > the SIGINT; if the child decides to trap it, do something else and not > resend the SIGINT to itself, bash will effectively discard it. In any > case, bash doesn't do anything until the child exits. > > There is a fairly long explanation of the issue at > http://www.cons.org/cracauer/sigint.html. [...] But that's about the bash -c 'sh -c "trap : INT; sleep 10"; echo seen; sleep 10' where bash and AT&T ksh (contrary to all other shells I tried) let you see "seen" above if you press Ctrl-C during the first sleep. But in the OP's case, we've got a "trap" in the outer shell. bash -c 'trap "echo ouch" INT; sh -c "trap : INT; sleep 10"; echo seen; sleep 10' Where you see "ouch" and "seen" in all shells (regardless of whether the inner sh traps the SIGINT or not). So it looks like a different issue.
minor: "read line" may read several lines if stdin is a tty
If you run (here testing on Linux):

bash -c 'read a; echo "<$a>"; tr b c'

and type "aaa" and "bbb" on two lines ahead of time (before read is
called), you see "<aaa>", but not "ccc". That's because "read" reads
up to 128 bytes of data in one read(2) invocation instead of reading
one byte at a time like on other types of non-seekable files.

Probably not a big deal as one is unlikely to type ahead like that.
On the other hand, when input is from the terminal, there's not much
point optimising, so you might as well read one byte at a time.

See also:

bash -c 'read a; echo "1: $a"; head -n 1; read b; echo "3: $b"'

If typing "a", "b" and "c" on three lines, you see:

1: a
c
3: b

Instead of:

1: a
b
3: c

It's probably a bigger concern though if reading from a serial device
(/dev/ttyS0) in raw mode for instance.

-- Stephane
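On a pipe, bash's read does do the byte-at-a-time dance precisely so it does not steal input meant for the next command, which is easy to observe:

```shell
# On a pipe (non-seekable, not a tty), read consumes exactly one line
# byte by byte, leaving the rest of the stream for the next reader.
printf 'one\ntwo\n' | {
  read -r first
  echo "read got: $first"   # read got: one
  cat                       # still sees: two
}
```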
[doc] confusion over $0 and positional parameters
In info -f bash -n 'Invoking Bash' we find:

> `-c'
>      Read and execute commands from the first non-option ARGUMENT after
>      processing the options, then exit. Any remaining arguments are
>      assigned to the positional parameters, starting with `$0'.

$0 is generally *not* considered a positional parameter. Both POSIX
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_01)
and "info -f bash -n 'Positional Parameters'" have:

> A positional parameter is a parameter denoted by the decimal
> value represented by one or more digits, other than the single
> digit 0

explicitly excluding $0. That seems to be causing some confusion:
https://unix.stackexchange.com/questions/152391

Maybe a better wording would be:

> `-c'
>      Read and execute commands from the first non-option ARGUMENT after
>      processing the options, then exit. The first argument
>      after that is assigned to $0, which is used in error
>      messages for instance (in the absence of arguments, $0
>      is assigned the argv[0] that the interpreter received
>      (usually bash or sh)), and the following arguments are
>      assigned to the positional parameters.

The note about $0 being used for error messages would also be useful,
as we see people using a place-holder like _ or -- there, causing
more confusion. See for instance:
http://mywiki.wooledge.org/BashFAQ/012

-- Stephane
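The distinction is easy to see from the command line (the names below are arbitrary):

```shell
# With -c, the first argument after the command string becomes $0
# (used in error messages), and the rest become $1, $2, ...
bash -c 'echo "\$0=$0 \$1=$1 \$#=$#"' myscript alpha beta
# prints: $0=myscript $1=alpha $#=2
```

Note that `$#` counts only the positional parameters, so it reports 2 here, not 3; that is exactly the sense in which $0 is not positional.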
Re: minor: "read line" may read several lines if stdin is a tty
2014-08-27 08:10:01 -0400, Chet Ramey:
> On 8/25/14, 4:24 PM, Stephane Chazelas wrote:
>> If you run (here testing on Linux):
>>
>> bash -c 'read a; echo "<$a>"; tr b c'
>>
>> [...] That's because "read" reads up
>> to 128 bytes of data in one read(2) invocation instead of
>> reading one byte at a time like on other types of non-seekable
>> files.
>
> While this is true, it's not complete. Bash relies on the terminal
> driver returning newline-delimited lines when reading from the
> terminal. If you try to fool it, you can probably succeed.
[...]

I agree it's clearly an attempt at fooling it, and it's reasonable to
expect a read on a terminal in canonical mode to return a single
line.

However, one could imagine using bash's read to get data off a tty
device not in canonical mode (a serial device used as just a serial
device), or any other non-terminal character device for that matter
(a "strace bash -c 'read < /dev/urandom'" reveals there's an issue
there as well), and that would be more of a problem.

Again, not a big one (hence the [minor] tag) but still a POSIX
non-conformance, and bash seems to be the only shell with the problem
(though ksh93 has its own bugs as usual).

-- Stephane
Re: minor: "read line" may read several lines if stdin is a tty
2014-08-27 15:35:06 -0400, Chet Ramey: > On 8/27/14, 3:20 PM, Stephane Chazelas wrote: > > > However, one could imagine using bash's read to get data off a > > tty device not in canonical mode (a serial device used as just a > > serial device), or any other non-terminal character device for > > that matter (a "strace bash -c 'read < /dev/urandom'" reveals > > there's an issue there as well), and that would be more of a > > problem. > > > > Again, not a big one (hence the [minor] tag) but still a > > non-POSIX conformance and bash seems to be the only shell with > > the problem (though ksh93 has its own bugs as usual). > > Well, again you have to rely on the system telling you what's happening. > Bash tests whether a file descriptor is seekable using, logically > enough, lseek. If it doesn't return -1/ESPIPE, bash assumes the fd is > seekable. If it does, bash does unbuffered reads. [...] OK, sorry, I hadn't realised /dev/random was seekable (though the seek is a no-op) on Linux. > (If isatty() returns true, bash does assume that read(2) will return > newline-delimited records.) [...] That assumption is only valid (to some extent, because of lnext) if the terminal is in icanon mode though. Cheers, Stephane
Re: make 'tr' (or something like it) a bash builtin ?
On Sun, Feb 05, 2006 at 07:02:10PM -0500, Chris F.A. Johnson wrote: > On Sun, 5 Feb 2006, Felipe Kellermann wrote: > > >On Thu, 2 Feb 2006 11:13pm -0500, Mike Frysinger wrote: > > > >>upper case or lower case ... if 'tr' was a bash builtin, then that'd work > >>with some `echo | tr` magic, but it something like > >>${foo//[[:lower:]]/[[:upper:]]} would be nice > >>-mike > > > >typeset -u (ksh, zsh) supported in bash would be nice... > > > > orchid% typeset -u fuu > >So long as it uses declare, not typeset :) > > > orchid% fuu=bar > > orchid% echo $fuu > > BAR > >I'd prefer variable expansion, perhaps the way Mike suggested, or, >e.g.: > > $ foo=bar > $ echo ${foo^} ## Convert first character > Bar > $ echo ${foo^^} ## Convert all characters > BAR > $ echo ${foo^[a-m]} ## Convert first character that matches pattern > Bar > $ echo ${foo^^[a-m]} ## Convert all characters that match pattern > BAr [...] Also note zsh parameter expansion flags that can be easily combined: $ foo=bar $ echo ${(U)foo} BAR $ echo ${(Ul:20::.:)foo} .BAR $ echo ${(Uqq)foo} 'BAR' -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
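These proposals were in fact adopted: bash 4 implements the `^`/`^^` (and `,`/`,,`) expansion operators as well as `declare -u`/`-l`. A quick check, run in a child bash (>= 4 assumed) so it works from any sh:

```shell
# ${foo^} uppercases the first char, ${foo^^} all chars,
# ${foo^^[a-m]} only the chars matching the pattern.
out=$(bash -c 'foo=bar; echo "${foo^} ${foo^^} ${foo^^[a-m]}"')
# declare -u converts on assignment, like ksh/zsh typeset -u.
auto=$(bash -c 'declare -u v; v=bar; echo "$v"')
echo "$out / $auto"
```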
Re: bash shell parser bug
On Thu, Mar 16, 2006 at 08:28:28PM -0800, laura fairhead wrote: > > > Hello, Hi Laura, > I just found a bug that affects a number of shells (pressumably the > code there is from the same roots) in the parser. > > The following code; > > l='eval "$l"' > eval "$l" > > Which sets off an infinite recursion on 'eval', should result in an > infinite loop to be terminated by INT (doesnt' work) or at least > end gracefully with an error "bash: out of memory". Instead the > system has to kill the shell process because of SEGV fault. [...] It's the same with f() { f; }; f actually. ~$ bash -c 'f() { f; }; f' zsh: segmentation fault bash -c 'f() { f; }; f' (139)~$ zsh -c 'f() { f; }; f' zsh: segmentation fault zsh -c 'f() { f; }; f' (139)~$ ksh -c 'f() { f; }; f' zsh: segmentation fault ksh -c 'f() { f; }; f' (139)~$ ash -c 'f() { f; }; f' zsh: segmentation fault ash -c 'f() { f; }; f' (139)~$ gawk 'function f() { f() } BEGIN { f() }' zsh: segmentation fault awk 'function f() { f() } BEGIN { f() }' (139)~$ perl -le 'sub f { f(); }; f' zsh: killed perl -le 'sub f { f(); }; f' (137)~$ For perl, it was probably the OOM killer, my Xserver was killed as well. TCL: ~$ expect -c 'proc f {} { f }; f' 2>&1 | tail "f " (procedure "f" line 1) invoked from within "f " (procedure "f" line 1) invoked from within "f " (procedure "f" line 1) invoked from within "f" The recursion limit seems to be 1000, there. Same for: expect -c 'set f {eval $f}; eval $f' Same thing in python (limit 1000 as well) and ruby (limit seems higher). ~$ perl -le '$f = "eval \$f"; eval $f' is still running but is slowly grabbing all the memory. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
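Later bash versions (4.2 and up) added a guard for exactly this: setting FUNCNEST bounds function-call nesting, so runaway recursion fails with a reported error instead of exhausting the stack. A sketch, run in a child bash so the parent survives either way:

```shell
# FUNCNEST limits function nesting depth; exceeding it aborts the
# call chain with "maximum function nesting level exceeded".
FUNCNEST=50 bash -c 'f() { f; }; f' 2>/dev/null
status=$?
echo "exit status: $status"   # non-zero either way (error, or SEGV on old bash)
```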
Re: check_dev_tty - What does it do?
On Fri, Jun 23, 2006 at 02:54:04PM +1200, John Carter wrote: > > Here is the bash-3.1 code from general.c for check_dev_tty. > > It is invoked shortly after main... > > void > check_dev_tty () > { > int tty_fd; > char *tty; > > tty_fd = open ("/dev/tty", O_RDWR|O_NONBLOCK); > > if (tty_fd < 0) > { > tty = (char *)ttyname (fileno (stdin)); > if (tty == 0) > return; > tty_fd = open (tty, O_RDWR|O_NONBLOCK); > } > close (tty_fd); > } > > It seems to open and close /dev/tty and return nothing. > > This seems very strange to me. > > Why is it doing this? [...] A rough guess: It may be that, if bash is the session leader and no session is attached to the terminal open on stdin, the open would attach the terminal to the session. Bash would become the controlling process. I guess it might happen when you boot Linux with init=/bin/bash for instance. It may not be the reason for it, though. And I'm not sure it is right for bash to do so. -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: PATH and $0
On Tue, Jul 11, 2006 at 08:19:34PM -0400, Dave Rutherford wrote: > On 7/11/06, Cai Qian <[EMAIL PROTECTED]> wrote: > >I want its full pathname using 'dirname', but it will give me > >unexpected result on some Linux or Bash versions. > > Well, 'dirname' certainly won't do what you want, but I'm sorry, > I can't think of a way to get what you need. (It would be relatively > easy in 'c'.) Even /proc/self/* doesn't contain the script's full > pathname. Perhaps somebody else knows a better way. [...] $0 will always contain the file path, unless the script was started as: bash script.sh And there's no script.sh in the current directory (in which case sh/bash will have looked up script.sh in $PATH). So: #! /bin/sh - dir=$( cmd=$0 [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit dir=$(dirname -- "$cmd") cd -P -- "$dir" && pwd -P ) || exit # untested should give you the absolute path of the directory portion of the script path (unless that directory ends in newline characters). -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: PATH and $0
On Thu, Jul 13, 2006 at 11:53:00PM -0600, Bob Proulx wrote: > Stephane Chazelas wrote: > > $0 will always contain the file path, unless the script was > > started as: > > > > bash script.sh > > > > And there's no script.sh in the current directory (in which case > > sh/bash will have looked up script.sh in $PATH). > > Of course using command as you have done will work for bash. "command" is a POSIX command, so it will work with every POSIX and therefore every Unix conformant sh. Note that on some systems, the Unix conformant sh is not in /bin; you may need to adapt the she-bang line (note that the she-bang line is a non-standard Unix feature). > But I > always feel better about using portable shell as much as possible. > here is one way. > > pathfind() { > OLDIFS="$IFS" > IFS=: > for p in $PATH; do > if [ -x "$p/$*" ]; then > IFS="$OLDIFS" > echo "$p/$*" > fi > done > IFS="$OLDIFS" > } On the contrary, the code above has many problems: 1- If IFS was unset before, it becomes set to the empty string afterwards, which has a different meaning. 2- word splitting when IFS contains non-blank characters varies from one shell to the next, and in most implementations (including bash and AT&T ksh), it differs from the way the shell splits $PATH internally to look up a command ("/bin:" is "/bin" and "" as far as PATH look up is concerned, while bash word splitting splits "/bin:" into only "/bin"). 3- you forgot to disable filename generation, which means wildcards will be expanded in "for p in $PATH" 4- there's a "break" (or return 0) missing, and a "return 1" missing if there's no match. 5- echo is not a portable command, use printf instead as POSIX suggests. > > > #! /bin/sh - > > dir=$( > > cmd=$0 > > [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit > > dir=$(dirname -- "$cmd") > > cd -P -- "$dir" && pwd -P > > ) || exit > > # untested > > > > should give you the absolute path of the directory portion of > > the script path (unless that directory ends in newline > > characters). 
> > One thing to note about this script is that it canonicalizes path with > respect to symlinks. You do say that but without much > fanfare. That was intended. You can probably find as many caveats with one approach or the other. Note that if the binary file is a symlink itself, it doesn't canonicalise it. [...] > I think it is best to accept whatever path the user has provided > verbatim. I would not try to canonicalize the path in any way. If > the user has provided a path with /foo/bar/../../zoo/zam for example > then I would just go with it because that path may actually be > necessary just like that for some reason as yet unknown at this moment > to us writing the code to be used in the future. The reason I think > this way is that I have tried to do things like canonicalizing > previously myself and gotten burned by it. I have learned it is > better to avoid doing such things. Then: dir=$( cmd=$0 [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit dir=$(dirname -- "$cmd") case $dir in /*) ;; *) dir=$(pwd -P)/$dir || exit;; esac printf '%s\n' "$dir" ) || exit -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
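Going back to the pathfind critique earlier in this thread: a sketch of a PATH lookup that sidesteps the IFS, globbing, empty-component and early-return pitfalls entirely, using only prefix-stripping expansions (`pathfind` is just an illustrative helper name):

```shell
pathfind() {
    _cmd=$1
    _rest=$PATH:          # trailing ":" so the loop sees every component
    while [ -n "$_rest" ]; do
        _p=${_rest%%:*}
        _rest=${_rest#*:}
        # an empty PATH component means the current directory
        if [ -x "${_p:-.}/$_cmd" ]; then
            printf '%s\n' "${_p:-.}/$_cmd"
            return 0
        fi
    done
    return 1
}

# demo: put a throwaway executable on PATH and look it up
d=$(mktemp -d)
printf '#!/bin/sh\n' > "$d/mycmd"
chmod +x "$d/mycmd"
PATH=$d:$PATH
found=$(pathfind mycmd)
printf '%s\n' "$found"
```

No IFS fiddling and no word splitting means nothing to restore and nothing to glob; empty components are handled the way PATH lookup actually treats them.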
Re: PATH and $0
On Sat, Aug 12, 2006 at 01:51:12AM -0600, Bob Proulx wrote: > Stephane Chazelas wrote: > > [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit > > Bob Proulx wrote: > > > Of course using command as you have done will work for bash. > > > > "command" is a POSIX command. So will work with every POSIX and > > therefore every Unix conformant sh. > > That is not strictly true. While 'command' is POSIX 'command -v' > is actually an optional extension. A system only claiming POSIX > conformance need not implement 'command -v'. See the online standards > documents for more information. > > http://www.opengroup.org/onlinepubs/009695399/utilities/command.html You're right. command -v is not required by POSIX, but required by Unix (XSI). A lot of shells (like bash) are only POSIX conformant. > And in fact the FreeBSD /bin/sh (among others) implements 'command' > but not 'command -v'. So in practice it is not as portable as I would > like it to be. According to the man page, it does at least in 6.1-RELEASE. See: http://www.freebsd.org/cgi/man.cgi?query=sh&sektion=1&apropos=0&manpath=FreeBSD+6.1-RELEASE [...] > > 1- If IFS was unset before, it becomes set to the empty string > > after which has a different meaning. > > Your posted script did not unset IFS. Therefore that is not a concern > there. I would not prefer 'command -v' over this because someone > might unset IFS. There are lots of worse things that are possible. I was speaking of the general case: oIFS=$IFS ... IFS=$oIFS doesn't restore IFS in every case. > > 2- word splitting when IFS contains non-blank characters varies > > from one shell to the next, and in most implementations > > (including bash and AT&T ksh, it differs from the way the shell > > splits $PATH internally to look up a command ("/bin:" is "/bin" > > and "" as far as PATH look up is concerned > > Negative. In AT&T ksh (and others) "/bin:" is "/bin" and "." for PATH > lookup. No, it's "/bin" and "". 
the shell tries to exec "/bin/foo" then "foo", not "/bin/foo" and "./foo". But anyway, what I meant was "/bin:" means /bin and the current directory for PATH lookup, but not for bash word splitting. > > while bash word splitting splits "/bin:" into only "/bin"). > > Negative. In bash "/bin:" is "/bin" and "." for PATH lookup. I was speaking of "bash word splitting", not PATH lookup; that was my whole point: you can't use word splitting that way because it doesn't match the way the shell does the PATH lookup. > > > 3- you forgot to disable filename generation, which means > > wildcards will be expanded in "for p in $PATH" > > Negative. Wildcards will not be expanded there. Yes, they will: Try: foo='a:*:b' IFS=: for i in $foo; do echo "$i"; done regards, Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
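The globbing pitfall is easy to reproduce, and `set -f` shows the fix the original loop was missing (a minimal sketch):

```shell
foo='a:*:b'
IFS=:
set -f            # disable filename generation, which the original loop forgot
result=
for i in $foo; do
    result="$result<$i>"
done
set +f
unset IFS
echo "$result"    # the * survives unexpanded
```

Without `set -f`, the `*` field would be replaced by the names of whatever files happen to be in the current directory.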
Re: ...Limitation?
On Tue, Sep 26, 2006 at 11:45:42AM -0400, Paul Jarc wrote: > mwoehlke <[EMAIL PROTECTED]> wrote: > > I am trying to figure out how to run a command and pipe the output > > through tee, and then check the status of the original command. > > This uses a bash-specific feature: > cmd > >(tee file); status=$? This is a ksh feature that can also be found in zsh. But it's true it's not standard. I may be wrong but I don't think bash has any significant feature of its own. What is bash specific is the $PIPESTATUS array. zsh also has it but it's called $pipestatus there as zsh arrays are traditionally lowercase (so that they are not confused with scalar variables (all bash variables are both array and scalar at the same time as in ksh)). With cmd | tee file, cmd's exit status can be found in ${PIPESTATUS[0]}. > This should work on any sh: > exec 3>&1 && status=`exec 4>&1 && { cmd; echo $? >&4; } | tee file >&3` You may want to write it: exec 3>&1 && status=`exec 4>&1 && { cmd 4>&-; echo $? >&4; } | tee file >&3` because otherwise, if cmd spawns a process, and that process doesn't close its fd 4, you'd have to wait for it to finish (you may have the same problem if it doesn't close its fd 1 because of the pipe to tee, though). > Or, if you don't want to clobber any descriptors, in case they might > be in use for something else: [...] Then do it in a subshell (unless cmd makes use of those file descriptors) or do it this way instead of using exec (bash will take care of restoring the fds): { status=$( { { cmd 4>&-; echo "$?" >&4; } 3>&- | tee file >&3 3>&-; } 4>&1 ); } 3>&1 This should work in any Unix or POSIX conformant sh, not in the Bourne shell. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
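A minimal check of the $PIPESTATUS approach, run in a child bash since the array is bash-specific:

```shell
# ${PIPESTATUS[0]} holds the exit status of the left-hand side
# of the last pipeline, here 3, even though tee itself succeeded.
status=$(bash -c '(exit 3) | tee /dev/null >/dev/null; echo "${PIPESTATUS[0]}"')
echo "left-hand exit status: $status"
```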
Re: Fatal bug with redirection
On Wed, Oct 04, 2006 at 03:38:12PM -0400, Yang wrote: > On my FC5 box, when I try certain things in bash 3.1.7(1)-release > involving redirection, such as the following, bash crashes > (immediately exits): > > { echo a >&3 ; } 3> >( echo b ) > > I've only tried this from an interactive shell. Unfortunately I don't > have bashbug or bashversion, as I'm using the FC5 RPM version, but > here is some info: [...] All the shells I could try where echo is a builtin behave the same in that case (they die). Above, echo is writing to a pipe whose other end is closed (echo b has terminated already by the time "echo a" starts to write on its standard output), so echo will get a SIGPIPE signal. The problem is that, echo being a builtin, it's the shell that is doing the write to the dead pipe, and so it's the shell that is getting the SIGPIPE. SIGPIPE is a fatal signal. However, it's true that a builtin echo is not meant to behave differently from an external echo command. So bash (and other shells) should try and work around that by only causing the echo command to terminate with a 141 exit status. The problem is not only for echo. test \( ! 2> >(:) read % 2> >(:) -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: Curly braces expansion not always works as expected.
On Thu, Oct 05, 2006 at 12:23:41AM +0200, Reuti wrote: [...] > Curly braces expansion only works with at least one comma > (,). This also works, if the appears at least once > inside some apostrophes and has no function. > > Repeat-By: > [EMAIL PROTECTED]:~> bash --version > GNU bash, version 2.05b.0(1)-release (powerpc-apple-darwin8.0) > Copyright (C) 2002 Free Software Foundation, Inc. > [EMAIL PROTECTED]:~> echo {"x x"} # No {} in output expected. > {x x} > [EMAIL PROTECTED]:~> echo {"xx"} # No {} in output expected. > {xx} > [EMAIL PROTECTED]:~> echo {"xx","yy"} # Works fine. > xx yy > [EMAIL PROTECTED]:~> echo {"x,x"} # Add a comma to get it > working, i.e. remove the {}. > x,x > [EMAIL PROTECTED]:~> > > Also this seems to be strange: > > [EMAIL PROTECTED]:~> echo {"klklkl"} {1,2,3} # Only second {} > works correctly. > {klklkl} 1 2 3 > [EMAIL PROTECTED]:~> echo {"klklkl"}{1,2,3} # Completely messed up. > {klklkl}{1,2,3} Only that one is a bug, it should output {klklkl}1 {klklkl}2 {klklkl}3 The other ones work as documented. From info -f bash -n 'Brace Expansion' | A correctly-formed brace expansion must contain unquoted | opening and closing braces, and at least one unquoted comma | or a valid sequence expression. Any incorrectly formed brace | expansion is left unchanged. > [EMAIL PROTECTED]:~> echo {"kl,klkl"}{1,2,3} # Add a comma to > get it working > kl,klkl1 kl,klkl2 kl,klkl3 > [EMAIL PROTECTED]:~> [...] -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
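The three cases of the documented rule, checked in a child bash (a current bash is assumed, so the `{1..3}` sequence form is available):

```shell
a=$(bash -c 'echo {xx}')      # no unquoted comma, no sequence: left unchanged
b=$(bash -c 'echo {xx,yy}')   # unquoted comma: expands
c=$(bash -c 'echo {1..3}')    # valid sequence expression: also expands
echo "$a / $b / $c"
```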
Re: case statement breaks $( ) substitution
On Fri, Oct 06, 2006 at 11:55:27AM +0200, [EMAIL PROTECTED] wrote: [...] > Repeat-By: > > $ echo $( > > case a in > > 1) :;; > bash: syntax error near unexpected token `;;' > > The closing `)' is misinterpreted as being the end of the `$(' > substitution, instead of being recognized as element of the > case block which it is part of. [...] It's a known limitation also in zsh and some versions of ksh. Use $( case a in (1) ;; esac ) instead. -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
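The workaround is easy to verify; the leading `(` before each pattern is valid POSIX sh syntax and keeps the parentheses balanced inside `$(...)`:

```shell
# With balanced (pattern) parens, the command-substitution parser
# no longer mistakes the closing paren of the case branch for the
# end of the $( ... ) substitution.
x=$(
  case a in
    (a) echo matched;;
    (*) echo no;;
  esac
)
echo "$x"
```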
Re: How to detect bash?
On Tue, Oct 10, 2006 at 05:12:07PM -0500, mwoehlke wrote: > Dave Rutherford wrote: > >On 10/10/06, mwoehlke <[EMAIL PROTECTED]> wrote: > >>Anyone have any clever, VERY reliable tricks for detecting if the > >>current shell is bash? > > > >Well, I don't know if it's clever, but how about: > > Oh, my... Where do I *start*? > > >$ if [ "${SHELL//*/bash}" = "bash" ]; then echo y; fi > > $ echo $SHELL > /bin/sh > $ echo $BASH > /bin/bash > $ foo > bash: foo: command not found > > There is *ABSOLUTELY* no guarantee that $SHELL correctly points to bash, > or that $SHELL is even remotely correct for that matter. This is /worse/ > than relying on $BASH. > > But it does bring up an interesting possibility: > [ "`/dev/null 2>&1`" = "bash: /dev/null: Permission denied" ] [...] Your quest looks a bit pointless to me. What prevents the user from editing a copy of your script to remove the check anyway? $ zsh -c 'echo "`/dev/null 2>&1`"' bash bash: /dev/null: Permission denied $ zsh $ ARGV0=bash ash -c 'echo "`/dev/null 2>&1`"; echo $BASH' bash: /dev/null: Permission denied $ echo '/dev/null(){echo "bash: /dev/null: Permission denied"}' \ > >> ~/.zshenv $ zsh -c 'echo "`/dev/null 2>&1`"' bash: /dev/null: Permission denied And whatever check you do can be worked around in one way or another. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: How to detect bash?
On Tue, Oct 10, 2006 at 06:14:22PM -0500, mwoehlke wrote: [...] > >$ zsh -c 'echo "`/dev/null 2>&1`"' bash > >bash: /dev/null: Permission denied > > > >$ zsh > >$ ARGV0=bash ash -c 'echo "`/dev/null 2>&1`"; echo $BASH' > >bash: /dev/null: Permission denied > > Eh? I get: > > $ zsh -c 'echo "`/dev/null 2>&1`"' bash > zsh:1: permission denied: /dev/null Well, I do get what I said with zsh 4.3.2 > $ ARGV0=bash ash -c 'echo "`/dev/null 2>&1`"; echo $BASH' > /dev/null: permission denied (note that this command must be run from zsh, which uses ARGV0=... to set ash's argv[0]). You must have an older /ash/ than I have; mine is one of the OpenBSD-based sh ones (found on Debian). > > So neither of your counter-examples is working for me (although both > look like they *should*; go figure). Though I'd bet the third one with .zshenv worked. > But since you didn't counter > BASH_SUBSHELL (and since I'm too lazy to change it now) I guess I'll > stick with that. :-) BASH_SUBSHELL is a relatively recent addition to bash. Most systems still don't have a bash 3. $ bash -c 'echo "$BASH_VERSION, <$BASH_SUBSHELL>"' 2.05b.0(2)-release, <> [...] > True, but the main point of the exercise is to go with a check that's > unlikely to be worked around "by accident". If someone intentionally > circumvents the check (and you're right, editing the script would be > easy), well then they deserve whatever happens. But I *am* paranoid > enough to not trust that $BASH is never set - other than by bash - for > some reason. Or that it hasn't been *unset* (since that seems to kill it > forever), because we have 'clean environment' scripts that would do this > sort of thing. [...] Then I'd go with $BASH_VERSION, as someone else suggested, which is set in every bash version. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
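The $BASH_VERSION check suggested above is a one-liner (with the same caveat: a determined user can still unset it):

```shell
# $BASH_VERSION is set by every bash version; other shells leave it unset.
if [ -n "$BASH_VERSION" ]; then
    running=bash
else
    running=other
fi
echo "running under: $running"
```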
Re: ${//}: say how to non-destructively prefix/suffix
On Mon, Nov 06, 2006 at 02:54:25AM +0800, Dan Jacobson wrote: > Regarding the >${parameter/pattern/string} >${parameter//pattern/string} > discussion on the man page: > You don't mention what if pattern is null. > Seems that will never match. > Anyway, no way to > $ set aa bb cc > $ echo [EMAIL PROTECTED] > so that one gets all parameters to have a string appended, > e.g. aa.t bb.t cc.t > One must sacrifice at least one character in "pattern". No way to > destroy less than one character. [...] echo "${@/%/.t}" -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
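Checking the suggestion in a child bash (the `${@/pattern/string}` form is bash-specific):

```shell
# An empty pattern anchored at the end (%) matches the empty string
# at the end of each positional parameter, so the replacement is
# effectively appended to each one.
out=$(bash -c 'set -- aa bb cc; echo "${@/%/.t}"')
echo "$out"
```

The `#` anchor works symmetrically for prefixing: `"${@/#/x.}"` would yield `x.aa x.bb x.cc`.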
Re: strange expansion of <(cat p.main.{optional,extra})
On Tue, Apr 17, 2007 at 08:16:53PM -0400, Chet Ramey wrote: [...] > > The expansion of echo <(cat /etc/{passwd,motd}) is rather surprising: > > $ echo <(cat p.main.{optional,extra}) > > ++ cat p.main.optional > > cat: p.main.optional: No such file or directory > > ++ cat p.main.extra > > cat: p.main.extra: No such file or directory > > + echo /dev/fd/63 /dev/fd/62 > > /dev/fd/63 /dev/fd/62 > > > > I would have expected to be expanded to <(cat /etc/passwd /etc/motd) first. > > That's not how it works. Brace expansion is the first expansion performed. > The manual page says: > > "Brace expansion is performed before any other expansions, and any char- > acters special to other expansions are preserved in the result. It is > strictly textual. Bash does not apply any syntactic interpretation to > the context of the expansion or the text between the braces." [...] How does it get expanded to echo <(cat p.main.optional) <(cat p.main.extra) though? If it were textual, I would expect: echo <(cat p.main.optional) p.main.extra) And what about: echo $(cat p.main.{optional,extra}) This resolves to echo $(cat p.main.optional p.main.extra) I can't see any reason why $(...) and <(...) should be treated differently. bash-3.1$ echo <(echo {a,b}) ++ echo a ++ echo b + echo /dev/fd/63 /dev/fd/62 /dev/fd/63 /dev/fd/62 bash-3.1$ echo $(echo {a,b}) ++ echo a b + echo a b a b All the other shells treat them the same in that regard. Any expansion inside command or process substitution should be deferred to the subshell, I think. % echo <(echo {a,b}) +zsh:1> echo a b +zsh:1> echo /proc/self/fd/11 /proc/self/fd/11 % echo $(echo {a,b}) +zsh:1> echo a b +zsh:2> echo a b a b -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: Bash is broken
On Sun, May 13, 2007 at 12:31:48AM -0700, Overdorf, Sam wrote: > The following does not work on my computers: [...] > If [ "1" > "2" ] [...] > Bash thinks this is I/O redirection and creates a file by the name of > "2" in my directory. > > This should be a string compare. [...] Hi. No, this shouldn't. "[" is a command just like any other. And on a command line, the redirection operators can be put anywhere. echo 1 > 2 ] is the same as > 2 echo 1 ] or echo 1 ] > 2 If you want, you can use the [[ ... ]] syntax instead. [[ is not a command; [[ ... ]] as a whole forms a single command, and what's inside is treated as a test expression, not as arguments and redirections to a command. [[ 1 > 2 ]] If you want to use "[" (aka test), you need to make sure ">" is passed as an argument to "[". [ 1 '>' 2 ] or [ 1 \> 2 ] [[ ... ]] is not a standard shell feature, but the > test operator is not either. Portably (as in Unix shell scripts), you'd use expr: if expr "x$1" \> "x$2" > /dev/null or awk. if awk 'BEGIN{exit !(ARGV[1]"" > ARGV[2]"")}' "$1" "$2" Finally, it should be noted that string comparison behavior depends on the current locale. So if you don't want your script behavior to depend on who is running it, you may want to fix the locale: LC_ALL=C expr ... -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
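Both behaviours described above can be demonstrated side by side (a sketch; the child bash runs in a throwaway directory so the stray file "2" is harmless):

```shell
cd "$(mktemp -d)" || exit
# Unescaped: ">" is a redirection, so this runs [ 1 ] (true)
# and creates a file named "2" as a side effect.
bash -c '[ 1 > 2 ]'
[ -e 2 ] && made=yes
# Escaped: ">" reaches test as an argument; string "1" > "2" is false.
bash -c '[ 1 \> 2 ]' && cmp=yes || cmp=no
echo "$made $cmp"
```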
Re: Bash is broken
On Sun, May 13, 2007 at 08:19:13AM -0600, Eric Blake wrote: [...] > if [[ 1 > 2 ]] > > > This should be a string compare. > > Actually, when quoted properly, it should be a numeric comparison, not a > string comparison. [...] It *is* a string comparison: $ bash -c '[[ 02 > 1 ]]' || echo "02 is not greater than 1" 02 is not greater than 1 [[ a > b ]], [ a '>' b ] are string comparisons. [ a -gt b ], [[ a -gt b ]], (( a > b )) are numeric comparisons. expr's > can be either, depending on whether the operands are recognised as integers or not. In awk, it depends on the type of the arguments. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
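The "02" example makes the distinction concrete, run in a child bash since [[ ... ]] is not standard sh:

```shell
# String comparison: "02" sorts before "1" ('0' < '1'), so > is false.
str=$(bash -c '[[ 02 > 1 ]] && echo true || echo false')
# Numeric comparison: 2 > 1, so -gt is true.
num=$(bash -c '[[ 02 -gt 1 ]] && echo true || echo false')
echo "$str $num"
```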
Re: wrong logical evaluation of expressions involving true or false commands
On Mon, Jun 25, 2007 at 02:07:56PM -0600, Bob Proulx wrote: [...] > following. These are the same. > > [ foo -a bar ] > [ -n foo -a -n bar ] > > This is defined by standard by the number of arguments. > > http://www.opengroup.org/onlinepubs/009695399/utilities/test.html [...] And it should be noted (and it's noted as well in the page you're referring to) that while the above is true for strings such as "foo" and "bar", it is not true in the general case. [ -n "$foo" -a -n "$bar" ] is not the expression to test whether both "$foo" and "$bar" are non-empty, as it would fail for some specific values of $foo or $bar (try it when $foo contains "=" for instance). That's why it's useful to remember those rules: - don't call "[" with more than 3 arguments (beside the "[" and "]" ones). That is, use [ -f "$f" ] && ! [ -h "$f" ] instead of [ -f "$f" -a ! -h "$f" ] for instance. That at least would keep you out of trouble with POSIX conformant shells. For shells that are still not POSIX conformant (such as the Bourne shell, the BSD shs (ash) or some old kshs): - prefer [ -n "$string" ] over [ "$string" ] - [ foo = "$var" ] over [ "$var" = foo ] - [ "x$a" = "x$b" ] over [ "$a" = "$b" ] - [ whatever != whatever ] over [ ! whatever = whatever ] Note that the [[ ... ]] and (( ... )) constructs don't have such issues, but those are not standard constructs. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
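The safe two-invocation form is worth seeing with exactly the kind of values that break the -a version (how a given [ implementation mis-parses the 5-argument form is unspecified, so only the safe form is demonstrated):

```shell
# Values that look like test operators to a naive parser.
foo='=' bar='('
# Two invocations, each with at most one operator: no ambiguity.
if [ -n "$foo" ] && [ -n "$bar" ]; then
    both=yes
else
    both=no
fi
echo "$both"
```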
Re: logical XOR
On Fri, Jun 29, 2007 at 10:21:01AM +0200, Robert Millan [ackstorm] wrote: [...] > > > There's bitwise AND and bitwise OR, and logical AND and logical OR, but > > > for XOR there's only the bitwise version. Would be nice if the logical > > > XOR would also be present (^^). > > > > (!a) != (!b) > > > > should work acceptably for the situations such an operator would be used. > > I'm not sure what you mean. The following: > > (true && ! false) || (false && ! true) > echo $? > > works, whereas the following: > > (! true) != (! false) > echo $? > > shows syntax error. > > Is there any way to avoid redundancy in the shell script that needs this? The > code in question is already quite complex, so adding this redundancy makes it > less readable. [...] Do you mean that for instance, you'd like a third command to be executed if either but not both of two commands succeed? cmd1; a=$? cmd2; b=$? (( !a != !b )) && cmd3 should do. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
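The suggested idiom, checked in a child bash since (( ... )) is not standard sh:

```shell
# (( !a != !b )) is true iff exactly one of the two statuses is zero,
# i.e. exactly one of the two commands succeeded.
xor=$(bash -c 'true; a=$?; false; b=$?
               if (( !a != !b )); then echo yes; else echo no; fi')
echo "$xor"
```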
Re: logical XOR
On Fri, Jun 29, 2007 at 12:08:22PM +0200, Robert Millan [ackstorm] wrote: > On Fri, Jun 29, 2007 at 10:16:18AM +0100, Stephane Chazelas wrote: > > > > Do you mean that for instance, you'd like a third command to be > > executed if either but not both of two commands succeed? > > > > cmd1; a=$? > > cmd2; b=$? > > (( !a != !b )) && cmd3 > > > > should do. > > Thank you, that works. Although I think ^^ would be more intuitive. Let me > know if you want a patch for that. [...] You could do something like: xor() { previous_status=$? "$@" return "$((!$previous_status == !$?))" } cmd1; xor cmd2 && cmd3 -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: printf '\x' exit status is 0
On Tue, Jul 10, 2007 at 09:35:56PM -0400, Chet Ramey wrote: > Poor Yorick wrote: > > (on my debian system, bashbug reports for bash 2.05) > > > > printf status is 0 but /usr/bin/printf is 1. Is this the expected behavior? > > > > [EMAIL PROTECTED]:~$ bash3 > > [EMAIL PROTECTED]:~$ bash3 --version > > GNU bash, version 3.00.16(1)-release (i386-pc-linux-gnu) > > Copyright (C) 2004 Free Software Foundation, Inc. > > [EMAIL PROTECTED]:~$ printf '\x' > > bash3: printf: missing hex digit for \x > > [EMAIL PROTECTED]:~$ echo $? > > 0 > > Bash treats unrecognized backslash escapes as characters to be printed > and does not consider them an error. \x without any hex digits doesn't > have any special meaning, though, since it's probably not what's > intended, bash prints a warning. [...] Note that both AT&T ksh and zsh output a NUL byte and return success there. pdksh derivatives (pdksh, mksh, posh) output an error message and return 1. ash derivatives output "\x" and return success. The behavior of printf upon \x is /unspecified/ as per POSIX, so all the behaviors observed are valid. A portable application shouldn't use printf '\x'. The octal representation (\o, \oo or \ooo) is standard. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
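The portable octal form mentioned at the end behaves the same across printf implementations, unlike \x:

```shell
# Octal escapes in the printf format string are standard:
# \101 \102 \103 are A, B, C.
a=$(printf '\101\102\103')
echo "$a"   # prints ABC
```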