builtin exit status on write failure
Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: cygwin
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash.exe' -DCONF_HOSTTYPE='i686' -DCONF_OSTYPE='cygwin' -DCONF_MACHTYPE='i686-pc-cygwin' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DRECYCLES_PIDS -I. -I/home/eblake/bash-3.2.15-14/src/bash-3.2 -I/home/eblake/bash-3.2.15-14/src/bash-3.2/include -I/home/eblake/bash-3.2.15-14/src/bash-3.2/lib -O2 -pipe
uname output: CYGWIN_NT-5.1 LOUNGE 1.5.24(0.156/4/2) 2007-01-31 10:57 i686 Cygwin
Machine Type: i686-pc-cygwin
Bash Version: 3.2
Patch Level: 15
Release Status: release

Description:
	POSIX requires any application that writes to stdout to detect
	write failure, and exit with non-zero status as well as write a
	diagnostic to stderr.

Repeat-By:
	One example of a failure to follow this rule, using Linux's
	/dev/full to provoke a write failure:

	$ cd /bin
	$ cd
	$ cd - >/dev/full
	$ echo $?
	0
	$ pwd
	/bin

	Oops - there was a write failure; yet no error message was
	printed, the exit status remained zero, and the working
	directory changed.

Fix:
	All of the bash builtins that write to stdout need to check for
	ferror before completion, and change the exit status and print
	a message accordingly.

___
Bug-bash mailing list
Bug-bash@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-bash
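Until the builtins are fixed, a script can approximate the required behavior itself by wrapping its output statements and checking their status. A minimal sketch (the `write_checked` helper name is my invention, not anything bash provides):

```shell
# Hypothetical helper (not part of bash): print the arguments, but turn
# a write failure into a diagnostic on stderr and a non-zero status.
write_checked() {
  if ! printf '%s\n' "$*"; then
    echo "write_checked: write error" >&2
    return 1
  fi
}

# On Linux, every write to /dev/full fails with ENOSPC, so
#   write_checked hello > /dev/full
# would print the diagnostic and return 1, while the normal case:
write_checked hello
```

bash's printf builtin does return non-zero on a write error, which is what makes the wrapper's check meaningful.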
Re: [Patch] .gitignore TAGS and tags
On 3/15/21 3:42 PM, Chet Ramey wrote:
> On 3/15/21 3:57 PM, Mike Jonkmans wrote:
>> On Mon, Mar 15, 2021 at 11:23:46AM -0400, Chet Ramey wrote:
>>> On 3/15/21 3:29 AM, Mike Jonkmans wrote:
>>>> I assume that the TAGS and tags files will not go into the repo.
>>>
>>> Why not? This is only the devel branch; they don't go into releases.
>>
>> Adding tags/TAGS to the repo would increase its size for almost no use.
>> Creating the tags file takes less than a second.
>
> The size is inconsequential.
>
>> Drawback of not having these in the repo and not in .gitignore
>> is that a 'git status' complains about untracked files.
>
> OK, this is a good reason.

But even if the upstream repo doesn't want to ignore a file in the
(checked-in) .gitignore, you can always edit your (local-only)
.git/info/exclude to exclude your extra files locally.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
zsh style associative array assignment bug
Hey,

When doing an assignment with an uneven number of elements, bash
currently silently treats the last element as a key and assigns it an
empty string:

$ typeset -A ary=(this feature came from zsh); typeset -p ary
declare -A ary=([came]="from" [this]="feature" [zsh]="" )

In zsh this is an error:

% typeset -A ary=(this feature came from zsh); typeset -p ary
zsh: bad set of key/value pairs for associative array

Could bash be adjusted to align with zsh in this case?
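Until/unless bash changes, a script can enforce the zsh-style check itself. A sketch under my own interface (the `assoc_set` helper name is hypothetical; `local -n` needs bash 4.3+):

```shell
# Hypothetical helper: assign key/value pairs into an associative
# array, rejecting an odd number of elements the way zsh does.
assoc_set() {
  local -n _dest=$1   # nameref to the caller's associative array
  shift
  if (( $# % 2 )); then
    echo "assoc_set: odd number of key/value elements" >&2
    return 1
  fi
  while (( $# )); do
    _dest[$1]=$2
    shift 2
  done
}

declare -A ary=()
assoc_set ary this feature came from       # ok: [this]=feature [came]=from
assoc_set ary zsh 2>/dev/null ||
  echo "rejected: odd element count" >&2   # error, status 1
```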
Re: zsh style associative array assignment bug
On 3/28/21 12:25 AM, Oğuz wrote:
> Why? I think it's better this way.
> --
> Oğuz

1) For consistency's sake with the shell the idea was mostly borrowed
from.
2) Prior to this extension, bash required specifying both the key and
value for AA assignments, so it seems weird to silently ignore that a
value wasn't given now.
2.5) I subjectively think passing an odd number of elements to declare
is more often than not a mistake that the user would be interested in
knowing about.

With the way it is now, you could save a few characters to do a seen
array:

$ while read -r key; ... seen+=("$key") ...

but not really much else.
Re: zsh style associative array assignment bug
On 3/28/21 7:02 AM, Oğuz wrote:
> As it should be. `[bar]' doesn't qualify as an assignment without an
> equals sign, the shell thinks you're mixing two forms of associative
> array assignment there. In the new form, that a key is listed inside
> a compound assignment alone implies that it was meant to be assigned
> a value. In my mind, `a=(foo 123 bar)' translates to
> `a=([foo]=123 [bar]=)'. It makes sense.

That is the point that I am making: in `typeset -A ary=([key]=)' an
explicit empty string is the value, but in the case of
`typeset -A ary=([key])' it was historically an error. So why should a
key without a value now be acceptable?
Re: zsh style associative array assignment bug
On 3/29/21 5:18 PM, Chet Ramey wrote:
> If you look at a=( k1 v1 k2 v2 k3 v3 ) as more or less syntactic
> sugar for a=( [k1]=v1 [k2]=v2 [k3]=v3 ) it's reasonable that
> a=( k1 v1 k2 ) is equivalent to a=( [k1]=v1 [k2]= ). And that's what
> bash does.

It's just that when populating that array dynamically with another
array, if that second array didn't contain `v1' hypothetically, the
array gets shifted to a=( [k1]=k2 [v2]=k3 [v3]= ), which I would
imagine to be unexpected for the author of the code; I would rather it
error out instead of chugging along.
Re: zsh style associative array assignment bug
On 3/30/21 10:54 AM, Chet Ramey wrote:
> On 3/29/21 6:40 PM, Eric Cook wrote:
>> Its just when populating that array dynamically with another array
>> if that second array didn't contain `v1' hypothetically, the array
>> gets shifted to
>
> OK, how would you do that? What construct would you use in this
> scenario?

Sample input:

$ exiftool -j *.flac | jq -r '.[]| {Artist, Track, Genre, Title}|to_entries[]| .key + "|" + .value'
Artist|AK420
Track|
Genre|lofi
Title|A2 - Northern Lights

--

typeset -A tags=(); set --
while IFS='|' read -ra ary; do
  set -- "$@" "${ary[@]}"
done < <(
  exiftool -j *.flac | jq -r '.[]| {Artist, Track, Genre, Title}|to_entries[]| .key + "|" + .value'
)
eval 'tags=('"${*@Q}"\)
typeset -p tags
declare -A tags=([lofi]="Title" [Track]="Genre" [Artist]="AK420" ["A2 - Northern Lights"]="" )

>> a=( [k1]=k2 [v2]=k3 [v3]= )
>> which i would imagine to be unexpected for the author of the code
>> and would rather it error out instead of chugging along.
>
> Wouldn't this be a programming error? If this were a concern, since
> any array can have elements with empty values, I would recommend a
> different strategy to copy it.

Yeah, it is a programming error that could've used better validation.
I just find it weird that the assumption of an assignment with an odd
number of elements with this new syntax is that the odd element is
always a key missing a value that is filled in, when any of the keys
or values could've been missing during the assignment.
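For comparison, one stricter way to do the copy above is to skip the intermediate positional-parameter list and assign each "key|value" line directly, complaining about malformed lines instead of letting a missing field shift every later pair. A sketch under the same assumed input format (fed here from a here-document rather than exiftool/jq):

```shell
# Read "key|value" lines straight into the associative array; a line
# with no key is reported rather than silently re-paired.
declare -A tags=()
while IFS='|' read -r key value; do
  if [[ -z $key ]]; then
    echo "skipping malformed line: |$value" >&2
    continue
  fi
  tags[$key]=$value   # a missing value becomes an explicit empty string
done <<'EOF'
Artist|AK420
Track|
Genre|lofi
Title|A2 - Northern Lights
EOF

typeset -p tags
```

Because each line carries its own key, a blank value cannot shift the remaining pairs the way a flat word list can.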
Re: zsh style associative array assignment bug
On 3/30/21 3:44 PM, Chet Ramey wrote:
> Is this a serious piece of code, or just one to demonstrate a
> programming error?

The latter.

> There is only one field, terminated by `|', which becomes one array
> element. This is where you `lose' the null elements, not when you
> attempt to copy. Nothing you do after it matters.

I wasn't trying to imply that there ever was an element that was
`lost' (or code golf for the bystanders trying to optimize), just that
the author of a script can make an incorrect assumption about its
input and eventually pass that along to an AA assignment that bash
happily accepts, making an opinionated assumption of its own, guessing
at the author's intent. bash's assumption can be equally as wrong
compared to erroring out and telling the user that something is wrong
here.

Your point that the bash method of key-value pair assignment doesn't
protect you from programming errors is valid. Thank you.
Re: I've found a vulnerability in bash
On Fri, Nov 19, 2021 at 03:56:21PM, Kerin Millar wrote:
> On Fri, 19 Nov 2021 10:05:39 -0500
> Marshall Whittaker wrote:
>
> > Fair. I'm not saying anyone has to change it, but I will call out
> > what I think is a design flaw. But this is going to turn into some
> > philosophical discussion as to whether it should have been done
> > this way from the start. That I don't know, and hold no
> > responsibility for, as I'm not a bash dev, I'm an exploit dev.
> > Maybe an asshole too.
>
> You appear to be missing the implication; it has nothing in
> particular to do with bash. Consider the following Perl program. At
> no point is a shell involved.
>
> @args = glob('*');
> system('rm', '-f', @args); # bad

I had to double-check you via 'strace -f -e execve ...', but you are
right, for this particular example. But according to 'perldoc -f
system', there ARE instances where perl's system() involves a shell:

    Note that argument processing varies depending on the number of
    arguments. If there is more than one argument in LIST, or if LIST
    is an array with more than one value, starts the program given by
    the first element of the list with arguments given by the rest of
    the list. If there is only one scalar argument, the argument is
    checked for shell metacharacters, and if there are any, the entire
    argument is passed to the system's command shell for parsing (this
    is "/bin/sh -c" on Unix, although /bin/sh is not always bash).
But that brings up a bug in perl(1):

$ strace -f -e execve perl -e 'system("echo \$HOME")'
execve("/usr/bin/perl", ["perl", "-e", "system(\"echo \\$HOME\")"], 0x7ffc3e642e58 /* 72 vars */) = 0
strace: Process 1248831 attached
[pid 1248831] execve("/bin/sh", ["sh", "-c", "echo $HOME"], 0x55d3099d69d0 /* 72 vars */) = 0
/home/eblake
[pid 1248831] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1248831, si_uid=14986, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++

According to POSIX, perl should REALLY be passing a "--" argument
between "-c" and the scalar string given by the user; see
https://www.austingroupbugs.net/view.php?id=1440

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
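The fix that Austin Group bug asks for is easy to demonstrate from the shell itself; a sketch (POSIX requires utilities, sh included, to accept "--" as an end-of-options marker):

```shell
# Without "--", a command string beginning with "-" can be misparsed
# as an option to sh:
#   sh -c '-foo'        # many shells report an illegal-option error
# With "--", the string is unambiguously the operand to run:
sh -c -- 'echo "$HOME"'
```

A caller that builds the command string from untrusted input gets a small but real robustness win from that one extra argument.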
Integer overflow of i in string_extract_verbatim
From: Eric Li
To: bug-bash@gnu.org
Subject: Integer overflow of i in string_extract_verbatim

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -Og
uname output: Linux fedora 6.2.12-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 20 23:38:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Bash Version: 5.2
Patch Level: 15
Release Status: release

Description:
	Bash runs into a segmentation fault when spawning a process with
	an argument larger than 2GB. Can debug using GDB and observe
	that subst.c:1204 (string_extract_verbatim,
	"while (c = string[i])") crashes because i = -2147483648;
	string[i] points to invalid memory.

Repeat-By:
	1. Put the following shell script into a.sh:

		A=''
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		A="$A$A$A$A"
		set -o pipefail
		echo $A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A$A | wc
		echo $?
		echo done

	2. Run "./bash a.sh"
	3. See "a.sh: line 15: ... Segmentation fault (core dumped)"
	4. Use the following command to debug with GDB:

		gdb ./bash --ex 'set follow-fork-mode child' --ex 'r a.sh'

	5. See GDB output similar to the following:

		Thread 2.1 "bash" received signal SIGSEGV, Segmentation fault.
		... in string_extract_verbatim (...) at subst.c:1204
		1204        while (c = string[i])

	6. Using GDB, can see that i = -2147483648.

Fix:
	In string_extract_verbatim, change "int i" to "size_t i". Also
	need to change other places, including:
	* Argument sindex of string_extract_verbatim
	* Variable sindex of get_word_from_string
	* Argument sindex of string_extract_single_quoted
	* ...
RFC: changing printf(1) behavior on %b
In today's Austin Group call, we discussed the fact that printf(1) has
mandated behavior for %b (escape sequence processing similar to XSI
echo) that will eventually conflict with C2x's desire to introduce %b
to printf(3) (to produce 0b000... binary literals).

For POSIX Issue 8, we plan to mark the current semantics of %b in
printf(1) as obsolescent (it would continue to work, because Issue 8
targets C17, where there is no conflict with C2x), but with a Future
Directions note that for Issue 9, we could remove %b entirely, or
(more likely) make %b output binary literals just like C. But that
raises the question of whether the escape-sequence processing
semantics of %b should still remain available under the standard,
under some other spelling, since relying on XSI echo is still not
portable.

One of the observations made in the meeting was that currently, both
the POSIX spec for printf(1) as seen at [1], and the POSIX and C
standard (including the upcoming C2x standard) for printf(3) as seen
at [3], state that both the ' and # flag modifiers are currently
undefined when applied to %s.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html

    "The format operand shall be used as the format string described
    in XBD File Format Notation [2] with the following exceptions:
    ..."

[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05

    "The flag characters and their meanings are: ...
    #   The value shall be converted to an alternative form. For c, d,
        i, u, and s conversion specifiers, the behavior is undefined.
    [and no mention of ']"

[3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html

    "The flag characters and their meanings are:
    '   [CX] (The <apostrophe>.) The integer portion of the result of
        a decimal conversion (%i, %d, %u, %f, %F, %g, or %G) shall be
        formatted with thousands' grouping characters. For other
        conversions the behavior is undefined. The non-monetary
        grouping character is used. [Option End]
    ...
    #   Specifies that the value is to be converted to an alternative
        form. For o conversion, it shall increase the precision, if
        and only if necessary, to force the first digit of the result
        to be a zero (if the value and precision are both 0, a single
        0 is printed). For x or X conversion specifiers, a non-zero
        result shall have 0x (or 0X) prefixed to it. For a, A, e, E,
        f, F, g, and G conversion specifiers, the result shall always
        contain a radix character, even if no digits follow the radix
        character. Without this flag, a radix character appears in the
        result of these conversions only if a digit follows it. For g
        and G conversion specifiers, trailing zeros shall not be
        removed from the result as they normally are. For other
        conversion specifiers, the behavior is undefined."

Thus, it appears that both %#s and %'s are available for use for
future standardization. Typing-wise, %#s as a synonym for %b is
probably going to be easier (less shell escaping needed).

Is there any interest in a patch to coreutils or bash that would add
such a synonym, to make it easier to leave that functionality in place
for POSIX Issue 9 even when %b is repurposed to align with C2x?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
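For readers less familiar with the distinction under discussion: %s prints its argument literally, while %b additionally interprets backslash escape sequences found inside the argument. A quick illustration:

```shell
# %s leaves the backslash sequence in the argument untouched:
printf '%s\n' 'one\ntwo'   # prints: one\ntwo

# %b expands escapes in the *argument* (here \n becomes a newline):
printf '%b\n' 'one\ntwo'   # prints "one" and "two" on separate lines
```

Escapes in the format string itself (like the \n above) are always processed; %b extends that processing to the corresponding argument, which is exactly the behavior the C2x %b binary conversion would collide with.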
Re: bug#65659: RFC: changing printf(1) behavior on %b
On Thu, Aug 31, 2023 at 03:10:58PM -0400, Chet Ramey wrote:
> On 8/31/23 11:35 AM, Eric Blake wrote:
> > In today's Austin Group call, we discussed the fact that printf(1)
> > has mandated behavior for %b (escape sequence processing similar to
> > XSI echo) that will eventually conflict with C2x's desire to
> > introduce %b to printf(3) (to produce 0b000... binary literals).
> >
> > For POSIX Issue 8, we plan to mark the current semantics of %b in
> > printf(1) as obsolescent (it would continue to work, because Issue
> > 8 targets C17 where there is no conflict with C2x), but with a
> > Future Directions note that for Issue 9, we could remove %b
> > entirely, or (more likely) make %b output binary literals just
> > like C.
>
> I doubt I'd ever remove %b, even in posix mode -- it's already been
> there for 25 years.

But the longer that printf(3) supports %b to output binary values, the
more surprised new shell coders will be that printf(1) %b does not
behave the same. What's more, other languages have already started
using %b for binary output (Python, for example), so it is definitely
gaining in mindshare.

That said, I also agree with your desire to keep the functionality in
place. The current POSIX says that %b was added so that on a non-XSI
system, you could do:

my_echo() {
    printf %b\\n "$*"
}

and then call my_echo everywhere that a script used to depend on XSI
echo (perhaps by 'alias echo=my_echo' with aliases enabled), for a
much quicker portability hack than a tedious search-and-replace of
every echo call that requires manual inspection of its arguments for
translation of any XSI escape sequences into printf format
specifications. In particular, code like [var='...\c'; echo "$var"]
cannot be changed to use printf by a mere s/echo/printf %s\\n/. Thus,
when printf was invented and standardized for the shell, the solution
at the time was to create [printf %b\\n "$var"] as a drop-in
replacement for XSI [echo "$var"], even for platforms without XSI
echo.
Nowadays, I personally have not seen very many scripts like this in
the wild (for example, autoconf scripts prefer to directly use printf,
rather than trying to shoe-horn behavior into echo). But assuming such
legacy scripts still exist, it is still much easier to rewrite just
the my_echo wrapper to now use %#s\\n instead of %b\\n than it would
be to find every callsite of my_echo.

Bash already has shopt -s xpg_echo; I could easily see this being a
case where you toggle between the old or new behavior of %b (while
keeping %#s always at the old behavior) by either this or some other
shopt in bash, so that newer script writers that want binary output
for %b can do so with one setting, while scripts that must continue to
run under old semantics can likewise do so.

> > But that raises the question of whether the escape-sequence
> > processing semantics of %b should still remain available under the
> > standard, under some other spelling, since relying on XSI echo is
> > still not portable.
> >
> > One of the observations made in the meeting was that currently,
> > both the POSIX spec for printf(1) as seen at [1], and the POSIX
> > and C standard (including the upcoming C2x standard) for printf(3)
> > as seen at [3] state that both the ' and # flag modifiers are
> > currently undefined when applied to %s.
>
> Neither one is a very good choice, but `#' is the better one. It at
> least has a passing resemblance to the desired functionality.

Indeed, that's what the Austin Group settled on today after I first
wrote my initial email, and what I wrote up in a patch to GNU
Coreutils (https://debbugs.gnu.org/65659).

> Why not standardize another character, like %B? I suppose I'll have
> to look at the etherpad for the discussion. I think that came up on
> the mailing list, but I can't remember the details.

Yes, https://austingroupbugs.net/view.php?id=1771 has a good
discussion of the various ideas.
%B is out for the same reason as %b: although the current C2x draft
wording says that uppercase %<letter> conversions are reserved for
implementation use, other than [AEFGX] which already have a history of
use by C (as it was, when C99 added %A, that caused problems for some
folks), it goes on to _highly_ encourage any implementation that adds
%b for "0b0" binary output to also add %B for "0B0" binary output (to
match the x/X dichotomy). Burning %B to retain the old behavior while
repurposing %b to output lower-case binary values is thus a
non-starter, while burning %#s (which C says is undefined) felt nicer.

The Austin Group also felt that standardizing bash's behavior of %q/%Q
for outputting quoted text, while too late for Issue 8, has a good
chance of success, even though C says %q is reserved for
standardization by C. Our reasoning there is that lots of libc over
the years have used %qi as a synonym for %lli, and C would be foolish
to burn %q for anything that does not match those semantics at the C
language level; which means it will likely never be claimed by C and
thus free for use by shell in the way that bash has already done.
Re: [PATCH] printf: add %#s alias to %b
On Thu, Aug 31, 2023 at 04:01:17PM -0500, Rob Landley wrote:
> On 8/31/23 13:31, Eric Blake wrote:
> > POSIX Issue 8 will be obsoleting %b (escape sequence interpolation)
> > so that future Issue 9 can change to having %b (binary literal
> > output) that aligns with C2x.
>
> I.E. you sent an RFC to that effect to the posix list earlier today,
> and so far the only reply on the posix list was the bash maintainer,
> who said "I doubt I'd ever remove %b, even in posix mode -- it's
> already been there for 25 years."

The RFC to the POSIX list was started earlier than today
(https://austingroupbugs.net/view.php?id=1771 was filed on Aug 7, not
by me; and by Aug 8 we had already identified the future conflict with
C2x %b). But you are right that today was the first time I widened the
audience by mailing coreutils and bash (rather than just the few
developers that follow the POSIX mailing list). There are also plans
to ask the same question of other shell developers (dash, BSD,
busybox, ...); but I figured I'd start with the people and code I know
best.

It's not hard to see why POSIX is choosing to have Issue 8 obsolete
(not remove) %b's old semantics; in the short term, nothing about %b
changes, so your dusty-deck shell scripts will continue to work as
they have before, but you now have enough time to update your scripts.
The question is whether Issue 9 (several years down the road) will be
able to repurpose %b to mean binary literal output (only possible if
all shell authors agree that C2x compatibility is worth it), or must
instead just undocument %b altogether (shells can always provide
extensions that POSIX doesn't document - and the obvious extensions in
that case would be a shell's choice of %b with the old semantics or %b
to do binary literals).
But if POSIX _is_ able to repurpose %b (because enough shell authors
agree that binary output is more useful these days than XSI echo
compatibility), the follow-on question is whether there should be a
portable way to access the old functionality. Since %#s is currently
unspecified, we are trying to gauge feedback on how many
implementations are willing to add that alias now, which in turn will
affect whether Issue 9 can mandate that behavior (because everyone
liked it) or must continue to leave it undefined.

But nothing is stopping coreutils from adding %#s as an extension now,
regardless of what input other shell authors provide to the ongoing
POSIX discussion.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: bug#65659: RFC: changing printf(1) behavior on %b
On Fri, Sep 01, 2023 at 08:59:19AM +0100, Stephane Chazelas wrote:
> 2023-08-31 15:02:22 -0500, Eric Blake via austin-group-l at The Open
> Group:
> [...]
> > The current POSIX says that %b was added so that on a non-XSI
> > system, you could do:
> >
> > my_echo() {
> >     printf %b\\n "$*"
> > }
>
> That is dependant on the current value of $IFS. You'd need:
>
> xsi_echo() (
>   IFS=' '
>   printf '%b\n' "$*"
> )

Let's read the standard in context (Issue 8 draft 3 page 2793 line
92595):

"The printf utility can be used portably to emulate any of the
traditional behaviors of the echo utility as follows (assuming that
IFS has its standard value or is unset):

• The historic System V echo and the requirements on XSI
implementations in this volume of POSIX.1-202x are equivalent to:

    printf "%b\n" "$*"
"

So yes, the standard does mention the requirement to have a sane IFS,
and I failed to include that in my one-off implementation of
my_echo(). Thank you for pointing out a more robust version.

> Or the other alternatives listed at
> https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo/65819#65819
>
> [...]
> > Bash already has shopt -s xpg_echo
>
> Note that in bash, you need both
>
>   shopt -s xpg_echo
>   set -o posix
>
> to get an XSI echo. Without the latter, options are still
> recognised. You can get an XSI echo without those options with:
>
> xsi_echo() {
>   local IFS=' ' -
>   set +o posix
>   echo -e "$*\n\c"
> }
>
> The addition of those \n\c (no-op) avoids arguments being treated as
> options if they start with -.

As an extension, Bash (and Coreutils) happen to honor \c always, and
not just for %b. But POSIX only requires \c handling for %b. And while
Issue 8 has taken steps to allow implementations to support 'echo -e',
it is still not standardized behavior; so your xsi_echo() is
bash-specific (which is not necessarily a problem, as long as you are
aware it is not portable).

> [...]
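The \c behavior discussed above is easy to see in isolation. Per POSIX, a \c inside a %b argument discards everything after it, including the rest of the format string and any remaining arguments:

```shell
# \c stops all further output; even the trailing \n of the format
# string is suppressed:
printf '%b\n' 'one\ctwo'   # prints "one" with no newline

# Without \c, the argument and the format's newline print normally:
printf '%b\n' 'one two'    # prints "one two" plus a newline
```

This is why [var='...\c'; echo "$var"] scripts cannot be mechanically converted to %s: only %b (or an XSI echo) gives \c its early-termination meaning.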
> > The Austin Group also felt that standardizing bash's behavior of
> > %q/%Q for outputting quoted text, while too late for Issue 8, has
> > a good chance of success, even though C says %q is reserved for
> > standardization by C. Our reasoning there is that lots of libc
> > over the years have used %qi as a synonym for %lli, and C would be
> > foolish to burn %q for anything that does not match those
> > semantics at the C language level; which means it will likely
> > never be claimed by C and thus free for use by shell in the way
> > that bash has already done.
> [...]
>
> Note that %q is from ksh93, not bash, and is not portable across
> implementations; with most, including bash's, it gives an output
> that is not safe for reinput in arbitrary locales (as it uses $'...'
> in some cases). Not sure it's a good idea to add it to the standard,
> or at least it should come with fat warnings about the risk in using
> it.

%q is NOT being added to Issue 8, but $'...' is. Bug 1771 asked if %q
could be added to Issue 8, but it came in past the deadline for
feature requests, so the best we could do is add a FUTURE DIRECTIONS
blurb that mentions the idea. But since FUTURE DIRECTIONS is
non-normative, we can always change our mind in Issue 9 and delete
that text if it turns out we can't get consensus to standardize some
form of %q/%Q after all.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: bug#65659: RFC: changing printf(1) behavior on %b
On Fri, Sep 01, 2023 at 07:19:13AM +0200, Phi Debian wrote:
> Well after reading yet another thread regarding libc printf() I got
> to admit that even %B is crossed out (yet already chosen by ksh93).
>
> The other thread also speaks about libc printf() documenting %# as
> undefined for things other than a, A, e, E, f, F, g, and G, yet the
> same thread also talks about %a/%A coming late (citing C99) in the
> dance, meaning what is undefined today becomes defined tomorrow, so
> %#b is no safer.

Caution: the proposal here is for %#s (an alternative string), not %#b
(which C2x wants to be similar to %#x, in that it outputs a '0b'
prefix for all values except bare '0'). Yes, there is a slight risk
that C may decide to define %#s. But as the Austin Group includes a
member of WG14, we are able to advise the C committee that such an
addition is not wise.

> My guess is that printf(1) is now doomed to follow its route, keep
> its old format exception, and then maybe implement something like
> c_printf, like printf but with the format string following libc
> semantics, or maybe a -C option to printf(1)...

Adding an option to printf is also a possibility, if there is
wide-spread implementation practice to standardize. If someone wants
to implement 'printf -C' right now, that could help feed such a future
standardization. But it is somewhat orthogonal to the request in this
thread, which is how to allow users to still access the old %b
behavior even if %b gets repurposed in the future; if we can get
multiple implementations to add a %#s alias now, it makes the future
decisions easier (even if it is too late for Issue 8 to add any new
features, or for that matter, to make any normative changes other than
marking %b obsolescent as a way to be able to revisit it in the future
for Issue 9).
> Well in all cases %b cannot change semantics in the bash script,
> since it has been there for so long; even if it departs from python,
> perl, libc, it is unfortunate but that's the way it is. Nobody wants
> a semantic change, and on the next routers update, see the whole
> internet falling apart :-)

How many scripts in the wild actually use %b, though? And if there are
such scripts, anything we can do to make it easy to do a drop-in
replacement that still preserves the old behavior (such as changing %b
to %#s) is going to be easier to audit than the only other
currently-portable alternative of actually analyzing the string to see
if it uses any octal or \c escapes that have to be rewritten to
portably function as a printf format argument.

POSIX is not mandating %#s at this time, so much as suggesting that if
implementations are willing to implement it now, it will make Issue 9
easier to reason about.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH] printf: add %#s alias to %b
On Wed, Sep 06, 2023 at 09:03:29AM -0400, Chet Ramey wrote:
> On 9/5/23 10:13 PM, William Bader wrote:
> > Has bash ever had a change before that would break valid scripts?
>
> Yes, but I try to keep those to a minimum.
>
> > Could the printf format change be settable by a variable or by an
> > option like the -e/-E in echo?
>
> It could, sure. Anything is possible.
>
> > Is it necessary for bash printf to match C printf?
>
> No. That's the heart of it.
>
> > I suppose that it is already decided.
>
> The austin group has decided what they'd like to do, and what they'd
> like implementors to do. The question is whether or not people go
> along with it.

The Austin Group decided merely: if we do nothing now for Issue 8,
then Issue 9 WILL have a conflict between printf(1) and printf(3). If
we reach out to all developers now, we can start the discussion, and
then by the time Issue 9 comes around (several years from now), we may
have enough consensus to do any number of things:

- Do nothing; printf(1) and printf(3) have incompatible %b
- Declare that %b has implementation-defined behavior (shell authors
  have the choice on whether %b has old or new behavior)
- Declare that %b is no longer standardized (but implementations can
  still provide it as an extension, using their choice of behavior)
- Standardize %#s to do the same thing as %b used to do
- Standardize 'printf -c %b 1' which parses its format string
  according to C23 rules (output "0b1"), while 'printf %b 1' remains
  the old way (output "1")
- Your suggestion here (if enough shell writers agree on what to do,
  then Issue 9 can do that)

But for that to work, Issue 8 has to do something - it marks %b
obsolescent, merely so that we have the option (not the mandate) to
change its behavior in the future.
It may turn out that there is enough resistance that the answer is no
change to behavior, and we could even remove the obsolescent tag in
Issue 9 (that is, make it formal that printf(1) and printf(3)
intentionally diverge on %b). But marking something obsolescent in
Issue 8 doesn't require any current shell to change, while still
encouraging the discussion in case they do want to change.

Adding %#s as a synonym for %b seems easy enough to do, regardless of
what Issue 9 decides to do to %b, so the Austin Group mentioned that
as a non-normative idea in the wording for Issue 8. But they are not
requiring shell authors to implement it (even though GNU Coreutils has
already expressed willingness to do it in /bin/printf).

Meanwhile, implementing 'printf -c' to mean "interpret my format
string according to C23 semantics" is also a viable idea, but one not
mentioned in the current incantation of the Austin Group bug. But
that's why the bug has a 30-day review period, to collect feedback
comments on how it can be better worded before Issue 8 is finalized.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH] printf: add %#s alias to %b
On Wed, Sep 06, 2023 at 10:45:09AM +0700, Robert Elz wrote:
> However, my "read of the room" at the minute is that this simply
> won't happen, and printf(1) %b will remain as it is, and not be
> removed any time soon (or probably, ever). If printf(1) ever really
> needs a method to output in binary, some other mechanism is likely
> to be found - most likely one which gives a choice of output bases,
> not just base 2.

You (anyone reading this, not just kre) are welcome to join tomorrow's
Austin Group meeting if you would like to add input on how to reword
the changes that will land in Issue 8 as a result of
https://austingroupbugs.net/view.php?id=1771; it is a Zoom call
(details at
https://www.mail-archive.com/austin-group-l@opengroup.org/msg11758.html).
Or you can add comments to the bug directly.

I will be on the call, and if nothing else, my role in the meeting
tomorrow will include summarizing some of the "read of the room" on
the feedback received in this thread (namely, enough shell authors are
insistent that printf(1) and printf(3) should diverge in %b behavior
in Issue 9 that trying to plan otherwise by marking it obsolescent in
Issue 8 isn't going to minimize any pain down the road).

> There's no current harm implementing %#s as an alias for %b - but I
> see no point in anyone using it, it will certainly be far less
> portable than %b for a LONG time. There's also no guarantee that the
> C people might not find a use in printf(3) for %#s (currently the #
> there has no meaning) and entirely possible that whatever that use
> is, if it happens, might be more useful for printf(1) to follow,
> than having it mean what %b currently means - so going that route
> really is not a safe long term choice (it would be a gamble).
Of course, the gamble is easier to win if we have multiple independent implementations that have all coordinated to do it the same way, so we can push back on WG14 to tell them they would be foolish to commandeer %#s for anything other than what existing practice has. Coreutils is willing to do it, but I have not actually committed that patch yet, waiting to see how this thread pans out. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH] printf: add %#s alias to %b
On Thu, Sep 07, 2023 at 11:53:54PM +0700, Robert Elz wrote: > And for those who have been following this issue, the new text for > the forthcoming POSIX version has removed any mention of obsoleting > %b from printf(1) - instead it will simply note that there will be > a difference between printf(1) and printf(3) once the latter gets its > version of %b specified (in C23, and in POSIX, in the next major version > that follows the coming one, almost certainly) - and to encourage > implementors to consider possible solutions. > > I've considered, and I don't see a problem needing solving, so I'm > intending to do precisely nothing, unless someone actually finds a > need for binary output from printf(1), which seems unlikely to > ever happen to me (I mean a real need, not just to be the same as printf(3) > "just because"). > > So, we can all go back to sleep now - and Chet, I'd undo %#s before it > appears in a release, there's no need, and having it might eventually > just cause more backward compat issues. Indeed, at this point, even though I proposed a patch for %#s in coreutils, I'm inclined to NOT apply it there. The ksh extension of %..2d to output in binary does sound worth replicating; I wonder if glibc would consider putting that in their printf(3); and I could see adding it to Coreutils (whether or not bash adds it - because ksh already has it). And thanks for pointing out the existing discrepancy with %c; that was really helpful in today's Austin Group meeting in realizing that the conflict in Issue 9 regarding %b is not covering new ground. > > And wrt: > | I don't know what potential uppercase/lowercase pairs of format specifiers > | are free from use in any existing POSIX-like shell, but my suggestion > would > > There are none, printf(3) belongs to the C committee, and they can make > use of anything they like, at any time they like. 
> > The best we can do is use formats that make no sense for printf(1) to > support (like %p, which in printf(3) prints a pointer value, but in > printf(1) there are no (meaningful) pointers that it could ever make > sense to print, so %p is useless for its printf(3) purpose in printf(1). > > Similarly all the size modifier chars are meaningless for printf(1), as > all the numeric values it is passed are actually strings - what internal > format they're converted into is unrelated to anything the printf(1) user > can control, so none of those size modifiers mean anything to printf(1) > either (but it seems that many of those have been usurped by various > printf(1) implementations already, so finding something free that everyone > could use, isn't easy). Here, I slightly disagree with you. Right now, both bash and coreutils' 'printf %hhd 257' outputs "257", but printf("%hhd", 257) in C outputs 1. I would LOVE to have a mode (possibly spelled 'printf -C %hhd 257') where I can ensure that width modifiers are applied to the integer value obtained from the correct subsequent argument to printf. [Side note: since bash also supports 'printf a%n b >/dev/null' as a convoluted way of accomplishing 'b=1', I wonder if it would be possible to port https://github.com/carlini/printf-tac-toe which performs an interactive game of tic-tac-toe in a single printf(3) statement invoked in a while loop into a single printf(1) command line invocation. The lack of %hhd implicitly masking with 256 makes it harder] That is, if we are thinking about adding 'printf -c' or 'printf -C' as a way to say "treat my format string as closely to C as possible", we would be addressing MULTIPLE things at once: %b, %c, %hhd, and any other (useful) conversion specifier in C. 
And given that, I prefer naming such an extension option -C or -c (implying C-like), rather than your suggestion of -b (implying binary, but where the implication only benefits %b) as a better option name for such a printf extension option. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
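[Editorial note, not part of the original message: a sketch of the %hhd divergence discussed above. C's printf(3) narrows the argument to 8 bits for hh, which for arguments whose low byte is below 128 is just the value modulo 256 — something shell arithmetic can mimic. The 'printf -C' option itself is hypothetical and exists nowhere yet:]

```shell
# What C's printf("%hhd", 257) prints: the value narrowed to 8 bits.
# For these arguments the narrowed value is simply the value mod 256.
echo $(( 257 % 256 ))   # 1  -- what C's %hhd prints for 257
echo $(( 300 % 256 ))   # 44 -- what C's %hhd prints for 300
# bash's and coreutils' printf(1) instead print 257 and 300 unchanged.
```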
Re: [PATCH] printf: add %#s alias to %b
On Thu, Sep 07, 2023 at 02:42:16PM +0700, Robert Elz wrote: > Date: Wed, 6 Sep 2023 11:32:32 -0500 > From: Eric Blake > Message-ID: > > > | You (anyone reading this, not just kre) are welcome to join tomorrow's > | Austin Group meeting > > Thanks, but I don't expect its time of day will coincide with > mine this week, at best I would be a half asleep zombie. > > | it is a Zoom call > > As best I understand it, zoom does not support NetBSD - which > is the only platform I use, which has been true for decades now > (previously I also used SunOS (not Solaris) and Ultrix). > > While it probably works on android (ie: phone) meeting use that > way would not be convenient for anyone - certainly not for me > staring at it all the time, and assuming that it works with > video enabled, not for anyone else with an image moving around > randomly... (my phone has no stand, I haven't been able to > find one which fits it). The meeting is now over, but for clarification, the Austin Group does audio-only meetings. Some weeks we use Zoom, some we use Webex (depends on who is available to run the meeting), but no one is on-screen, so a POTS dial-in always works at no disadvantage to someone unable/unwilling to run Zoom software (whether that be for reasons of not yet having a port available, or for Zoom not releasing their software under a license acceptable to your liking). Speak up if you think the Austin Group is ever unfairly crippling someone's right to participate by limiting the participation behind a paywall. > > | Or you can add comments to the bug directly. > > I have done that already, and probably will add one more. > > | Of course, the gamble is easier to win if we have multiple independent > | implementations that have all coordinated to do it the same way, so we > | can push back on WG14 to tell them they would be foolish to commandeer > | %#s for anything other than what existing practice has. > > Which worked how well with %b ? 
As Geoff commented on 1771, if someone had raised the issue about %b conflicting 6 months sooner, and pointed out the ksh extension of %..2d as an alternative, we may have had time to do so. https://austingroupbugs.net/view.php?id=1771#c6453 But because the Austin Group learned about the conflict so late in the game, we were already too late to push back on C2x at the time, putting us instead into the camp of seeing what consensus we could get from shell developers. This thread (and others like it) have been helpful - we DID get consensus (namely, that printf(1) and printf(3) have always diverged, so diverging further on %b is okay), and in today's Austin Group meeting we updated what will go into Issue 8 based on that feedback. I consider that to be a successful outcome, even if you may have felt heartburn through the intermediate stages of it all. > > Further, upon reflection, I think a better use of %#s in printf(1) > (no point in printf(3)) would be to explicitly output a string of > bytes (what %s used to do, before it was reinterpreted to output > characters instead). While the two might seem to be mostly the > same, that depends upon the implementation - if an implementation > treats strings of characters as arrays of wchar_t, and converts > from byte encoding to wchar_t on input, there's no guarantee that > the output (converted back from wchar_t to byte encoding) will be > identical to the input string. Sometimes that might not be > desirable and a method to simply copy the input string to the > output, as uninterpreted bytes might be useful to have. To me > that is a better use of %#s than as a %b clone - particularly > as %b needs the same kind of variant (%#b). This also deals > with the precision issue, %.1s is 1 character from the arg > string, %#.1s is one byte instead. That is indeed a cool idea, but one for the libc folks to take up. 
At any rate, I agree that burning %#s to be a synonym for %b precludes this useful idea (and it may be even more important in shell contexts, now that Issue 8 has taken efforts to make it clear that sometimes the shell deals with characters, and sometimes with bytes; in particular, environment variables can hold bytes that need not always form characters in the current locale). > > If there were to be anything worthy of announcing as deprecated > from posix printf(1) it would be %c - then we could make %c be > compat with its printf(3) meaning, where it takes a codepoint > as an int (just 8 bits in printf(3) but we don't neet to retain > that restriction) and outputs the associated character, rather > than just being an (almost) alias for %.1s -- where the almost > is because given '' as the arg string, %c is permitted to output > \0 or nothing, wher
Re: Idea: jobs(1) -i to print only :%ID:s
On Fri, Nov 10, 2023 at 01:22:54PM -0500, Greg Wooledge wrote: > It most definitely is *not* everywhere. It's part of GNU coreutils, > and is generally not present on any system that doesn't use those (BSDs > and commercial Unixes for example). From _seq(1)_ on FreeBSD: > The seq command first appeared in Version 8 AT&T UNIX. A seq command > appeared in NetBSD 3.0, and was ported to FreeBSD 9.0. This command > was based on the command of the same name in Plan 9 from Bell Labs and > the GNU core utilities. The GNU seq command first appeared in the 1.13 > shell utilities release. From _seq(1)_ on OpenBSD: > A seq command appeared in Version 8 AT&T UNIX. This version of seq > appeared in NetBSD 3.0 and was ported to OpenBSD 7.1.
Re: Bash Bug - Incorrect Printing of Escaped Characters
On Mon, Dec 25, 2023 at 05:00:37PM -0500, Seth Sabar wrote: > I'm reaching out to report what I believe to be a bug with the > *--pretty-print* feature in bash-5.2. Tangentially, this option doesn't seem to be documented outside of "bash --help": $ git clone https://git.savannah.gnu.org/git/bash.git Cloning into 'bash'... remote: Counting objects: 41221, done. remote: Compressing objects: 100% (5024/5024), done. remote: Total 41221 (delta 36225), reused 41045 (delta 36106) Receiving objects: 100% (41221/41221), 259.98 MiB | 15.65 MiB/s, done. Resolving deltas: 100% (36225/36225), done. $ cd bash/doc/ doc$ fgrep -r pretty texinfo.tex:% above. But it's pretty close. texinfo.tex: % and a tt hyphen is pretty tiny. @code also disables ?` !`. doc$ Eric
completion very slow with gigantic list
Hi, I noticed bash struggles with gigantic completion lists (100k items of ~70 chars each) It's reproducible with both LANG+LC_ALL set to en_US.UTF-8 and C, so it's not just locales slowing things down. This happens on the up-to-date `devel' branch (commit 584a2b4c9e11bd713030916d9d832602891733d7), but I first noticed this on Debian oldstable (5.1.4) strcoll and strlen seem to be at the top of profiles, and mregister_free when building devel with default options... ltrace reveals it's doing strlen repeatedly on the entire (100k items * 70 chars each = ~7MB) Sidenote: I'm not really sure what one would do with ~100K completion candidates, but I managed to hit that case when attempting completion for an NNTP group + IMAP mailbox listing. Standalone reproducer here: ---8<-- # bash struggles with giant completion list (100K items of ~70 chars each) # Usage: # . giant_complete.bash # giant_complete a # watch CPU usage spike # # derived from lei-completion.bash in https://80x24.org/public-inbox.git # There could be something wrong in my code, too, since I'm not # familiar with writing completions... _giant_complete() { # generate a giant list: local wordlist="$(awk
Re: completion very slow with gigantic list
"Dale R. Worley" wrote: > A priori, it isn't surprising. But the question becomes "What > algorithmic improvement to bash would make this work faster?" and then > "Who will write this code?" I'll try to take a look at it in a few months if I run out of things to do and nobody beats me to it. I've already got a lot on my plate and hit this on my way to other things.
static vs. dynamic scoping
On the Austin Group mailing list, David Korn (of ksh93 fame) complained[1] that bash's 'local' uses dynamic scoping, but that ksh's 'typeset' uses static scoping, and argued that static scoping is saner since it matches the behavior of declarative languages like C and Java (dynamic scoping mainly matters in functional languages like lisp): [1] https://www.opengroup.org/sophocles/show_mail.tpl?CALLER=show_archive.tpl&source=L&listname=austin-group-l&id=14951 I'm trying to standardize the notion of local variables for the next revision of POSIX, but before I can do so, I need some feedback on two general aspects: 1. Implementation aspect: How hard would it be to add static scoping to bash? Is it something that can be added in addition to dynamic scoping, via the use of an option to select the non-default mode (for example, 'local -d' to force dynamic, 'local -s' to force static, and 'local' to go with default scoping)? If both scoping forms are supported, is it worth making the default scoping dependent on posix compliance (for example, 'local' means dynamic scoping for 'set +o posix' but static scoping for 'set -o posix'), or should it be the same default for both modes? 2. User aspect: Is anyone aware of a script that intentionally uses the full power of dynamic scoping available through 'local' which would break if scoping switched to static? In particular, I know that the bash-completion project has fought with local variable scoping issues; would it help or hurt to switch to static scoping? Here's a sample shell script that illustrates the difference between the two scoping methods. 
$ ksh -c 'function f1 { typeset a=local; f2; echo $a; }; function f2 { echo $a; a=changed; }; a=global; f1; echo $a' global local changed $ bash --posix -c 'function f1 { typeset a=local; f2; echo $a; }; function f2 { echo $a; a=changed; }; a=global; f1; echo $a' local changed global In static scoping, function f2 does not shadow a declaration of a, so references to $a within f2 refer to the global variable. The local variable a of f1 can only be accessed within f1; the behavior of f2 is the same no matter how it was reached. In dynamic scoping, function f2 looks up its call stack for the closest enclosing scope of a variable named a, and finds the local one declared in f1. Therefore, the behavior of f2 depends on how f2 is called. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
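[Editorial note, not part of the original message: the dynamic-scoping side of the example can be reproduced as a standalone script. `local` is not in POSIX, but bash, dash, and most sh implementations support it with dynamic scope:]

```shell
# Under dynamic scoping, f2 resolves $a through its *callers*:
# when invoked from f1, it sees (and modifies) f1's local a.
f2() { echo "f2 sees: $a"; a=changed; }
f1() { local a=local; f2; echo "f1 after f2: $a"; }

a=global
f1                           # f2 sees: local / f1 after f2: changed
echo "global after f1: $a"   # global after f1: global (local was unwound)
```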
Re: bash: Correct usage of F_SETFD
On 11/22/2010 03:16 PM, Chet Ramey wrote: >> include/filecntl.h in bash-4.1 has the following: >> >> #define SET_CLOSE_ON_EXEC(fd) (fcntl ((fd), F_SETFD, FD_CLOEXEC)) >> >> Is that really the correct/intended usage of F_SETFD ? > > F_SETFD  Set the close-on-exec flag associated with fildes to > the low order bit of arg (0 or 1 as above). > >> If kernel ever adds a new flag to the fd, this would end up clearing the >> other new flag right ? >> >> Shouldn't bash use F_GETFD to get the current flags and set/clear just >> the FD_CLOEXEC bit ? > > I suppose it would matter if there are systems that have more than one > flag value. In practice, there aren't any such systems; but POSIX warns that current practice is no indicator of future systems, and that read-modify-write is the only way to use F_SETFD. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: bash: Correct usage of F_SETFD
On 11/23/2010 07:42 AM, Matthew Wilcox wrote: > The POSIX definition can be found here: > http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html > >> | In practice, there aren't any such systems; but POSIX warns that current >> | practice is no indicator of future systems, and that read-modify-write >> | is the only way to use F_SETFD. >> >> Yes, that seems to make more sense. > > I think future flags will be created such that they default to off, > and bash would have to affirmatively set them in order to use them. Not true. An implementation can reasonably define a new flag to off for backwards-compatible behavior, and on for POSIX-compatible behavior, if there is a case where traditional and POSIX behavior differ. POSIX permits additional bits to be on, and in fact requires that applications leave those additional bits unchanged, in the very case where those additional bits are essential for maintaining a POSIX-compatible environment. > > So if bash is the one creating its file descriptors, there's no need to > use R/M/W since it knows what the state of them are. No, bash cannot reasonably know what the implementation's default bit state is, and blindly setting all other bits to zero is very possibly a bug, and easy enough to avoid by using the full R/M/W. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Why `echo -n hello | while read v; do echo $v; done' prints nothing?
On 12/02/2010 04:04 AM, Clark J. Wang wrote: > Following command also prints nothing, confused :( > > for ((i = 0; i < 10; ++i)); do echo -n " $i"; done | while read v; do echo > $v; done http://www.faqs.org/faqs/unix-faq/shell/bash/ FAQ E4. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Why `echo -n hello | while read v; do echo $v; done' prints nothing?
On 12/02/2010 07:02 PM, Clark J. Wang wrote: >> The output from the first command in the pipeline does not end with a >> newline. Therefore, 'read' in the second command returns 'failure' >> (non-zero) when it reads the first line of input, and your loop never >> iterates. > > But is that reasonable? I think read should return success in this case > which makes more sense to me. Does the POSIX standards require that? POSIX requires that the input to read be a text file (and by the definition of text file in POSIX, it must either be empty or end in a newline). By violating POSIX and passing something that does not end in a newline, you are no longer bound by the rules of POSIX. Therefore, it would be a reasonable bash extension that read could return 0 status if it read data that did not end in a newline, but it would not be a standard-compliant script that relied on such an extension. You're better off supplying the trailing newline, and guaranteeing a compliant usage. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
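[Editorial note, not part of the original message: the behavior described above is easy to observe. read still assigns the partial line, but reports failure because EOF arrived before a newline, which is why a while-read loop over such input never runs its body:]

```shell
# No trailing newline: read hits EOF mid-line.  The data IS assigned,
# but the exit status is nonzero (1 in bash), so 'while read' stops
# before executing its body even once.
printf 'hello' | {
  read v
  echo "status=$? v=$v"    # status=1 v=hello
}

# Supplying the trailing newline restores the expected loop behavior:
printf 'hello\n' | while read v; do echo "$v"; done    # hello
```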
Re: argument precedence, output redirection
On 12/03/2010 07:46 AM, Payam Poursaied wrote: > > Hi all, > I'm not sure this is a bug or please let me know the concept: > What is the difference between: > ls -R /etc/ 2>&1 1>/dev/null > and > ls -R /etc/ 1>/dev/null 2>&1 POSIX requires that redirections are evaluated from left to right. The first line duplicates fd 2 from 1 (that is, stderr is now shared with stdout), then changes fd 1 onto /dev/null (so you've silenced stdout, and errors from ls will show up on your stderr). The second line changes fd 1 onto /dev/null, then duplicates fd 2 from 1 (that is, stderr is now shared with /dev/null, and you've silenced all output to either stream). > the second one redirect everything to /dev/null but the first one, still > prints errors (run as a non root user would unveil the problem) > it the order of arguments important? If yes, what is the idea/concept behind > this behavior? Yes the order is important, and the idea behind the behavior is that left-to-right evaluation order can be easily documented and relied on. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
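[Editorial note, not part of the original message: the left-to-right evaluation can be verified with a command that writes to both streams. `noisy` here is just an illustrative helper:]

```shell
# A command that writes one line to stdout and one to stderr:
noisy() { echo out; echo err >&2; }

# 2>&1 first: fd 2 duplicates the CURRENT fd 1 (the terminal), and only
# then is fd 1 pointed at /dev/null -- so errors still reach the screen.
noisy 2>&1 1>/dev/null     # prints: err

# 1>/dev/null first: fd 1 goes to /dev/null, then fd 2 duplicates that,
# so both streams are discarded.
noisy 1>/dev/null 2>&1     # prints nothing
```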
Re: Consume only up to 8 bit octal input for backslash-escaped chars (echo, printf)
[adding the Austin Group] On 12/07/2010 06:19 PM, Chet Ramey wrote: > On 12/7/10 11:12 AM, Roman Rakus wrote: >> This one is already reported on coreutils: >> http://debbugs.gnu.org/cgi/bugreport.cgi?msg=2;bug=7574 >> >> The problem is with numbers higher than /0377; echo and printf consumes all >> 3 numbers, but it is not 8-bit number. For example: >> $ echo -e '\0610'; printf '\610 %b\n' '\610 \0610' >> Should output: >> 10 >> 10 10 10 >> instead of >> � >> � � � > > No, it shouldn't. This is a terrible idea. All other shells I tested > behave as bash does*, bash behaves as Posix specifies, and the bash > behavior is how C character constants work. Why would I change this? > > (*That is, consume up to three octal digits and mask off all but the lower > 8 bits of the result.) POSIX states for echo: "\0num Write an 8-bit value that is the zero, one, two, or three-digit octal number num." It does not explicitly say what happens if a three-digit octal number is not an 8-bit value, so it is debatable whether the standard requires at most an 8-bit value (two characters, \0061 followed by 0) or whether the overflow is silently ignored (treated as one character \0210), or some other treatment. The C99 standard states (at least in 6.4.4.4 of the draft N1256 document): "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined." leaving '\610' as an implementation-defined character constant. The Java language specifically requires "\610" to parse as "\061" followed by "0", and this can be a very useful property to rely on in this day and age where 8-bit bytes are prevalent. 
http://austingroupbugs.net/view.php?id=249 is standardizing $'' in the shell, and also states: "\XXX yields the byte whose value is the octal value XXX (one to three octal digits)" and while it is explicit that $'\xabc' is undefined (as to whether it maps to $'\xab'c or to $'\u0abc' or to something else), it does not have any language talking about what happens when an octal escape does not fit in a byte. Personally, I would love it if octal escapes were required to stop parsing after two digits if the first digit is > 3, but given that C99 leaves it implementation defined, I think we need a POSIX interpretation to resolve the issue. Also, I think this report means that we need to tweak the wording of bug 249 (adding $'') to deal with the case of an octal escape where three octal digits do not fit in 8 bits (either by explicitly declaring it unspecified, as is the case with \x escapes; or by requiring implementation-defined behavior, as in C99; or by requiring explicit end-of-escape after two digits, as in Java). -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
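[Editorial note, not part of the original message: the uncontested part of the octal-escape behavior is easy to demonstrate with %b; the overflow case such as \0610 is exactly what the message above argues is unsettled, so no output is asserted for it:]

```shell
# \0NNN in a %b operand names the byte with that octal value:
# octal 101 = decimal 65 = ASCII 'A', octal 102 = 'B'.
printf '%b\n' '\0101'        # A
printf '%b\n' '\0101\0102'   # AB
# What '\0610' produces (mask to 8 bits? stop after two digits?) is
# the implementation-defined case debated in this thread.
```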
Re: cd with multiple arguments?
On 09/23/2010 10:47 AM, Keith Thompson wrote: > I'm not sure whether this is a bug (the documentation doesn't address > this case), but it's at least mildly annoying. > > If you invoke the "cd" commands with extra arguments after the directory > name, all the extra arguments are silently ignored. I would really love it if bash copied the zsh behavior here: cd a b tries to find a directory matching the extglob a+([$' \t'])b and fails if there are multiple matches or no matches. Certainly less typing than remembering to quote the argument, via cd 'a b'. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: bash 'let' can give error
On 12/09/2010 10:52 AM, Dominic Raferd wrote: > Description: > > $ val=0; let val++; echo $val,$?; unset val > 1,1 Not a bug. > > see the error code 1. Setting any other start value (except undefined) > for val does not produce this error, the problem occurs for let val++ > and let val-- if the start value is 0. let intentionally returns status 1 if the value was 0; and status > 1 if there was an error. Why? So you can do loops such as: countdown=10 while let countdown--; do ... ; done > Why does this happen? Is it 'by design'? Yes. The same as for 'expr' which is standardized by POSIX to have the same behavior. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
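[Editorial note, not part of the original message: since `let` is a bash extension, the same convention is easiest to check via expr, which POSIX standardizes with the identical status rules:]

```shell
# Status 1 means "the expression evaluated to zero or null", not
# "error"; genuine errors (bad syntax, etc.) use status 2 or higher.
expr 2 - 2 >/dev/null; echo $?   # 1: the value was 0
expr 2 - 1 >/dev/null; echo $?   # 0: the value was nonzero
# This is the property that makes 'while let countdown--' loops
# terminate exactly when the counter reaches zero.
```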
Re: bash 'let' can give error
On 12/10/2010 08:49 AM, Marc Herbert wrote: >> let intentionally returns status 1 if the value was 0; and status > 1 if >> there was an error. Why? So you can do loops such as: >> >> countdown=10 >> while let countdown--; do ... ; done >> >>> Why does this happen? Is it 'by design'? >> >> Yes. The same as for 'expr' which is standardized by POSIX to have the >> same behavior. > > This is a design mistake: it trades a few characters for a lot of confusion. It's required for 'expr'. But since 'let' is a bash extension, bash is free to change the semantics of 'let' to behave differently. However, doing so now would break backwards compatibility with existing scripts that have come to depend on this behavior, so unfortunately we're stuck with it. Remember, non-zero status is NOT always 'failure'; it is the documentation of each command that will tell you which status values imply failure. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Precedence of "C operators" and "bash operators"
On 12/18/2010 09:22 AM, 12bric wrote: > > Bash 4-1 manual indicates > - "The operators and their precedence, associativity, and values are > the same as in the C language" (pp. 27). > - the precedence of - and + operators is different then the precedence > of ! and ~. > > But in the book "C a reference manual" (Harbison & Steele), the four > operators + - ! ~ have the same precedence. The unary operators + and - have the same precedence as the unary ! and ~; and that level of precedence is higher than that of the binary + and -. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
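[Editorial note, not part of the original message: shell arithmetic makes the precedence distinction directly observable, since $(( )) follows the C rules:]

```shell
# Unary !, ~, -, + all bind at the same (high) level, tighter than
# binary + and -, so each unary operator applies before the addition:
echo $(( ! 5 + 3 ))    # (!5) + 3 = 0 + 3  = 3, not !(5+3) = 0
echo $(( ~ 0 + 5 ))    # (~0) + 5 = -1 + 5 = 4
echo $(( - 2 + 3 ))    # (-2) + 3 = 1
```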
Re: exit status question
On 12/20/2010 11:25 AM, Curtis Doty wrote: > Not exactly sure if this is a bug. But I don't understand why only the > first time running ((i++)) returns an error exit status. Because it follows the same semantics as 'expr', where status 1 is reserved for a successful run with value 0, and status 2 and above are reserved for errors. This was just brought up on the list earlier this month: http://lists.gnu.org/archive/html/bug-bash/2010-12/msg00087.html and seems to be a recurring question: http://lists.gnu.org/archive/html/bug-bash/2010-07/msg00121.html -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: read builtin does not return success code (0) in spite of it successfully read line
On 12/26/2010 01:29 PM, Stephane CHAZELAS wrote: > Bash behavior is the same as every other shell, is as documented > and as specified by POSIX. POSIX requires that the input to read be a text file. Since you aren't passing a text file, the behavior is undefined. POSIX does NOT require bash to return failure in this case, but neither does it require bash to return success. You should not rely on the behavior of read when the input does not end in a newline. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: read builtin does not return success code (0) in spite of it successfully read line
On 12/27/2010 10:59 AM, Stephane CHAZELAS wrote: > 2010-12-27, 09:43(-07), Eric Blake: > [...] >> On 12/26/2010 01:29 PM, Stephane CHAZELAS wrote: >>> Bash behavior is the same as every other shell, is as documented >>> and as specified by POSIX. >> >> POSIX requires that the input to read be a text file. Since you aren't >> passing a text file, the behavior is undefined. POSIX does NOT require >> bash to return failure in this case, but neither does it require bash to >> return success. You should not rely on the behavior of read when the >> input does not end in a newline. > [...] > > From > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html > > SUSv4> EXIT STATUS > SUSv4> > SUSv4> The following exit values shall be returned: > SUSv4> > SUSv4> 0 > SUSv4> Successful completion. > SUSv4> >0 > SUSv4> End-of-file was detected or an error occurred. > SUSv4> Also from the standard: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html STDIN The standard input shall be a text file. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html 1.4 Utility Description Defaults When an input file is described as a "text file", the utility produces undefined results if given input that is not from a text file, unless otherwise stated. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html 3.395 Text File A file that contains characters organized into zero or more lines. The lines do not contain NUL characters and none can exceed {LINE_MAX} bytes in length, including the <newline> character. > > So I think you can expect a non-zero status here. All you can expect is that you have undefined results. Undefined results include zero status. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Command substitution (backtick) and tab completion
On 12/31/2010 09:49 AM, Chris F.A. Johnson wrote: > Fri, 31 Dec 2010 11:49:26 -0500 (EST) linux system >>> with default installation settings. >>> >>> In an earlier version of bash (3.2), the following works >>> ls `pwd`/ (expands pwd). >>> >>> In bash 4.1 it does not. Am I missing a setting or something? >>> Thank you. >> >> Anybody that can at least verify this? Thanks. > >Yes, it works in 3.2 but not in 4.[012]. Personally, I find bash 3.2 behavior buggy - I _don't_ want `` (or $()) expanded by the mere act of tab-completion, as the command may have side effects that I am unwilling to have happen more than once when I hit the final Enter to accept the entire command line. Yes, that means that tab-completion is inherently limited when it cannot determine the context of which directory to look in because the left-hand side of a path name is hidden by a command substitution, but I'd rather live with that safety than have arbitrary commands running during completion. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: read builtin and readonly variables
On 01/03/2011 11:41 PM, Jan Schampera wrote: > Hello list, > > > the read builtin command, when trying to assign to a readonly variable > after reading the data, spits an error message. This is fine. > > But the return status is 0. It "always" (down to 2.04 was tested) has > been like that, and it's like that in upcoming 4.2. > > For me, this doesn't make sense. The read may have been successful, but > the data is gone. It would make sense to return !=0 here, IMHO. getopts also suffers from a difference in behavior between shells on readonly arguments: $ ksh -c 'readonly foo; getopts a: foo -a blah; echo $?' ksh[1]: ksh: foo: is read only $ echo $? 2 $ bash -c 'readonly foo; getopts a: foo -a blah; echo $?' bash: foo: readonly variable 1 where non-interactive ksh completely exited on an invalid assignment, but bash merely set $?. > > I also quickly cross-read POSIX, since such weirdness usually comes from > there ;-) but I didn't see anything obvious. I couldn't find anything either - the POSIX wording for readonly only mentions assignment and unset as requiring errors. I think that's an unintentional hole in POSIX, though, so I'm going ahead and submitting a bug report to have readonly also mention read and getopts as being required to error out on a readonly variable (and given that ksh treats assignment different than unset on whether a non-interactive shell exits, the extent of the reaction for getopts and read will probably have to allow both behaviors). -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: read builtin and readonly variables
[adding David Korn for a ksh bug] On 01/04/2011 08:25 AM, Chet Ramey wrote: >> getopts also suffers from a difference in behavior between shells on >> readonly arguments: >> >> $ ksh -c 'readonly foo; getopts a: foo -a blah; echo $?' >> ksh[1]: ksh: foo: is read only >> $ echo $? >> 2 >> $ bash -c 'readonly foo; getopts a: foo -a blah; echo $?' >> bash: foo: readonly variable >> 1 >> >> where non-interactive ksh completely exited on an invalid assignment, >> but bash merely set $?. > > The shell should not exit on an assignment error with getopts, since > getopts is not a special builtin. Good point - 'unset' is different than 'getopts' or 'read' when it comes to special builtin status, and I agree that only special builtins are allowed to exit a non-interactive shell on an assignment error (POSIX XBD 2.8.1 Consequences of Shell Errors). Even worse, neither ksh nor bash exit the shell on 'readonly foo; unset foo; echo $?', so ksh's behavior on 'getopts' seems like a ksh bug. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
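The distinction can be observed with bash itself, which treats a variable assignment error as fatal for a non-interactive shell only in POSIX mode. A small sketch (behavior as observed with recent bash; older versions may differ):

```shell
# Default mode: the error is reported, but the shell keeps going.
bash -c 'readonly x=1; x=2; echo survived'         # prints "survived"

# POSIX mode: a variable assignment error exits a non-interactive shell.
bash --posix -c 'readonly x=1; x=2; echo survived' # prints nothing
```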
Re: Command substitution (backtick) and tab completion
On 01/05/2011 09:57 AM, Greg Wooledge wrote: > On Wed, Jan 05, 2011 at 08:21:18AM -0800, chengiz wrote: >> So if I get this right, the only time this is a problem is when the >> command substitution runs more than once. > > I'd actually characterize it differently: it's unsafe to run arbitrary > commands during tab completion, because bash doesn't know what those > commands might do. > >> When does this happen? Not >> in my "ls `pwd`/" example where the command runs once and >> replaces itself with its output. Does it only run more than once when >> the ticks are not complete? > > You might realize you made a mistake, hit Ctrl-U, and start over. But > the backticked command has already been executed. > > You might hit ESC # to comment out the command line because you suddenly > realize that you need to do something else first. Then you come back to > it (ESC k k ...), remove the # sign, finish typing the command, and run > it. But the backticked command was already executed much earlier than > you might have wished (two commands ago). Here's another argument why I feel that completion should NEVER rewrite your command line (not even to replace `pwd` with its value): I have been known to do things like this for testing parallel development trees:

cd ~/dir
run tests via a single command
HOME=/path/to/alternate/
(to rerun cd)
(to rerun tests, in a different directory)

As long as tab completion on the cd command preserves the ~, this scenario works. But as soon as completion "helpfully" rewrites ~ into /path/to/home, it breaks my workflow. Okay, so that workflow of abusing the definition of $HOME to swap between parallel work trees might not be the most common, but I hope it points out why completion should not be in the business of rewriting users' commands, but only appending completions. 
And it can be applied to any other situation (substitute ~/$HOME with your choice of shell variable - if completion ever rewrites a command line with the expansion of $var rather than keeping literal $var in place, then you cannot alter $var in between repetitions of a command - even if completion had to temporarily expand $var in order to form better context about what I was completing after the point that $var appeared in my command line). -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
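The $HOME-swapping workflow above can be sketched concretely; the point is that the literal `cd ~/dir` stays reusable only because completion never rewrites the tilde (the paths here are temporary stand-ins for the two parallel trees):

```shell
a=$(mktemp -d)               # stand-in for the first work tree
b=$(mktemp -d)               # stand-in for the second
mkdir -p "$a/dir" "$b/dir"

HOME=$a
cd ~/dir                     # tilde expands through the current $HOME
echo "$PWD"

HOME=$b
cd ~/dir                     # the *same* literal command, a new directory
echo "$PWD"
```

Had completion rewritten `~/dir` to the first absolute path, the second `cd` would silently land in the old tree.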
Re: read builtin and readonly variables
On 01/04/2011 08:05 AM, Eric Blake wrote: > I couldn't find anything either - the POSIX wording for readonly only > mentions assignment and unset as requiring errors. I think that's an > unintentional hole in POSIX, though, so I'm going ahead and submitting a > bug report to have readonly also mention read and getopts as being > required to error out on a readonly variable (and given that ksh treats > assignment different than unset on whether a non-interactive shell > exits, the extent of the reaction for getopts and read will probably > have to allow both behaviors). I found some other differences between shells:

$ bash --posix -c 'cd /tmp; readonly PWD; echo $?; cd ..; echo $?-$PWD-$(pwd)' || echo abort,$?
0
bash: PWD: readonly variable
0-/tmp-/
$ bash -c 'cd /tmp; readonly PWD; echo $?; cd ..; echo $?-$PWD-$(pwd)' || echo abort,$?
0
bash: PWD: readonly variable
0-/tmp-/
$ ksh -c 'cd /tmp; readonly PWD; echo $?; cd ..; echo $?-$PWD-$(pwd)' || echo abort,$?
0
0-/-/

Bash goes ahead and changes the directory but leaves PWD untouched (PWD is now inconsistent without warning!) in both posix and bash mode, whereas ksh (silently) ignores the request to make PWD readonly in the first place. Also, both shells abort a non-interactive shell when readonly interferes with export (but bash only aborts in posix mode):

$ ksh -c 'readonly v; export v=a; echo $?-$a' || echo abort,$?
ksh: line 1: v: is read only
abort,1
$ bash -c 'readonly v; export v=a; echo $?-$a' || echo abort,$?
bash: v: readonly variable
1-
$ bash --posix -c 'readonly v; export v=a; echo $?-$a' || echo abort,$?
bash: v: readonly variable
abort,1

I've gone ahead and filed a POSIX interpretation request: http://austingroupbugs.net/view.php?id=367 Also, since the next version of POSIX will be mandating changes for cd (http://austingroupbugs.net/view.php?id=253 adds the new cd -e option to warn if PWD is inconsistent), the notion of a readonly PWD may affect how you implement that proposal. 
-- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: for; do; done regression ?
On 01/07/2011 08:39 AM, Chet Ramey wrote: > On 1/7/11 10:03 AM, Chet Ramey wrote: >> On 1/6/11 8:17 PM, Alexander Tiurin wrote: >>> Hi! >>> >>> I ran the command >>> >>> ~$ time for i in `seq 0 1` ; do echo /o/23/4 | cut -d'/' -f2 ; done >>>> /dev/null >>> >>> 6 times in a row, and noticed to the increase in execution time: >>> >> [...] >>> >>> how to interpret the results? >> >> It's hard to say without doing more investigation, but I suspect that the >> fork time is increasing because the bash binary is growing in size. >> >> I'd have to build a version with profiling enabled to tell for sure. > > I built a profiling version of bash-4.2 (without the bash malloc, since > linux doesn't let you replace malloc when you're profiling), and the > execution time was dominated by fork: around 55-60% of the time. That's > around 10-15 times more than any other function. Is it time to use posix_spawn() instead of fork() in the cases where we are spawning external processes? We would still have to use fork() for subshells, but as the bash binary grows larger, posix_spawn() becomes more of a win over fork(): it avoids the kernel overhead of marking a larger memory footprint as copy-on-write, work that is then immediately discarded by the exec(). -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
miscompilation at gcc -O2
Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic uname output: Linux office 2.6.35.10-74.fc14.x86_64 #1 SMP Thu Dec 23 16:04:50 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux Machine Type: x86_64-redhat-linux-gnu Bash Version: 4.1 Patch Level: 7 Release Status: release Description: There is a report of bash being miscompiled for cygwin when using gcc 4.3.4 -O2, but succeeding when compiled with -O1: http://cygwin.com/ml/cygwin/2011-02/msg00230.html Compiling with -Wextra reveals the culprit: execute_cmd.c: In function ‘execute_function.clone.2’: execute_cmd.c:4007:23: warning: variable ‘bash_source_a’ might be clobbered by ‘longjmp’ or ‘vfork’ execute_cmd.c:4007:39: warning: variable ‘bash_lineno_a’ might be clobbered by ‘longjmp’ or ‘vfork’ execute_cmd.c: In function ‘execute_in_subshell’: execute_cmd.c:1296:12: warning: variable ‘tcom’ might be clobbered by ‘longjmp’ or ‘vfork’ POSIX is clear that the value of an automatic variable changed between setjmp() and the subsequent longjmp() is unspecified unless the variable is marked volatile, but bash is violating this constraint and modifying several variables that cannot reliably be restored. Depending on what code transformations the compiler makes, this can lead to crashes; in cygwin's case, it appears that mere execution of a trap return handler can cause bash to corrupt its own stack. 
Repeat-By: make rm execute_cmd.o make CFLAGS='-Wextra -O2' Fix: --- execute_cmd.c.orig 2011-02-09 11:53:13.470850670 -0700 +++ execute_cmd.c 2011-02-09 11:53:48.422939088 -0700 @@ -1293,7 +1293,7 @@ int user_subshell, return_code, function_value, should_redir_stdin, invert; int ois, user_coproc; int result; - COMMAND *tcom; + COMMAND *volatile tcom; USE_VAR(user_subshell); USE_VAR(user_coproc); @@ -4004,7 +4004,7 @@ char *debug_trap, *error_trap, *return_trap; #if defined (ARRAY_VARS) SHELL_VAR *funcname_v, *nfv, *bash_source_v, *bash_lineno_v; - ARRAY *funcname_a, *bash_source_a, *bash_lineno_a; + ARRAY *funcname_a, *volatile bash_source_a, *volatile bash_lineno_a; #endif FUNCTION_DEF *shell_fn; char *sfile, *t; -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: bash 4.2, parameter expansion problem
On 02/14/2011 10:51 AM, Juergen Daubert wrote: > Hello, > > I stumbled over the following while trying to build xterm from sources > with bash 4.2: > > $:~> /bin/sh --version | head -n1 > GNU bash, version 4.2.0(1)-release (i686-pc-linux-gnu) > $:~> /bin/sh > sh-4.2$ a="${b:-'/foo/bar'}" > sh: bad substitution: no closing `}' in ${b:-'/foo/bar'} > sh-4.2$ a="${b:-'bar'}" > sh-4.2$ a="${b:-/foo/bar}" > sh-4.2$ > > looks like bash, when called as sh, doesn't like the / character in > single-quoted strings. Looks like a bug in how bash was modified for trying to obey this new POSIX rule: http://austingroupbugs.net/view.php?id=221 -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Do more testing before a release?
On 02/16/2011 09:51 PM, Clark J. Wang wrote: > I know little about open source development process (and control?). I just > don't know where to get the bash code (like CVS, SVN repository) before > it's released. I think it's better to make it open to more people so > everyone can help review and test before a stable release. Unlike most open source projects, Chet has chosen not to expose the daily repository. Your only option is to track release candidates, or to ask Chet to join the bash-testers list so you can also have access to his alpha builds a month or two before the official release. However, I do agree with your sentiment that if the daily repository were more open to the public, it would allow for a wider set of contributions from other developers. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: typeset -r prevents local variable of same name.
On 02/16/2011 08:13 PM, Chet Ramey wrote: > On 2/13/11 3:17 PM, ste...@syslang.net wrote: >> Configuration Information [Automatically generated, do not change]: >> Machine: i386 >> OS: linux-gnu >> Compiler: gcc >> Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i386' >> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i386-redhat-linux-gnu' >> -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' >> -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE >> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -pipe -Wall >> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector >> --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic >> -fasynchronous-unwind-tables >> uname output: Linux saturn.syslang.net 2.6.27.41-170.2.117.fc10.i686.PAE #1 >> SMP Thu Dec 10 10:48:30 EST 2009 i686 athlon i386 GNU/Linux >> Machine Type: i386-redhat-linux-gnu >> >> Bash Version: 3.2 >> Patch Level: 39 >> Release Status: release >> >> Description: >> First, I already submitted this bug from work, but I didn't >> realize that the address I sent from would not be allowed to receive >> a response. This address will work fine. >> >> If I declare a variable at the top scope using -r, it will prevent me >> from declaring a local copy in a subroutine. This problem happens in >> this version of bash as well as in bash4 under Fedora 14. > > This is intentional. A variable is declared readonly for a reason, and > readonly variables may not be assigned to. I don't believe that you > should be able to use a function to circumvent this. Consensus on today's Austin Group meeting was that since we are interested in standardizing local variables (or at least a subset of the 'typeset' special built-in's capabilities), this needs to be uniform across implementations. The Austin Group would favor the ability to create a local read-write variable that shadows a global read-only variable, which would entail a change to this bash behavior. 
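Until such a change lands, bash's current behavior is that the `local` declaration itself fails when it would shadow a readonly global - a small sketch of what the reporter ran into:

```shell
readonly x=1
f() { local x=2; }   # bash refuses: "local: x: readonly variable"
f
echo "status=$?"     # 1 -- the local declaration failed
echo "x=$x"          # x=1 -- the readonly global is untouched
```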
[Which reminds me - I still have the action item to propose wording for getting typeset into the next revision of POSIX] -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: typeset -r prevents local variable of same name.
On 02/17/2011 07:48 PM, Chet Ramey wrote: > Consider a quick, contrived example: an administrator writes a shell > package (library, set of functions, whatever) that includes, among > other things, ways to make sure that some other package is invoked with > a particular set of arguments and environment. He does this in part by > declaring some variables readonly. Programs invoked by this package > change their behavior depending on the value of environment variables, > so it's important to the correct operation of this script that the > variables don't change. It should be harder to cirvumvent this, possibly > creating a security hole, than just declaring a shell function with a > local variable that then calls a public function that expects the variable > to have some other value. Ah, so we're back to the debate of static vs. dynamic scoping. David Korn is insistent that if POSIX standardizes typeset that only static scoping be standardized, whereas bash currently only implements dynamic scoping (but static scoping could be added on top of that, via appropriate options to typeset). Overriding statically scoped variables is not a security risk, but overriding dynamically scoped variables is asking for problems. I agree with bash's current implementation restrictions, given its current scoping rules. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: typeset -r prevents local variable of same name.
On 02/17/2011 08:18 PM, Chet Ramey wrote: > On 2/17/11 10:12 PM, Eric Blake wrote: >> On 02/17/2011 07:48 PM, Chet Ramey wrote: >>> Consider a quick, contrived example: an administrator writes a shell >>> package (library, set of functions, whatever) that includes, among >>> other things, ways to make sure that some other package is invoked with >>> a particular set of arguments and environment. He does this in part by >>> declaring some variables readonly. Programs invoked by this package >>> change their behavior depending on the value of environment variables, >>> so it's important to the correct operation of this script that the >>> variables don't change. It should be harder to cirvumvent this, possibly >>> creating a security hole, than just declaring a shell function with a >>> local variable that then calls a public function that expects the variable >>> to have some other value. >> >> Ah, so we're back to the debate of static vs. dynamic scoping. > > Not really. The readonly variables could be declared at the global > scope. Overriding a global variable can cause the same problem. With static scoping the ONLY place that sees the local variable override is the intermediate shell function. If the intermediate function calls a public function, that public function will still see the (readonly) global variable. (Think C or ksh local variables.) It's only when dynamic scoping is in the mix, where the grandchild function sees the local variables of the intermediate function instead of the global variables, where you no longer want to allow overriding readonly variables. (Think lisp or bash local variables.) I fail to see how overriding a global variable with a statically scoped local can cause problems, since that local cannot be viewed outside the function in a language with static scoping. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
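The difference is easy to see in bash, whose local variables are dynamically scoped: a function called from another function sees the caller's locals, not the globals:

```shell
inner() { echo "$v"; }            # has no local of its own
outer() { local v=local-value; inner; }

v=global-value
inner    # prints "global-value" -- nothing shadows the global
outer    # prints "local-value"  -- inner sees outer's local (dynamic scoping)
```

With static scoping (as in ksh functions defined with the `function name { }` syntax), `inner` would print the global value both times, which is why overriding a readonly global there is harmless.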
Re: Strange bug in arithmetic function
On 02/21/2011 02:13 AM, Marcel de Reuver wrote: > In a bash script I use: $[`date --date='this week' +'%V'`%2] to see if > the week number is even. > Only in week 08 the error is: bash: 08: value too great for base > (error token is "08") the same in week 09, all others are Ok... 08 is an invalid octal number. Try forcing decimal instead: $[$(date --date='this week' +'10#%V')%2] -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
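The `10#` prefix forces base-10 interpretation regardless of leading zeros; a quick check across several week numbers:

```shell
for w in 07 08 09 10; do
  echo "$w -> $(( 10#$w % 2 ))"   # 10# keeps 08 and 09 from being read as octal
done
```

This prints 1, 0, 1, 0 for the four weeks, where the bare `$(( 08 % 2 ))` would abort with the "value too great for base" error.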
Re: How to match pattern in bash?
On 02/22/2011 08:24 PM, Peng Yu wrote: > Suppose that I have a variable $x, I want to test if the content of $x > match the pattern 'abc*'. If yes, then do something. (The operator == > doesn't match patterns, if I understand it correctly.) > > Is there such a build-in feature in bash? Or I have to rely on some > external program such as perl to test the pattern matching? case $x in abc*) ... ;; esac -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
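Besides `case`, bash's `[[` keyword also performs pattern matching when the right-hand side of `==` is left unquoted:

```shell
x=abcdef
if [[ $x == abc* ]]; then            # unquoted right side is a glob pattern
  echo match
fi
[[ $x == "abc*" ]] || echo "a quoted pattern is matched literally"
```

`case` remains the portable POSIX spelling; `[[` is a bash/ksh extension.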
Re: bash tab variable expansion question?
On 02/24/2011 03:14 PM, Michael Kalisz wrote: > $ echo $PWD/ > will expand the $PWD variable to your current directory > > while in bash, version 4.2.0(1)-release: > > $ echo $PWD/ > will just escape the $ in front of the $ variable i.e: > > $ echo \$PWD/ > The shell-expand-line (Ctrl-Alt-e) works but before I could use just TAB > > Any hints why? Any way to get the 4.1 behavior in 4.2? > > Can someone confirm... Is this a bug or a feature? I'm not the developer, but in my mind, this is a welcome feature. TAB-completion should NOT modify what I typed, and I consider the 4.1 behavior to be the bug. Consider if I have parallel directory structures a/c and b/c. If I do:

d=a
$d/c/test
d=b
$d/c/test

I want to run two different programs. Now, instead of a one-letter name, consider that it is something longer, like $HOME. If typing TAB expands the variable, instead of keeping it intact, then I can't do:

$HOME/c/t-TAB
HOME=b
UP-UP-ENTER

to repeat my test in a new directory, since tab completion wiped out the fact that I want to evaluate $HOME every time. (The same goes for command substitution - bash should never prematurely lock me in to a single expansion during tab completion.) -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: [bash-bug] no local bash_history created or written to if existing (~/.bash_history
On 03/08/2011 11:00 AM, Andreas Schwab wrote: > "Dr. Werner Fink" writes: > >> On Tue, Mar 08, 2011 at 12:02:53PM -0500, Chet Ramey wrote: >>>> >>>> Does this mean that the attached patch could also not work >>>> on some systems? Or does this interfere with the readline >>>> library? >>> >>> Since longjmp is not on the list of functions that is safe to call >>> from a signal handler, yes, that's what it means. OTOH, this shows >>> promise as a solution. >> >> OK, that means only for systems with HAVE_POSIX_SIGSETJMP >> defined. At least this provides a (local) solution here > > sigsetjmp is the same as setjmp. Both will lead to deadlocks. sigsetjmp is safe to call from a signal handler _if_ you are in the handler because of a synchronous signal (such as one generated internally by raise(), and not generated externally and asynchronously by kill()), or if you can prove that it was not interrupting any other non-async-signal-safe function (however, blocking all signals around all calls to non-async-signal-safe functions is very inefficient). But deadlock is indeed possible if an asynchronous SIGHUP occurs while the malloc() lock is held (if you try to malloc() in the cleanup code, but siglongjmp() left the middle of earlier code that already held the malloc() lock, then you indeed have deadlock). Which is why POSIX does not list siglongjmp() as an async-signal-safe function, because after siglongjmp(), you are generally only safe to invoke async-signal-safe functions, which is no better than invoking those same functions directly within the signal handler itself in the first place. 
Really, the only safe way to handle things like SIGHUP cleanup is to have the signal handler record that an exception occurred, then have the main processing loop checking that variable frequently enough to do cleanup in a reasonable time-frame (possibly by using a pipe-to-self if the main loop is waiting on select()), where the main loop then re-raises the signal after doing cleanup at a point where all functions are safe. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
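The same flag-and-poll pattern works at the shell level too; here is a sketch using SIGUSR1 to stand in for SIGHUP (so the demo doesn't terminate anything):

```shell
flag=0
trap 'flag=1' USR1          # the handler only records that the signal arrived
kill -USR1 "$$"             # simulate an asynchronous signal

while (( flag == 0 )); do   # the main loop polls the flag...
  sleep 0.1
done
echo "cleanup runs here, in normal context, not inside a handler"
trap - USR1                 # a real script would now re-raise the signal
```

The C-level version is the same idea with a `volatile sig_atomic_t` flag (or a pipe-to-self to wake a blocking `select()`).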
Re: variable name and its' value are the same characters causes recursion error
On 03/09/2011 02:54 PM, Chet Ramey wrote: >> >> For example: >> >> unset a; declare a="a"; [[ a -lt 3 ]]; echo $? >> bash: [[: a: expression recursion level exceeded (error token is "a") >> 1 >> >> Shouldn't the return code from this expression be 2, rather than 1? > > What does it matter? Failure is failure. Except that [[ explicitly documents that 0 and 1 imply a syntactically valid expression, reserving $?==2 for catastrophic failure. The argument here is that infinite recursion is catastrophic and represents an invalid expression, and should not be confused with $?==1 meaning a valid but false expression. This is similar to expr(1), which on success returns 0 or 1 depending on the value, and greater than 1 on failure. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
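For reference, the documented 0/1 convention for `[[` looks like this (with 2 reserved for errors such as the recursion case above):

```shell
[[ abc == abc ]];  echo $?   # 0 -- expression is valid and true
[[ abc == abcd ]]; echo $?   # 1 -- expression is valid but false
```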
Re: Problem with open and rm
On 03/16/2011 04:54 AM, Barrie Stott wrote: > The script that follows is a cut down version of one that came from elsewhere. > > #!/bin/bash > > cp /tmp/x.html /tmp/$$.html > ls /tmp/$$.html > [ "$DISPLAY" ] && open /tmp/$$.html > ls /tmp/$$.html > rm -f /tmp/$$.html Instead of passing Safari the name of a temporary file, why not pass it the name of a temporary pipe? Does this work any better: [ "$DISPLAY" ] && open <(cat /tmp/x.html) at which point there's no temporary file to remove. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Using alias to commands disables its completion
On 03/24/2011 10:54 AM, Gustavo Serra Scalet wrote: > Description: > When an alias is supplied to a command (e.g: alias c="cd") the > complete doesn't complete like it does when using the original text > without the alias (e.g. $ c [TAB] # returns all files, not just > directories) > > It doesn't seem to pose any technical problem to make the detection of > completion also work through aliases (since when [TAB] is hit the alias > is also interpreted). The lame answer: but you can already do this yourself! Write a shell function around alias that calls 'command alias' to do the real work, then 'complete -p' on the first word of the new expansion, and finally 'complete ...' on the new alias name, so that you copy any completion properties tied to the old name over to the new name. As long as you define your programmable completions before your aliases as part of your ~/.bashrc startup sequence, this works. Even better, submit that as an enhancement request to the bash-completion project, to have bash-completion provide that wrapper around alias automatically as part of its startup. The answer you sort of wanted: yes, it would be nice to patch bash's completion routines to add an opt-in ability to check for programmed completion associated with whatever the alias expanded to, and use that when there is no completion already associated with the aliased name. But someone has to write such a patch. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
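A rough sketch of that "lame answer" wrapper (hypothetical; it assumes a simple `name=value` argument and that the target's completion is defined before the alias):

```shell
alias() {
  local def=$1 name=${1%%=*} target spec
  builtin alias "$def" || return       # do the real work first
  target=${def#*=}
  target=${target%% *}                 # first word of the alias expansion
  spec=$(complete -p "$target" 2>/dev/null) || return 0
  eval "${spec% *} ${name}"            # same completion options, new name
}

complete -d cd       # stand-in for an existing completion setup
alias c="cd"
complete -p c        # now prints: complete -d c
```

A production version would need to handle multiple arguments, `alias -p`, and quoted words inside the saved spec.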
Re: eval
[adding bug-autoconf to document a NetBSD bug] On 05/05/2011 07:23 AM, Chet Ramey wrote: > On 5/4/11 4:40 PM, Rafael Fernandez wrote: >> I am used to the behavior of sh on NetBSD which does not require >> enclosing positional parameters. > > I'd call that a pretty serious incompatibility on the part of ash and its > descendants (BSD sh, dash, etc.). There's no good reason that > > set -- a b c d e f g h i j > echo $10 > > should echo `j'. Also a POSIX violation: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02 "The parameter name or symbol can be enclosed in braces, which are optional except for positional parameters with more than one digit or when parameter is followed by a character that could be interpreted as part of the name." And worth documenting as a portability pitfall in the autoconf manual. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: eval
On 05/05/2011 08:43 AM, Chet Ramey wrote: >> "The parameter name or symbol can be enclosed in braces, which are >> optional except for positional parameters with more than one digit or >> when parameter is followed by a character that could be interpreted as >> part of the name." > > I agree with this interpretation, but the following sentence can be > interpreted as placing the burden on the shell programmer: > > "When a positional parameter with more than one digit is specified, the > application shall enclose the digits in braces (see Parameter Expansion)." > > Still, sh has required the braces since time immemorial. It makes no > sense that ash would have done it differently Additionally from POSIX: "If the parameter name or symbol is not enclosed in braces, the expansion shall use the longest valid name (see XBD Name)" Then XBD Name: "In the shell command language, a word consisting solely of underscores, digits, and alphabetics from the portable character set. The first character of a name is not a digit." In "$10", 10 is not a name, so the longest name after $ is the empty string, and in place of a name, we use the single-character symbol 1 instead, such that this MUST be parsed as ${1}0, not as ${10}. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
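So the required parse can be demonstrated directly in bash:

```shell
set -- a b c d e f g h i j
echo "$10"     # prints "a0": parsed as ${1} followed by a literal 0
echo "${10}"   # prints "j": the braces select the tenth positional parameter
```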
Re: Case insensitivity seems to ignore lower bound of interval
On 05/05/2011 10:59 AM, Aharon Robbins wrote: >> No need. It's already been attempted in the past. >> http://sourceware.org/bugzilla/show_bug.cgi?id=12045 >> http://sourceware.org/bugzilla/show_bug.cgi?id=12051 > > The first one looks like the main one. Has Ulrich responded to it? 12045 is a documentation request (to at least mention why CEO [collation element ordering] appears screwy for different locales, and how to properly write a locale file so that CEO does what you normally expect), which, to my knowledge, Uli hasn't responded to. 12051 is a request to change away from CEO to any other scheme, and Uli flat out rejected that, even though POSIX no longer requires CEO. In other words, Uli says that the burden is on the locale file writer, and not on glibc. But without good documentation on how to write a proper locale file, and without prods to all of the owners of broken locale files, it's an uphill battle to get CEO ordering to consistently be useful. And that only covers you if you are on a system with glibc which uses CEO semantics, rather than on any other system where the libc semantics of range expressions has who-knows-what behavior. Yes, it would really be nice if all four of bash, gawk, sed, and grep could agree on the same interpretation of non-C range semantics, and implement that regardless of the underlying libc behavior. But what semantics should we settle on? It's an age-old problem, with no nice solutions. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Yet another quoting question
On 05/06/2011 09:02 AM, Steven W. Orr wrote: > 4.0.35(1)-release (x86_64-redhat-linux-gnu) > > I have a bunch of arrays, and some of the arrays' values are null or > might contain spaces. printf %q is your friend. $ a[0]=aaa a[1]= a[2]='bbb ccc' $ printf '%q ' "${a[@]}" aaa '' bbb\ \ ccc > I'd love to see an elegant solution. :-) OK, so printf %q prefers '' and \ over "", but the end result is an unambiguous array listing that can be fed back into shell code. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
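The %q output can be fed back through eval to reconstruct the array, empty elements and embedded spaces intact - a sketch:

```shell
a=(aaa '' 'bbb  ccc')
quoted=$(printf '%q ' "${a[@]}")   # one shell-quoted word per element
eval "b=( $quoted )"               # reparse into a new array
echo "${#b[@]}"                    # 3 -- the element count survives
[ "${b[2]}" = "${a[2]}" ] && echo "embedded spaces survive too"
```

(On newer bash, `declare -p a` serves a similar round-trip purpose.)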
Re: Bash-4.2 patches 9 and 10
On 05/13/2011 04:54 AM, Joachim Schmitz wrote: > Hi folks > > Just found Bash-4.2 patches 9 and 10, strange that they never got > announced? Yes they did: http://lists.gnu.org/archive/html/bug-bash/2011-05/msg00014.html http://lists.gnu.org/archive/html/bug-bash/2011-05/msg00015.html Along with an accidental announcement for a missing patch 11: http://lists.gnu.org/archive/html/bug-bash/2011-05/msg00028.html -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Shell case statements
[adding bug-bash] On 05/16/2011 07:23 PM, Wayne Pollock wrote: > (While cleaning up the standard for case statement, consider that it is > currently > unspecified what should happen if an error occurs during the expansion of the > patterns; as expansions may have side-effects, when an error occurs on one > expansion, should the following patterns be expanded anyway? Does it depend > on > the error? It seems reasonable to me that any errors should immediately > terminate > the case statement.) Well, that's rather all over the place, but yes, it does seem like bash was the buggiest of the lot, compared to other shells. Interactively, I tested:

readonly x=1
case 1 in $((x++)) ) echo hi1 ;; *) echo hi2; esac
echo $x.$?

bash 4.1 printed:

bash: x: readonly variable
hi1
1.0

which means it matched '1' to $((x++)) before reporting the failure to assign to x, and the case statement succeeded. Changing the first "1" to any other string printed hi2 (the * case). zsh printed:

zsh: read-only variable: x
1.0

which means it aborted the case statement before executing any clauses, but left $? at 0. ksh printed:

ksh: x: is read only
1.1

which means that both the case statement was aborted, and $? was impacted. dash printed:

dash: arithmetic expression: expecting primary: "x++"
1.2

so it was like ksh other than the choice of error status. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Shell case statements
On 05/20/2011 09:33 AM, Chet Ramey wrote: >> Well, that's rather all over the place, but yes, it does seem like bash >> was the buggiest of the lot, compared to other shells. Interactively, I >> tested: >> >> readonly x=1 >> case 1 in $((x++)) ) echo hi1 ;; *) echo hi2; esac >> echo $x.$? >> >> bash 4.1 printed: >> bash: x: readonly variable >> hi1 >> 1.0 >> which means it matched '1' to $((x++)) before reporting the failure >> to assign to x, and the case statement succeeded. Changing the first "1" >> to any other string printed hi2 (the * case). > > Thanks for the report. This was an easy fix. The variable assignment > error was actually handled correctly, the expression evaluation code > just didn't pay enough attention to the result. How about the even simpler: $ bash -c 'readonly x=5; echo $((x=5))'; echo $? bash: x: readonly variable 5 0 $ Other shells abort rather than running echo: $ ksh -c 'readonly x=5; echo $((x=5))'; echo $? ksh: line 1: x: is read only 1 $ zsh -c 'readonly x=5; echo $((x=5))'; echo $? zsh:1: read-only variable: x 1 $ dash -c 'readonly x=5; echo $((x=5))'; echo $? dash: x: is read only 2 $ -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: BUG? RFE? printf lacking unicode support in multiple areas
On 05/20/2011 02:30 PM, Linda Walsh wrote: > i.e. it's showing me a 16-bit value: 0x203c, which I thought would be the > wide-char value for the double-exclamation. Going from the wchar > definition > on NT, it is a 16-bit value. Perhaps it is different under POSIX? but > 0x203c taken as 32 bits with 2 high bytes of zeros would seem to specify > the same codepoint for the Dbl-EXcl. POSIX allows wchar_t to be either 2-byte or 4-byte, although only a 4-byte wchar_t can properly represent all of Unicode (with 2-byte wchar_t as on windows or Cygwin, you are inherently restricted from using any Unicode character larger than 0xffff if you want to maintain POSIX compliance). > >> Since there is no way to produce a word containing a NUL character it is >> impossible to support %lc in any useful way. > > That's annoying. How can one print out unicode characters > that are supposed to be 1 char long? I think you are misunderstanding the difference between wide characters (exactly one wchar_t per character) and multi-byte characters (1 or more char [byte] per character). Unicode can be represented in two different ways. One way is with wide characters (every character represents exactly one Unicode codepoint, and code points < 0x100 have embedded NUL bytes if you view the memory containing those wchar_t as an array of bytes). The other way is with multi-byte encodings, such as UTF-8 (every character occupies a variable number of bytes, and the only character that can contain an embedded NUL byte is the NUL character at codepoint 0). Bash _only_ uses multi-byte characters for input and output. %lc only uses wchar_t. Since wchar_t output is not useful for a shell that does not do input in wchar_t, that explains why bash printf need not support %lc. POSIX doesn't require it, at any rate, but it also doesn't forbid it as an extension. 
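To make the wide-vs-multi-byte distinction concrete, here is a quick illustration (a sketch, not part of the original message): U+203C DOUBLE EXCLAMATION MARK is one character, but its UTF-8 multi-byte form occupies three bytes, which is what bash and the standard utilities actually read and write.

```shell
# U+203C as a multi-byte (UTF-8) sequence: one character, three bytes.
# The octal escapes 342 200 274 are the bytes 0xE2 0x80 0xBC.
printf '\342\200\274' | wc -c
```

The byte count is 3 regardless of locale, since wc -c counts bytes, not characters.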
> This isn't just a bash problem given how well most of the unix "character" > utils work with unicode -- that's something that really needs to be solved > if those character utils are going to continue to be _as useful_ in the > future. > Sure they will have their current functionality which is of use in many > ways, but > for anyone not processing ASCII text it becomes a problem, but this > isn't really > a bash issue. Most utilities that work with Unicode work with UTF-8 (that is, with multi-byte-characters using variable number of bytes), and NOT with wide characters (that is, with all characters occupying a fixed width). But you can switch between encodings using the iconv(1) utility, so it shouldn't really be a problem in practice in converting from one encoding type to another. > That said, it was my impression that a wchar was 16-bits (at least it > is on MS). Is it different under POSIX? POSIX allows 16-bit wchar_t, but if you have a 16-bit wchar_t, you cannot support all of Unicode. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
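A small round-trip sketch of the iconv(1) suggestion above (an illustration added here, assuming a glibc-style iconv with UTF-8 and UTF-16LE converters installed): the same text survives conversion to a fixed-width-style encoding and back.

```shell
# Convert a UTF-8 string (containing the three-byte U+203C) to UTF-16LE
# and back; the round trip reproduces the original bytes.
printf 'double-bang: \342\200\274\n' |
  iconv -f UTF-8 -t UTF-16LE |
  iconv -f UTF-16LE -t UTF-8
```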
Re: documentation bug re character range expressions
On 06/03/2011 10:15 AM, Marcel (Felix) Giannelia wrote: > Alright -- assuming that for the moment, how does one specify > [ABCDEFGHIJKL] using [[:upper:]]? This is something that I haven't seen > documented, and I'm genuinely curious. [ABCDEFGHIJKL]. If you ever want a subset of [[:upper:]], the _only_ portable ways to guarantee that you are getting just that subset are to use the C locale or to spell out the range yourself. In short, ranges are non-portable in all other locales. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
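A sketch of the spell-it-out advice above (the function name is invented for illustration): an explicit bracket set matches the same characters in every locale, where the range [A-L] would have unspecified members outside the C locale.

```shell
# Membership test using a spelled-out set instead of a locale-dependent range.
in_first_half() {
  case $1 in
    [ABCDEFGHIJKL]) echo yes ;;
    *)              echo no  ;;
  esac
}
in_first_half C    # yes
in_first_half X    # no
in_first_half c    # no - lowercase is not in the spelled-out set
```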
bug in 'set -n' processing
Bash has a bug: ${+} is syntactically invalid, as evidenced by the error message when running the script, yet using 'set -n' was not able to flag it as an error. $ echo $BASH_VERSION 4.2.8(1)-release $ bash -c 'echo ${+}'; echo $? bash: ${+}: bad substitution 1 $ bash -cn '${+}'; echo $? 0 $ ksh -cn '${+}'; echo $? ksh: syntax error at line 1: `+' unexpected 3 Meanwhile, a feature request: since $+ outputs a literal "$+", it is proof that + cannot be a valid variable name. Bash should follow ksh's lead by having 'set -n' warn about suspicious but usable constructs, at least when --posix is not in effect. $ bash -c 'echo $+' $+ $ bash -cn '$+'; echo $? 0 $ ksh -cn '$+'; echo $? ksh: warning: line 1: $ not preceded by \ 0 -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: documentation bug re character range expressions
On 06/03/2011 11:36 AM, Marcel (Felix) Giannelia wrote: > It sounds to me like what you're saying is, the *only* uses of bracket > range expressions guaranteed to be "portable" are things like [[:upper:]] > and [[:lower:]]. But I put "portable" in quotation marks just then, > because to my mind the word "portable" implies "has the same behaviour on > all systems", whereas things like [[:upper:]] are locale-dependent; they > change their behaviour depending on system settings. Actually, that _is_ portable, because if you have the _same_ locale on two different machines, you will get the _same_ locale-dependent behaviors from those two machines. But you do have a point - even POSIX admits that different vendors have varying locale description files, so the "en_US" locale is not consistent between two machines if the locale definitions were not written by the same person. > [0-9] presumably still works consistently across all platforms -- I hope? [0-9] is a special case. It is the _ONLY_ range that POSIX requires to be invariant across all locales, so in practice, you can use this range expression to your heart's content. But there is still the POSIX wording that [0-9] may have implementation-defined behavior for non-C locales, so you might _still_ be safer using [0123456789] or [[:digit:]]. > > I think a good solution to this, then, is to just deprecate the use of "-" > in bracket expressions entirely. Which is effectively what POSIX did by stating that range expressions outside of the C locale have unspecified effects. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
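Following the advice above, a minimal digit-validation helper (a sketch; the name is invented) that avoids ranges entirely by spelling out the digits:

```shell
# True only for a non-empty string consisting solely of ASCII digits.
# [!0123456789] spells out the complement set, so no locale-dependent
# range is involved.
is_digits() {
  case $1 in
    ''|*[!0123456789]*) return 1 ;;
    *)                  return 0 ;;
  esac
}
is_digits 2011 && echo accepted
is_digits 20x1 || echo rejected
```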
Re: exit-on-error (set -e) option and command after "if" not found
On 06/22/2011 06:51 AM, Dmitry Bolshakov wrote: > Hi > > set -e > if nosuchcommand | grep blabla > then > echo ERROR: blabla was found > exit 1 > fi > echo it was unexpected for me, I thought this line should not be echoed Not a bug; this last echo is supposed to happen. The documentation for 'set -e' is clear: Exit immediately if a pipeline (which may consist of a single simple command), a subshell command enclosed in parentheses, or one of the commands executed as part of a command list enclosed by braces (see SHELL GRAMMAR above) exits with a non-zero status. The shell does not exit if the command that fails is part of the command list immediately following a while or until keyword, part of the test following the if or elif reserved words, part of any command executed in a && or || list except the command following the final && or ||, any command in a pipeline but the last, or if the command's return value is being inverted with !. Since the pipeline 'nosuchcommand | grep blabla' is executed as part of the test following an 'if', then 'set -e' does not abort the shell on failure. 'set -e' is a bear to use - it generally does not protect you from everything that you think it ought to, and has a number of portability bugs to boot as you migrate between versions of bash or between other shells. Not to mention that the POSIX folks can't even seem to get it right; the definition of 'set -e' had to be amended even after POSIX 2008 to match historical practice, which in turn disagreed with bash practice at the time. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
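The quoted report can be condensed into a self-contained script that demonstrates the documented behavior (a sketch of the original reproducer):

```shell
#!/bin/bash
set -e
# The failing pipeline is the test of an 'if', so set -e does not fire:
# 'nosuchcommand' fails with 127, grep exits 1 on empty input, the test
# is false, and execution continues past the 'fi'.
if nosuchcommand 2>/dev/null | grep blabla; then
  echo "ERROR: blabla was found"
fi
echo "still running"    # reached; the script exits 0
```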
Re: weird behavior of set -e
On 06/24/2011 03:51 AM, Harald Dunkel wrote: > Hi folks, > > A colleague pointed me to this problem: If I run > > ( set -e; ( false; echo x ) ) > > in bash 4.1.5, then there is no screen output, as > expected. If I change this to > > ( set -e; ( false; echo x ) || echo y ) > > then I get "x" instead of "y". How comes? Because '(false; echo x)' is on the left hand of ||, which disables set -e for that portion of the command line. ksh behaves the same way, so it is not a bash bug. > Any helpful comment would be highly appreciated. set -e seldom does exactly what you want - even the writers of POSIX 2008 got it wrong, and here's how they corrected it: http://austingroupbugs.net/view.php?id=52 and that's what bash 4.1 implemented. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Yet Another test option
On 07/06/2011 10:37 AM, Bruce Korb wrote: > On 07/06/11 09:03, Chet Ramey wrote: >>> /usr/bin/test ? >>> >>> Do this first in the binary then migrate to bash's test? >> >> I was actually making an argument for an entirely separate utility to do >> this. That could be a shell script encapsulating the proper version >> comparison logic. > > which basically means a script wrapping "sort -V" and testing whether > the arguments got reordered or not: > > if test "X$1" = "X$3" > then is_eq=true ; is_lt=false > else > is_eq=false > first=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -1) > test "X$first" = "X$1" && is_lt=true || is_lt=false > fi Oh, that's rather heavyweight - a command substitution and 3 pipeline components. Why not just one child process, by using sort -c and a heredoc?

is_eq=false is_lt=false
if test "x$1" = "x$2"; then
  is_eq=true
elif sort -cV 2>/dev/null <<EOF
$1
$2
EOF
then
  is_lt=true
fi

> and if that proved insufficient, then "sort -V" would need an adjustment. > I would not expect "sort -V" and a version test to disagree. The code that coreutils uses for 'sort -V' is part of gnulib - the filevercmp module. That file (filevercmp.c) is pretty stable nowadays, with the last algorithmic change being in April 2009 and no recent complaints about unexpected behavior (whereas glibc's strverscmp is locked into behavior, but that behavior raises complaints). For reference, the documentation is: /* Compare version strings: This function compares strings S1 and S2: 1) By PREFIX in the same way as strcmp. 2) Then by VERSION (most similarly to version compare of Debian's dpkg). Leading zeros in version numbers are ignored. 3) If both (PREFIX and VERSION) are equal, strcmp function is used for comparison. So this function can return 0 if (and only if) strings S1 and S2 are identical. It returns number >0 for S1 > S2, 0 for S1 == S2 and number <0 for S1 < S2. 
This function compares strings, in a way that if VER1 and VER2 are version numbers and PREFIX and SUFFIX (SUFFIX defined as (\.[A-Za-z~][A-Za-z0-9~]*)*) are strings then VER1 < VER2 implies filevercmp (PREFIX VER1 SUFFIX, PREFIX VER2 SUFFIX) < 0. This function is intended to be a replacement for strverscmp. */ However, I don't see any reason to add extensions to coreutils' test unless we have some agreement that we plan to add the same extension to other places like the bash builtin test at the same time. Since we've already demonstrated that version comparisons are a pretty trivial wrapper around sort, I'm not seeing much justification in favor of bloating test to make version testing builtin. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
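Putting the thread's conclusion together, here is one possible wrapper (a sketch with a hypothetical helper name; it relies on the GNU 'sort -V' extension and so is not POSIX-portable):

```shell
# version_lt A B: succeed when A sorts strictly before B in version order.
version_lt() {
  test "$1" = "$2" && return 1          # equal is not less-than
  test "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1"
}
version_lt 1.9 1.10 && echo '1.9 < 1.10'
version_lt 2.0 2.0  || echo '2.0 is not < 2.0'
```

The equality guard is needed because two equal strings would otherwise also satisfy the head -n1 check.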
Re: Problem with line buffered IO when no tty
On 07/07/2011 06:55 AM, Steven W. Orr wrote: > So, why is it that bash is behaving like it is always line buffered or > unbuffered, even if there is no console? Because POSIX requires 'sh' to behave like it is unbuffered: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html When the shell is using standard input and it invokes a command that also uses standard input, the shell shall ensure that the standard input file pointer points directly after the command it has read when the command begins execution. It shall not read ahead in such a manner that any characters intended to be read by the invoked command are consumed by the shell (whether interpreted by the shell or not) or that characters that are not read by the invoked command are not seen by the shell. Bash meets this requirement by reading one byte at a time on non-seekable input, rather than relying on the decision of libc on whether stdin defaults to fully-buffered or line-buffered. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
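The requirement is easy to observe (a sketch, not from the original message): when input arrives on a pipe, the shell's read consumes exactly one line and the command it invokes sees all of the rest.

```shell
# On non-seekable input, the shell must not read ahead past what it uses,
# so 'cat' sees everything after the line consumed by 'read'.
printf 'first\nsecond\nthird\n' | {
  read -r line
  echo "shell consumed: $line"
  cat
}
```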
Re: Built-in printf Sits Awkwardly with UDP.
[adding coreutils] On 07/20/2011 07:34 AM, Ralph Corderoy wrote: BTW, the code for the built-in printf has a bug. For negative field-widths it negates a negative integer without checking it will fit. E.g. on this 64-bit machine $ printf '%-9223372036854775808s.\n' foo foo. $ Coreutils' printf shares this misfortune. Sadly, it might even be a bug in the underlying glibc printf(), although I haven't tried to write a test program to check that, yet. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: How to do? Possible?
On 07/25/2011 03:45 PM, Linda Walsh wrote: I mistyped that but it brings me to an interesting conundrum: GLOBAL="hi there" { foo=GLOBAL echo ${!foo}; } This says: evaluate ${!foo}, and pass that expansion to 'echo', with foo=GLOBAL in the environment of echo. You are invoking behavior that POSIX leaves undefined (that is, bash is perfectly fine evaluating ${!foo} prior to assigning foo, but bash would also be okay if it assigned foo prior to evaluating ${!foo}). Hence, you got no output. But: { foo=GLOBAL;echo ${!foo}; } > hi there The extra ; forces the semantics. Here, the assignment to foo is a different statement than the expansion of ${!foo}. And while ${!foo} is a bash extension, it still proves that this is a case where foo was assigned prior to its use. Weird... Not if you think about it properly. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Fallback language for internationalized string
On 07/27/2011 09:05 AM, Andreas Schwab wrote: Anders Sundman writes: Is it possible to get bash to use a fallback language for showing localized strings if no translation exists for the current language? Instead of using the 'raw' msgid that is. The msgid is supposed to be the fallback. That's how gettext works. Not entirely. See the glibc documentation for the LANGUAGE environment variable. While for the `LC_xxx' variables the value should consist of exactly one specification of a locale the `LANGUAGE' variable's value can consist of a colon separated list of locale names. The attentive reader will realize that this is the way we manage to implement one of our additional demands above: we want to be able to specify an ordered list of languages. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Fallback language for internationalized string
On 07/27/2011 09:35 AM, Anders Sundman wrote: Andreas Schwab wrote: Anders Sundman writes: Is it possible to get bash to use a fallback language for showing localized strings if no translation exists for the current language? Instead of using the 'raw' msgid that is. The msgid is supposed to be the fallback. That's how gettext works. There are unfortunately two problems with this. I was hoping that a specific language fallback would fix them. 1. If you try to use the same msgid twice in a script you get an error when extracting it. Why? It should be possible to use the same msgid twice, if you are okay using the same translation twice. > Using the same human readable string twice is however a valid use case. So using 'weird' (e.g. numbered) msgids makes sense. But you don't ever want the user to see this. If you want weird msgids that are not usable directly, then your code must do a comparison after the translation. If the translation resulted in the msgid, then you use your sane fallback; if it resulted in a different string, then you got a translation. 2. If you use en strings as msgids and you later have to fix a simple spelling error for en, then all translation files have to be modified. Yes, but gettext provides tools to make that modification easy. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: The mailing list software interfered with my content
On 08/03/2011 04:45 PM, What, me urgent? wrote: The mailing list software interfered with my content = OUCH! In my most recent post, the mailing list software replaced the string "address@hidden" for a section of code snippet! Not the list software, but merely the web archiver that you are viewing the mail in. If you are actually subscribed to the list, rather than viewing a web archiver, your post came through just fine. Furthermore, not all web archivers use the same mangling; while http://lists.gnu.org/archive/html/bug-bash/2011-08/msg00012.html has your unfortunate "address@hidden", the same message is easier to understand here: http://thread.gmane.org/gmane.comp.shells.bash.bugs/16780/focus=16890 -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: bug: return doesn't accept negative numbers
On 08/05/2011 05:41 PM, Linda Walsh wrote: I guess I don't use negative return codes that often in shell, but I use them as exit codes reasonably often. 'return' barfs on "return -1"... Since return is defined to take no options, and ONLY an integer, as the return code, it shouldn't be hard to fix. According to POSIX, it's not broken in the first place. Portable shell is required to pass an unsigned decimal integer, no greater than 255, for defined behavior. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#return Seem to fail on any negative number, but 'exit status' is defined as a short int -- not an unsigned value (i.e. -1 would return 255). In bash, 'return -- -1' sets $? to 255 (note the --). But since that is already an extension (POSIX does not require 'return' to support -- any more than it is required to support an argument of -1), I agree with your argument that bash would be more useful if, as an extension to POSIX, it would handle 'return -1' - in fact, that would match ksh behavior. Conversely, since portable code already can't use it, it's no skin off my back if nothing changes here. $ bash -c 'f() { return -- -1; }; f; echo $?' 255 $ bash -c 'f() { return -1; }; f; echo $?' bash: line 0: return: -1: invalid option return: usage: return [n] 2 $ dash -c 'f() { return -- -1; }; f; echo $?' return: 1: Illegal number: -- $ dash -c 'f() { return -1; }; f; echo $?' return: 1: Illegal number: -1 $ ksh -c 'f() { return -- -1; }; f; echo $?' 255 $ ksh -c 'f() { return -1; }; f; echo $?' 255 $ -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: bug: return doesn't accept negative numbers
On 08/07/2011 02:35 PM, Linda Walsh wrote: Eric Blake wrote: On 08/05/2011 05:41 PM, Linda Walsh wrote: Seem to fail on any negative number, but 'exit status' is defined as a short int -- not an unsigned value (i.e. -1 would return 255). In bash, 'return -- -1' sets $? to 255 (note the --). But since that is already an extension (POSIX does not require 'return' to support -- any more than it is required to support an argument of -1), I agree with your argument that bash would be more useful if, as an extension to POSIX, it would handle 'return -1' - in fact, that would match ksh behavior. Conversely, since portable code already can't use it, it's no skin off my back if nothing changes here. --- How about portable code using: (exit -1); return That's not portable, either. exit is allowed to reject -1 as invalid. POSIX is clear that exit and return have the same constraints - if an argument is provided, it must be 0-255 to be portable. However, you are on to something - since bash allows 'exit -1' as an extension, it should similarly allow 'return -1' as the same sort of extension. The fact that bash accepts 'exit -1' and 'exit -- -1', but only 'return -- -1', is the real point that you are complaining about. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: bug: return doesn't accept negative numbers
On 08/08/2011 08:14 PM, Chet Ramey wrote: On 8/8/11 9:42 PM, Mike Frysinger wrote: On Monday, August 08, 2011 21:20:29 Chet Ramey wrote: On 8/8/11 8:53 AM, Eric Blake wrote: However, you are on to something - since bash allows 'exit -1' as an extension, it should similarly allow 'return -1' as the same sort of extension. The fact that bash accepts 'exit -1' and 'exit -- -1', but only 'return -- -1', is the real point that you are complaining about. That's a reasonable extension to consider for the next release of bash. i posted a patch for this quite a while ago. not that it's hard to code. Sure. It's just removing the three lines of code that were added between bash-3.2 and bash-4.0. The question was always whether that's the right thing to do, and whether the result will behave as Posix requires. Yes, the result will behave as POSIX requires. POSIX requires that 'return' and 'exit' need not support '--' (since they are special builtins that do not specifically require compliance with the generic rules on option parsing), that they need not support options, and that if their optional argument is present, it need not be supported if it is not a non-negative integer no greater than 255. But they are _not_ required to reject any input outside the above constraints - therefore, an extension that supports '--', an extension that parses '-- -1' as 255, and an extension that parses any option that looks like a negative number such as 'exit -1', are ALL valid extensions permitted by POSIX, and need not be disabled by --posix, but can be available always. ksh does just that: 'return -1' and 'return -- -1' are always accepted and both result in the same behavior as the POSIX-mandated 'return 255'; ksh also has an extension where 'return --help' prints help, although bash uses 'help return' for this purpose. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Is bash dying or dead (was Re: 4.1 is "$((( ))) an 'official operator, if $(( )) isn't?
ng was tightened enough to actually be implementable by bash in a manner that matched ksh. Bash is becoming very unstable -- programs that work in 3.1 won't necessarily work in 3.2, those in 3.2 aren't compat with 4.0, 4.0 is different than 4.1, and now 4.2 is different than 4.1. That's because older bash has had bugs where it doesn't comply with POSIX, and those bugs have been fixed, but sometimes the fixes have consequences on the bash extensions. But if you use the POSIX subset, rather than the bash extensions, you should notice that newer bash is better, not worse, than older bash when it comes to running portable scripts. How can people write stable scripts in an environment of constant change? By sticking to the common denominator that is known to work. Please people, am I being 'over-reactive'? Or are these valid concerns? At least in my view, you are coming across as over-reactive and ranting, even if that was not your intent. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: saving bash.....
On 08/10/2011 10:39 PM, Linda Walsh wrote: Chet Ramey wrote: > If not, then wouldn't > $((( ))) be turned into $( (( )) ), meaning the arith returns a > status, > and not the calculation. (I've tested this, and this is the case. Then I said: "It sounded to me like $(( )) would be translated into "$( () )", turning off arithmetic expansion. Did I read that ___incorrectly__? Yes, you read it incorrectly. POSIX is saying that _if_ you want to guarantee command substitution of a subshell (that is, $() with a leading ( as the first item of the command to run), then use spaces. But if you want to guarantee arithmetic expansion, then you _have_ to start with $((; and that if you start with $((, then arithmetic expansion takes precedence. If not, [i.e. if I read it correctly] then wouldn't $((( ))) be turned into $( (( )) ), meaning the arith returns a status, and not the calculation. (I've tested this, and this is the case)." Remember, $((body)) is treated as arithmetic expansion if body looks like arithmetic, and is only turned into $( (body)) otherwise (here, your body is '()'). POSIX' recommendation to add space is for the case of turning ambiguous $((body)) into definite command substitution $( (body)); you do not add space after the first '(' when you want arithmetic. ;-) ? (no insult intended! honest...just asking you (and others) to take a deep breath and not be reactive to what they think I am saying but try to focus on what I'm really "trying" (without great success, but I keep trying!) to say... Same here - I think we're all trying to be civil and helpful, and while emotions sometimes rise to the surface, hopefully everyone around is recognizing the underlying fact that no one here is intentionally trying to be malicious. I don't find it useful to have (()) cause exceptions. It's not a useful construct -- having it do so means it has to be programmed around. 
(()) is a useful construct, when used correctly; best designed for the condition of a loop expression where the loop will terminate when the count finally evaluates to 0. It does the same parsing and operation as $(()), except that instead of resulting in numeric output, it results in a change to $?. And once you realize that (()) is the shorthand counterpart to 'expr', and therefore has the same effect on $? (where 0 gives $?=1, non-zero gives $?=0, and error gives $?>1), then you can properly use this extension. Or, since it is an extension, you can just avoid it altogether, and write scripts more portably. It's not useful to have a function that is meant to return no value, -- if last calc'ed value was 0, to cause a script to fail. It may be POSIX, but I'm looking for bash that was useful to write script in and do some rudimentary programming with -- NOT a POSIX shell, I can get from Microsoft or any other vendor. If you want to use 'set -e', then you have to deal with arithmetic results of 0 setting $? to 1, whether the arithmetic results come from traditional 'expr' or extension '(())'. That's life with 'set -e'; my advice has always been to avoid 'set -e' as a crutch and instead do error checking yourself, since 'set -e' behavior is unintuitive by design. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
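The expr-like status behavior described above can be seen directly (a small sketch in bash, added here for illustration):

```shell
# (( )) sets $? the way expr does: non-zero result -> status 0, zero -> 1.
(( 5 > 3 )); echo "5 > 3 gives status $?"     # prints status 0
(( 0 ));     echo "a zero result gives $?"    # prints status 1
# The intended idiom: loop until the count evaluates to 0.
i=3
while (( i-- )); do
  echo "i is now $i"
done
```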
Re: set -e yet again (Re: saving bash....)
On 08/12/2011 06:51 AM, Greg Wooledge wrote: On Thu, Aug 11, 2011 at 11:56:10PM -0700, Linda Walsh wrote: **Exception** declare -i a a=0 -- As a is declared to be an integer, it has the results evaluated at assignment time. a=0 is an integer expression that doesn't set $?=1 Neither should: ((a=0)) a=0 is an assignment. Assignments always return 0. No they don't. readonly a; a=0 sets $? to non-zero. imadev:~$ ((a=0)); echo $? 1 And here, the same thing, but we return false, because the value was 0. This is the thing about which you are complaining. This is also one of the things I mention on http://mywiki.wooledge.org/BashFAQ/105 in which I describe how horrible and useless set -e is. And that point has been made several times in this thread. 'set -e' is a historical wart - bash has it because POSIX requires it. If you want to use bash extensions, then _don't_ use 'set -e', and you don't have to worry about how the unintuitive behavior interacts with extensions. Greg, you missed one other useful form: a=$((0)) This is an assignment (sets $? to 0 for all but errors like assigning to a readonly variable) of arithmetic expansion. Also POSIX, and slightly shorter than : $((a=0)) -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: conditional aliases are broken
On 08/15/2011 01:10 PM, Sam Steingold wrote: * Andreas Schwab [2011-08-15 18:42:30 +0200]: Sam Steingold writes: this works: $ alias z='echo a' $ zz(){ z b; } $ zz a b however, after sourcing this file: if true; then alias z='echo a' zz(){ z b; } fi Aliases are expanded during reading, but the alias command isn't executed until after the complete compound command was read. Cool. Now, what does this imply? Is this the expected behavior aka "feature"? Yep - feature. All shells behave that way. They parse to an end of a command (in your case, the end of the compound 'if-fi' command), then process statements within the command. Alias expansion affects parsing, so your alias cannot take effect until after the compound command has been parsed, and all attempts to use the alias from within the compound command were parsed with the pre-command expansion (i.e. no alias). Yet another reason why aliases are mostly replaced by functions. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: conditional aliases are broken
On 08/15/2011 04:40 PM, Sam Steingold wrote: * Andreas Schwab [2011-08-15 22:04:04 +0200]: Sam Steingold writes: Cool. Now, what does this imply? "For almost every purpose, shell functions are preferred over aliases." so, how do I write alias a=b as a function? (remember that arguments may contain spaces&c) a() { b "$@"; } -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: conditional aliases are broken
On 08/18/2011 08:38 AM, Sam Steingold wrote: mkdir z cd z touch a b 'c d' When doing exercises like this, I like to: touch a b 'c  d' Notice the double spacing - it proves whether I used enough quoting throughout the exercise - if 'c d' with one space shows up anywhere, then I missed quoting, because word splitting followed by argument concatenation with only one space must have happened. how do I write a function that would print the same as $ \ls | cat a b c d $ f1(){ for a in "$*"; do echo $a; done; } Incorrect quoting on $a. Also, remember the difference between $* and $@ inside "" - the former creates only one word, and only the latter splits the result into the same number of words as were originally arguments to the function. You meant: f(){ for a; do echo "$a"; done; } or f(){ for a in "$@"; do echo "$a"; done; } (both are identical). -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
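The "$*" vs "$@" distinction in the answer can be demonstrated side by side (function names invented for this sketch):

```shell
# "$*" joins all arguments into a single word (using the first IFS char);
# "$@" expands to one word per original argument.
f_star() { for a in "$*"; do echo "[$a]"; done; }
f_at()   { for a in "$@"; do echo "[$a]"; done; }
f_star a 'c  d'    # one line:  [a c  d]
f_at   a 'c  d'    # two lines: [a] then [c  d]
```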
Re: conditional aliases are broken
On 08/18/2011 08:44 AM, Eric Blake wrote: how do I write a function that would print the same as $ \ls | cat Useless use of cat. This can be done with \ls -1. f(){ for a in "$@"; do echo "$a"; done; } Or skip the loop altogether: f(){ printf %s\\n "%@"; } -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Is this a bug in [[ -f ]]?
On 08/18/2011 10:35 AM, Steven W. Orr wrote: I have a symlink file and if I test it I get success status with -f. 831 > ls -l errio err lrwxrwxrwx. 1 sorr fc 5 Aug 18 08:48 err -> errio -rw-rw-r--. 1 sorr fc 3816 Aug 18 08:48 errio 832 > [[ -f errio ]] 833 > echo $? # Good answer 0 *834 > [[ -h errio ]] 835 > echo $? # Good answer 1 *836 > [[ -f err ]] 837 > echo $? # BAD answer 0 Good answer. The man page says that symlinks are dereferenced for most tests (-h and -L being the exceptions). Since err is a symlink that resolves, -f err is true. Is this a bug? Is there a workaround? Am I doing something wrong? Merely that you were expecting -f to mean 'file and not symlink' rather than the documented 'file or symlink to file'. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
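If the intent really is "regular file and not a symlink", the two tests can be combined. A sketch using throwaway files named after the ones in the report:

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
echo data > errio
ln -s errio err

[[ -f err ]] && echo 'err passes -f (the symlink is dereferenced)'
[[ -f err && ! -h err ]] || echo 'err fails the combined test'
[[ -f errio && ! -h errio ]] && echo 'errio is a plain regular file'
```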
Re: conditional aliases are broken
On 08/18/2011 11:38 AM, Stefano Lattarini wrote: Hi Eric. On Thursday 18 August 2011, Eric Blake wrote: On 08/18/2011 08:44 AM, Eric Blake wrote: how do I write a function that would print the same as $ \ls | cat Useless use of cat. This can be done with \ls -1. f(){ for a in "$@"; do echo "$a"; done; } Actually, echo "$a" is not portable - if you have any file names beginning with - or containing \, then the results can be corrupted. Or skip the loop altogether: f(){ printf %s\\n "%@"; } I think you've made a typo here; it should have been: f () { printf %s\\n "$@"; } Yep, slip of one key when I typed (at least on my keyboard, % and $ are neighbors). I guess that's what you meant, right? BTW, is this behaviour truly portable to other shells and/or printf utilities? POSIX seems to require it to be portable, but you never know ... It's portable, but not always fast (some shells lack printf(1) as a builtin, and end up spawning a process). And in the case of arbitrary file names, printf is always better than echo, since it handles \ and leading - correctly. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
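The corruption cases mentioned above are easy to trigger; printf sidesteps both. (The output of the echo lines varies by shell, which is exactly the problem.)

```shell
echo -n            # many echo implementations print nothing here
echo 'a\nb'        # some implementations expand the \n

# printf %s never interprets its arguments, so both survive intact:
printf '%s\n' -n 'a\nb'
# -n
# a\nb
```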
Re: conditional aliases are broken
On 08/18/2011 08:53 AM, Roman Rakus wrote: On 08/18/2011 04:38 PM, Sam Steingold wrote: how do I write a function that would print the same as $ \ls | cat f3(){ printf "%s\n" "$@"; } "\n" looks funny in shell; even though POSIX requires that "\n" does not treat the \ as an escape but as a literal character, stylistically, I prefer writing "\\n" or '\n' to make it clear that I intended a literal backslash. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Is this a bug in [[ -f ]]?
On 08/19/2011 08:45 AM, Suvayu Ali wrote: I am trying to test if a file exists and then source it. My problem is the test succeeds even if the variable is empty! If I pass no argument at all, it still succeeds. To give you an example: $ unset bla $ [ -f $bla ]&& echo yes yes $ [ -f ]&& echo yes yes Both expected behaviors, and evidence of your lack of quoting. Remember, the behavior of [] depends on how many arguments are present. [ -f "$bla" ] (note the "") - guarantees that there are exactly two arguments, so it proceeds with the two-argument test where -f is the operator and "$bla" is the file name. [ -f ] (which is the same as [ -f $bla ] if $bla is empty, note the lack of "") - there is exactly one argument, so it proceeds with the one-argument test of whether the argument (the literal string -f) is empty (it is not). Furthermore, [ -f $bla ] is different from [[ -f $bla ]]. [ is a POSIX utility, mandated to do all argument word expansion before [ ever gets a chance to see what arguments it was given - if $bla is empty or has spaces, you changed the number of arguments that are given to [. [[ is a bash (and ksh) extension that is part of the shell syntax (similar to how () for subshells is part of the syntax), thus it knows how many words, _pre-expansion_, were present, and the fact that $bla was unquoted is not a problem: [[ -f $bla ]] is a safe way to check if $bla is a file even if $bla is empty or contains spaces. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
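The argument-count difference described above, condensed into a runnable sketch:

```shell
unset bla

# Unquoted, the empty $bla disappears: [ sees one argument, the string -f,
# and the one-argument test asks only whether that string is non-empty.
[ -f $bla ]   && echo 'one-argument test: true'

# Quoted, [ sees exactly two arguments, and -f is applied to an empty name.
[ -f "$bla" ] || echo 'two-argument test: false'

# [[ ]] is parsed before expansion, so the unquoted form is still safe.
[[ -f $bla ]] || echo '[[ ]] test: false'
```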
Re: variables set on command line
On 08/24/2011 09:24 AM, Sam Steingold wrote: CYGWIN_NT-5.2-WOW64 sds 1.7.9(0.237/5/3) 2011-03-29 10:10 i686 Cygwin BASH_VERSION='4.1.10(4)-release' at the bash prompt I observe this: $ f(){ echo a=$a b=$b c=$c ; } $ unset a b c $ a=a b=b f a=a b=b c= $ f a= b= c= which I believe is correct (i.e., variables set in "a=a b=b f" are unset after f terminates). This is bash's default behavior, but it violates POSIX. alas, when I call /bin/sh on the same machine, I see this: That tells bash to strictly obey POSIX, so you get the POSIX behavior. f(){ echo a=$a b=$b c=$c ; } f a= b= c= a=a b=b f a=a b=b c= f a=a b=b c= Which is indeed correct under the rules for POSIX (basically, POSIX requires function calls to behave like special built-ins, such that changes to the environment persist after the function call - the bash developer thinks the posix rule is counterintuitive, which is why the default bash behavior is opposite the posix behavior). Your question is not cygwin-specific. is this the expected behavior? Yes. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
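The default (non-posix) bash behavior condensed to a few lines (variable and function names are illustrative):

```shell
f() { echo "inside: v=$v"; }

unset v
v=1 f                        # the assignment is visible inside the call...
echo "after: v=${v-unset}"   # ...but not afterwards, in default bash mode
```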
Re: variables set on command line
On 08/24/2011 10:07 AM, Sam Steingold wrote: * Eric Blake [2011-08-24 09:31:45 -0600]: f(){ echo a=$a b=$b c=$c ; } f a= b= c= a=a b=b f a=a b=b c= f a=a b=b c= Which is indeed correct under the rules for POSIX This sucks big time. Such is life when dealing with shell portability. So if I want to bind a variable for an eval invocation and do this: eval "`./libtool --tag=CC --config | grep '^archive_cmds='`" CC='${CC}' libobjs='$libs' deplibs='${CLFLAGS}' compiler_flags='${CFLAGS}' \ soname='$dll' lib='$lib' output_objdir='$dyndir' \ eval XCC_CREATESHARED=\"${archive_cmds}\" and I want CC to have an old value after the second eval, I need to save it and restore it by hand, like this: CC_save=$CC CC='${CC}' libobjs='$libs' deplibs='${CLFLAGS}' compiler_flags='${CFLAGS}' \ soname='$dll' lib='$lib' output_objdir='$dyndir' \ eval XCC_CREATESHARED=\"${archive_cmds}\" CC=$CC_save however, this does not distinguish between unset CC and CC=''. (is there a way to distinguish these two situations?) Yes - autoconf does this all the time, using an idiom roughly like this: CC_set=${CC+set} CC_save=$CC do stuff that modifies $CC if test "$CC_set" = set; then CC=$CC_save else unset CC fi Also, you can use command to suppress the ability of built-ins like eval (but not function calls) to affect the current environment: $ unset foo $ foo=bar eval : $ echo $foo bar $ unset foo $ foo=bar command eval : $ echo $foo $ -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Why bash command "cd //" set path "//" ?
On 08/26/2011 04:38 AM, Andrey Demykin wrote: Why bash command "cd //" set path "//" ? Because POSIX says that implementations may (but need not) treat // specially. And rather than special-case just the implementations that do treat it specially (such as cygwin), bash globally respects // on all platforms even where it is not special. I found this in all versions of bash. Excuse me, if it is not a bug. Not a bug. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Bash does not like DOS file format
On 10/05/2011 02:37 PM, Russ Browne wrote: > Configuration Information [Automatically generated, do not change]: > > Machine: i686 > > OS: cygwin > > Compiler: gcc-4 > > Compilation CFLAGS: -DPROGRAM='bash.exe' -DCONF_HOSTTYPE='i686' > -DCONF_OSTYPE='cygwin' Given that you are trying to use cygwin, this may be a question better asked on the cygwin lists. > I thought this must surely be a simple goof that would be fixed > in the > > next release, but when I reloaded cygwin for my new PC late last > year > > the bug was still there. This is not a bug, but a conscious design decision. In particular, the cygwin port of bash intentionally behaves like Unix in treating CR as literal characters (for speed and POSIX compliance reasons) unless you take explicit measures to request that it ignore CR (your explicit request is what allows bash to ignore POSIX). Read the cygwin bash release notes for more details on the measures you can take: http://sourceware.org/ml/cygwin-announce/2011-02/msg00027.html -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: Error in manual for >&word redirection
On 10/12/2011 02:07 PM, Greg Wooledge wrote: Even using a space is not sufficient to force a valid file descriptor number to be treated as a filename: imadev:~$ foo>& 1 stdout stderr imadev:~$ ls -l 1 1 not found If you want 'word' treated as a filename, then express it as a filename. It's still possible to write to a file named '1': foo >&./1 -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
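A minimal demonstration of the two interpretations of >&word (run in a scratch directory; the file named 1 is literal):

```shell
cd "$(mktemp -d)" || exit 1

echo dup >&1      # word is a digit: duplicates file descriptor 1 (stdout)
echo lit >&./1    # word is a path: sends stdout AND stderr to a file named 1

cat ./1           # prints: lit
```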
Re: New flag option request
On 10/20/2011 08:48 AM, Bruce Korb wrote: You may have this in the queue already, but just in case: POSIX now specifies that if a standard utility has extended options, then you accomplish it with ``-W option-name[=opt-arg]''. Not quite. POSIX specifies only that -W is reserved for implementation-defined extensions. glibc's getopt_long _happens_ to have the implementation-defined extension that '-W foo' is equivalent to '--foo', so it would make sense that bash support the same extension as glibc for consistency among GNU programs, but that is _not_ a POSIX requirement. I wouldn't care, but I wanted to add ``--noprofile --norc'' to the command line and, for debugging purposes, I aliased "bash" to "bash -x". Oops. Two issues: 1. It'd be nice to be able to interleave short and long options, and 2. "bash -x -W noprofile -W norc" should also work. Bash currently doesn't use getopt_long for option parsing, but rolls its own parser. A patch to make the bash parser support mixed long and short options would also be welcome in my mind. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Re: What is the correct way to set up login environment in crontab?
On 11/09/2011 10:14 AM, Peng Yu wrote: variable assignment VAR=blah. That sets up a bash-local variable. If you want it to be exported to the environment visible to child processes, then you _also_ need to use export, as in either: VAR=blah export VAR or export VAR=blah However, VAR is not seen in env in /path/programtorun.sh (called from cron). Right - bash maintains two sets of variables, and only the exported ones are visible to children. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org
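The two-namespace behavior in a nutshell (variable names are illustrative; the child process here is just another bash):

```shell
MYLOCAL=shell-only
export MYEXPORTED=visible

# A child process sees only the exported variable:
bash -c 'echo "MYLOCAL=${MYLOCAL-unset} MYEXPORTED=${MYEXPORTED-unset}"'
# MYLOCAL=unset MYEXPORTED=visible
```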
Re: Customize the command resolution in bash?
On 11/11/2011 03:23 PM, Peng Yu wrote: > Hi, > > bash by default searches in paths specified in the environment variable > PATH (separated by ":"). I'm not aware if there is any cache mechanism > to save the run time (but even so, different terminals still can not > see the same cache, hence each terminal has the overhead to create the > cache). When there are many files in PATH, it is going to slow down > the performance. > > One simple remedy is to instead search in a file where the abspaths of > all the commands are saved (of course, this database file can be > generated by using the command 'find' to search for all the > directories in $PATH, which process can be scheduled to run > periodically using cron). To make this work, I'm wondering if there is > an easy way to customize the way that bash resolves a command. Bash isn't doing the resolution so much as libc (read 'man execvp') (well, technically, bash may be manually repeating some of the same resolution code as in execvp for other reasons, but the same principles apply). If you want to search fewer directories, then nothing is stopping you from: mkdir ~/mycache for every executable you want cached: ln -s executable ~/mycache/ PATH=$HOME/mycache:$PATH so that you now have a single directory with symlinks to all the executables you want to run; thus, the attempt to stat()/execvp() each executable will hit in the first directory in $PATH rather than getting lots of misses as you progress through each directory in $PATH. But you'll still have to crawl through every directory (including the new one you just added) for resolving things such as 'nosuch' not existing anywhere on $PATH. But whether this provides a measurable speedup, I don't know. Benchmark it yourself if you are interested in trying it. Meanwhile, per POSIX, bash DOES provide hashing once it learns where an executable lives, so that future invocations can rely on the hash (the hash is invalidated when you assign to $PATH). Read up on 'help hash'.
-- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
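The hash cache mentioned above can be inspected directly (the path printed is whatever your system uses):

```shell
hash -r           # forget all remembered locations (also happens on PATH=...)
ls >/dev/null     # first use: bash walks $PATH and remembers the result
hash -t ls        # prints the remembered full path, e.g. /bin/ls
```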
Re: invoke tilde expansion on quoted string
On 11/12/2011 07:53 AM, Geir Hauge wrote: > 2011/11/12 Chris F.A. Johnson > >> On Fri, 11 Nov 2011, Peng Yu wrote: >> >>> I'm wondering if I already have a string variable, is there a bash >>> native way to do tilde expansion on it. >>> >>> var='~/..' >>> cd $var # how to change this line? >>> >> >> eval "cd $var" >> > > I'd avoid eval as that could potentially do more than just expand the > tilde, depending on what other characters the var contains. I'd just > replace the ~ with $HOME using parameter expansion. > > cd "${var/#~\//$HOME/}" Except that your proposed parameter expansion only works for plain ~. It doesn't cover more useful tilde expansions, such as ~user/, which does NOT expand to $HOME, but to "user"'s home directory. -- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
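A middle ground is to eval only the tilde-prefix itself, after checking that it looks like a plain user name, so the rest of the string is never re-parsed. This is a sketch, not a complete replacement for shell tilde expansion (the function name and the accepted character set are my own choices):

```shell
# Expand a leading ~ or ~user in $1; everything after the first / is passed
# through untouched. Prefixes with unexpected characters are left alone
# rather than eval'd.
expand_tilde() {
    local prefix=${1%%/*} rest
    rest=${1#"$prefix"}
    case $prefix in
        '~'|'~'[A-Za-z_]*)
            case $prefix in
                *[!A-Za-z0-9._~-]*) printf '%s\n' "$1"; return ;;
            esac
            eval "prefix=$prefix" ;;   # only the vetted tilde-prefix is eval'd
    esac
    printf '%s\n' "$prefix$rest"
}

expand_tilde '~/..'      # prints "$HOME/.."
expand_tilde plain/x     # prints plain/x, unchanged
```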
Re: set -e works incorrectly in subshells
On 11/23/2011 03:26 AM, Марк Коренберг wrote: > Repeat-By: > mmarkk@mmarkk-work:~$ ( set -e; echo aaa; false; echo bbb ) > aaa > mmarkk@mmarkk-work:~$ ( set -e; echo aaa; false; echo bbb ) || true > aaa > bbb > mmarkk@mmarkk-work:~$ ksh has the same behavior, and POSIX requires it (basically, running the subshell on the left of || has a higher precedence than the explicit 'set -e' within the subshell). http://austingroupbugs.net/view.php?id=52 Expected behavior. And one of the arguments I give why using the crutch of 'set -e' is almost always the wrong thing in a complex script. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature