\W and \w don't substitute ~ for $HOME properly
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: x86_64-pc-linux-gnu-gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -march=k8 -O2 -pipe
uname output: Linux chicago 2.6.17-gentoo-r8 #1 PREEMPT Fri Sep 15 12:15:25 CEST 2006 x86_64 AMD Turion(tm) 64 Mobile Technology ML-30 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 3.1
Patch Level: 17
Release Status: release

Description:
    If a user's home directory is specified in /etc/passwd with a trailing slash, such as /home/johndoe/ instead of /home/johndoe, prompting doesn't work properly.  If PS1 (and perhaps other prompt strings) contains \w or \W, which should be substituted with a tilde while the user is in his/her home directory, it isn't.  Instead it shows "johndoe" or equivalent.

Repeat-By:
    export PS1='\[\033[01;35m\] \W \[\033[0;35m\]\$ \[\033[00m\]'
    and set your home directory in /etc/passwd to whatever it already is, but with a trailing slash.

Fix:
    Since there is no definite convention as to whether a directory should be specified with or without a trailing slash, the best practice would be to interpret both forms as the same directory.
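(A hedged workaround for affected users, not part of the original report: bash abbreviates \w and \W by comparing $PWD against the literal value of $HOME, so normalizing HOME in a startup file sidesteps the trailing slash. For example:)

    # ~/.bashrc sketch: strip a trailing slash from HOME so that \w and \W
    # can recognize the home directory and abbreviate it to ~
    if [[ $HOME == */ && $HOME != / ]]; then
        HOME=${HOME%/}
        export HOME
    fi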
'command' builtin doesn't seem to work for external 'printf'
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin' -DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc' -DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS
uname output: Linux clip 5.10.26-brunch-sebanc #1 SMP PREEMPT Wed Apr 7 21:32:32 CEST 2021 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 4
Release Status: release

Description:
    'command' builtin doesn't seem to work for external 'printf'.  It appears to use the builtin.

    ---
    [cet@clip ~]$ printf
    printf: usage: printf [-v var] format [arguments]

    [cet@clip ~]$ type -P printf
    /usr/bin/printf

    [cet@clip ~]$ /usr/bin/printf
    /usr/bin/printf: missing operand
    Try '/usr/bin/printf --help' for more information.

    [cet@clip ~]$ command printf
    printf: usage: printf [-v var] format [arguments]
    ---

    I believe that running 'command printf' should yield the same result as running '/usr/bin/printf'.

Repeat-By:
    (See above)
Re: 'command' builtin doesn't seem to work for external 'printf'
Nevermind! I was misunderstanding what the 'command' builtin does! Thank you, Carl > On 5/17/2021 3:40 PM Carl Thompson wrote: > > > > Configuration Information [Automatically generated, do not change]: > Machine: x86_64 > OS: linux-gnu > Compiler: gcc > Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt > -DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin' > -DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.b > ashrc' -DSYS_BASH_LOGOUT='/etc/bash.bash_logout' > -DNON_INTERACTIVE_LOGIN_SHELLS > uname output: Linux clip 5.10.26-brunch-sebanc #1 SMP PREEMPT Wed Apr 7 > 21:32:32 CEST 2021 x86_64 GNU/Linux > Machine Type: x86_64-pc-linux-gnu > > Bash Version: 5.1 > Patch Level: 4 > Release Status: release > > Description: > 'command' builtin doesn't seem to work for external 'printf'. It > appears to use the builtin. > > --- > [cet@clip ~]$ printf > printf: usage: printf [-v var] format [arguments] > > [cet@clip ~]$ type -P printf > /usr/bin/printf > > [cet@clip ~]$ /usr/bin/printf > /usr/bin/printf: missing operand > Try '/usr/bin/printf --help' for more information. > > [cet@clip ~]$ command printf > printf: usage: printf [-v var] format [arguments] > --- > > I believe that running 'command printf' should yield the same result > as running '/usr/bin/printf' > > Repeat-By: > (See above) >
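(For the archives: 'command' only bypasses shell functions and aliases, not builtins, so 'command printf' still runs the builtin. A couple of hedged ways to actually get the external utility:)

    # run the external printf instead of the builtin (sketch)
    env printf 'hi %s\n' there            # env does a PATH lookup and execs the binary
    "$(type -P printf)" 'hi %s\n' there   # or call it by its full path
    # or disable the builtin for this shell session: enable -n printf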
Re: Examples of concurrent coproc usage?
[My apologies up front for the length of this email. The short story is I played around with the multi-coproc support: the fd closing seems to work fine to prevent deadlock, but I found one bug apparently introduced with multi-coproc support, and one other coproc bug that is not new.] On Mon, 11 Mar 2024, Zachary Santer wrote: Was "RFE: enable buffering on null-terminated data" On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist wrote: (Kind of a side-note ... bash's limited coprocess handling was a long standing annoyance for me in the past, to the point that I wrote a bash coprocess management library to handle multiple active coprocess and give convenient methods for interaction. Perhaps the trickiest bit about multiple coprocesses open at once (which I suspect is the reason support was never added to bash) is that you don't want the second and subsequent coprocesses to inherit the pipe fds of prior open coprocesses. This can result in deadlock if, for instance, you close your write end to coproc1, but coproc1 continues to wait for input because coproc2 also has a copy of a write end of the pipe to coproc1's input. So you need to be smart about subsequent coprocesses first closing all fds associated with other coprocesses. https://lists.gnu.org/archive/html/help-bash/2021-03/msg00296.html https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html Oh hey! Look at that. Thanks for the links to this thread - I gave them a read (along with the old thread from 2011-04). I feel a little bad I missed the 2021 discussion. You're on the money, though there is a preprocessor directive you can build bash with that will allow it to handle multiple concurrent coprocesses without complaining: MULTIPLE_COPROCS=1. Who knew! Thanks for mentioning it. When I saw that "only one active coprocess at a time" was _still_ listed in the bugs section in bash 5, I figured multiple coprocess support had just been abandoned. Chet, that's cool that you implemented it. I kind of went all-out on my bash coprocess management library though (mostly back in 2014-2016) ... It's pretty feature-rich and pleasant to use -- to the point that I don't think there is any going-back to bash's internal coproc for me, even with multiple coprocess are support. I implemented it with shell functions, so it doesn't rely on compiling anything or the latest version of bash being present. (I even added bash3 support for older systems.) Chet Ramey's sticking point was that he hadn't seen coprocesses used enough in the wild to satisfactorily test that his implementation did in fact keep the coproc file descriptors out of subshells. To be fair coproc is kind of a niche feature. But I think more people would play with it if it were less awkward to use and if they felt free to experiment with multiple coprocs. By the way, I agree with the Chet's exact description of the problems here: https://lists.gnu.org/archive/html/help-bash/2021-03/msg00282.html The issue is separate from the stdio buffering discussion; the issue here is with child processes (and I think not foreground subshells, but specifically background processes, including coprocesses) inheriting the shell's fds that are open to pipes connected to an active coprocess. Not getting a sigpipe/write failure results in a coprocess sitting around longer than it ought to, but it's not obvious (to me) how this leads to deadlock, since the shell at least has closed its read end of the pipe to that coprocess, so at least you aren't going to hang trying to read from it. 
On the other hand, a coprocess not seeing EOF will cause deadlock pretty readily, especially if it processes all its input before producing output (as with wc, sort, sha1sum).  Trying to read from the coprocess will hang indefinitely if the coprocess is still waiting for input, which is the case if there is another copy of the write end of its read pipe open somewhere.

> If you've got examples you can direct him to, I'd really appreciate it.

[My original use cases for multiple coprocesses were (1) for programmatically interacting with multiple command-line database clients together, and (2) for talking to multiple interactive command-line game engines (othello) to play each other.  Perl's IPC::Open2 works, too, but it's easier to experiment on the fly in bash.  And in general, having the freedom to play with multiple coprocesses helps mock up more complicated pipelines, or even webs of interconnected processes.]

> But you can create a deadlock without doing anything fancy.

Well, *without multi-coproc support*, here's a simple wc example; first with a single coproc:

    $ coproc WC { wc; }
    $ exec {WC[1]}>&-
    $ read -u ${WC[0]} X
    $ echo $X
    0 0 0

This works as expected.  But if you try it with a second coproc (again, without multi-coproc support), the second coproc will inherit copies of the shell's read and write pipe fds to the first coproc, and the read will hang (as described above), as the first coproc doesn't see EOF:

    $ coproc WC { wc; }
    $ coproc CAT { cat; }
    $ exec {WC[1]}>&-
    $ read -u ${WC[0]} X   # HANGS
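[A hedged aside for anyone reading along: on a default (single-coproc) build, the manual workaround is for the second coproc to close the first one's fds itself, so wc can still see EOF -- essentially what Chet suggests later in this thread:

    $ coproc WC { wc; }
    $ coproc CAT { exec {WC[0]}<&- {WC[1]}>&- ; cat; }
    $ exec {WC[1]}>&-
    $ read -u ${WC[0]} X     # no longer hangs
    $ echo $X
    0 0 0
]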
Re: Examples of concurrent coproc usage?
Hi Chet, thanks for taking the time to review this :D

[My apologies again upfront for another lengthy (comprehensive?) email.]

On Wed, 3 Apr 2024, Chet Ramey wrote:

> On 4/2/24 12:22 PM, Carl Edquist wrote:
>
>> the forked coproc has to close its fds to/from _all_ other existing
>> coprocs (as there can be several).
>
> And there is the issue. Without multi-coproc support, the shell only
> keeps track of one coproc at a time, so there's only one set of pipe
> file descriptors to close.

Right, exactly.  The example with the default build (showing the essential case that causes deadlock) was to highlight that your multi-coproc support code apparently does indeed correctly track and close all these fds, and thus prevents the deadlock issue.

On Wed, 3 Apr 2024, Chet Ramey wrote:

> It's straightforward: the coproc process terminates, the shell reaps it,
> marks it as dead, notifies the user that the process exited, and reaps
> it before printing the next prompt.  I don't observe any different
> behavior between the default and when compiled for multiple coprocs.
>
> It depends on when the process terminates as to whether you get a prompt
> back and need to run an additional command before reaping the coproc
> (macOS, RHEL), which gives you the opportunity to run the `read' command:

Ah, my mistake then - thanks for explaining.  I must have been thrown off by the timing, running it with and without an intervening interactive prompt before the read command.  When run interactively, an extra 'Enter' (or not) before the read command changes the behavior.

So in that case, this issue (that the shell closes its read-end of the pipe from a reaped coproc, potentially before being able to read the final output) was already there and is not specific to the multi-coproc code.

But in any case, it seems like this is a race then?  That is, whether the child process terminates before or after the prompt in question.

> $ coproc WC { wc; }
> [1] 48057
> $ exec {WC[1]}>&-
> $ read -u ${WC[0]} X
> [1]+  Done                    coproc WC { wc; }
> bash: DEBUG warning: cpl_reap: deleting 48057
> $ echo $X
> 0 0 0
>
> (I put in a trace statement to show exactly when the coproc gets reaped
> and deallocated.)

Thanks! (for taking the time to play with this)

Though apparently it's still a race here.  If you diagram the shell and coproc (child) processes, I think you'll see that your DEBUG statement can also happen _before_ the read command, which would then fail.

You can contrive this by adding a small sleep (eg, 0.1s) at the end of execute_builtin_or_function (in execute_cmd.c), just before it returns.  Eg:

    diff --git a/execute_cmd.c b/execute_cmd.c
    index ed1063e..c72f322 100644
    --- a/execute_cmd.c
    +++ b/execute_cmd.c
    @@ -5535,6 +5535,7 @@ execute_builtin_or_function (words, builtin, var, redirects,
       discard_unwind_frame ("saved_fifos");
     #endif

    +  usleep(10);
       return (result);
     }

If I do this, I consistently see "read: X: invalid file descriptor specification" running the above 4-line "coproc WC" example in a script, demonstrating that there is no guarantee that the read command will start before the WC coproc is reaped and {WC[0]} is closed, even though it's the next statement after 'exec {WC[1]}>&-'.

But (as I'll try to show) you can trip up on this race even without slowing down bash itself artificially.

> I can't reproduce your results with non-interactive shells, either, with
> job control enabled or disabled.
That's fair; let's try it with a script:

    $ cat cope.sh
    #!/bin/bash
    coproc WC { wc; }
    jobs
    exec {WC[1]}>&-
    [[ $1 ]] && sleep "$1"
    jobs
    read -u ${WC[0]} X
    echo $X

Run without sleep, the wc output is seen:

    $ ./cope.sh
    [1]+  Running                 coproc WC { wc; } &
    [1]+  Running                 coproc WC { wc; } &
    0 0 0

Run with a brief sleep after closing the write end, and it breaks:

    $ ./cope.sh .1
    [1]+  Running                 coproc WC { wc; } &
    [1]+  Done                    coproc WC { wc; }
    ./cope.sh: line 8: read: X: invalid file descriptor specification

And, if I run with "0" for a sleep time, it intermittently behaves like either of the above.  Racy!

This is a bug.  The shell should not automatically close its read pipe to a coprocess that has terminated -- it should stay open to read the final output, and the user should be responsible for closing the read end explicitly.

> How long should the shell defer deallocating the coproc after the
> process terminates?

I only offer my opinion here, but it strikes me that it definitely should _not_ be based on an amount of _time_.  That's inhere
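(A hedged defensive pattern for scripts on current releases, while this behavior stands: duplicate the coproc fds into your own variables right after starting the coproc, so the later read doesn't depend on ${WC[0]} still existing when WC happens to be reaped:)

    #!/bin/bash
    # sketch: user-managed copies of the coproc fds survive the shell's
    # automatic deallocation of WC when wc exits
    coproc WC { wc; }
    exec {wc_r}<&"${WC[0]}" {wc_w}>&"${WC[1]}"

    exec {wc_w}>&-       # close our write end; wc sees EOF and exits
    read -u "$wc_r" X    # our copy of the read end is still open
    echo "$X"            # 0 0 0
    exec {wc_r}<&-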
Re: PATH value doesn't get updated
Yes. I (1) made the change to .bash_profile, then (2) restarted the machine, then (3) logged in again. When I echo $PATH the /opt/lampp/bin is not included. Bob Proulx <[EMAIL PROTECTED]> wrote: carlwenrich wrote: > I put this in my .bash_profile: > > PATH=$PATH:/opt/lampp/bin > export PATH > > but when I "echo $PATH" it doesn't include the /opt/lampp/bin. The .bash_profile is sourced by bash when it is invoked as an interactive login shell. Therefore you would need to log in after having made that change in order for bash to read the file and the effect to be seen. Did you log in after having made that change? Was bash invoked as an interactive login shell so that it would read that file? echo $0 echo $- Bob
Re: PATH value doesn't get updated
echo $0 gives me "bash" echo $- gives me "himBH" Bob Proulx <[EMAIL PROTECTED]> wrote: Carl Wenrich wrote: > Bob Proulx wrote: > > Did you log in after having made that change? Was bash invoked as > > an interactive login shell so that it would read that file? > > > > echo $0 > > echo $- > > Yes. I (1) made the change to .bash_profile, then (2) restarted the > machine, then (3) logged in again. When I echo $PATH the > /opt/lampp/bin is not included. That second question was also very important. Without that information I don't know if I should suggest one action or a different action. Was bash invoked as an interactive login shell so that it would read that file? The way I tell is by running those two commands that I suggested running. What do you get for the following two commands? echo $0 echo $- The $0 is the name used to invoke the shell. If it starts with a '-' then this is used to instruct the shell that it is a login shell. The second variable $- is the flags set to the shell. The 'i' for interactive should be in there. Example: NOT a login shell, will NOT source .bash_profile: $ echo $0 bash $ echo $- himBHP $ Example: Yes, a login shell: $ echo $0 -bash $ echo $- himBHP $ And of course the bash "-l" and "--login" options will override this default behavior. If it is not a login shell then to suggest improvements it would be necessary to know the type of system you are using and how you are logging into it. There are many possibilities and I can't guess which one you might be using. You didn't say in your messages. By the way... It is not necessary to restart your system. That is way too much. Simply log in again to have profile file changes take effect. Bob
Re: PATH value doesn't get updated
I just log into the box that appears on the standard ubuntu startup.  I enter my username and password, then the desktop comes up.

I see now that the .bash_profile isn't being sourced (I thought it was, according to what I've been able to pick up on the web).  If I source it manually, the $PATH gets updated.

Why does ubuntu provide the .bash_profile when a user is created, and then not source it when the system starts up?  And since it doesn't, what do I change to make it happen?

Bob Proulx <[EMAIL PROTECTED]> wrote:

Carl Wenrich wrote:
> echo $0 gives me "bash"
> echo $- gives me "himBH"

Then bash hasn't been invoked as a login shell and therefore isn't instructed to source the .bash_profile.

> If it is not a login shell then to suggest improvements it would be
> necessary to know the type of system you are using and how you are
> logging into it. There are many possibilities and I can't guess which
> one you might be using. You didn't say in your messages.

You have yet to say how you are logging into your machine.  There are many possibilities and without that information it is impossible to guess.  In the hope that it is helpful I will take a shot in the dark...

On my HP-UX machine I log into the text console.  This gives me a login shell that sources my .bash_profile.  I then start X11 using 'xinit', which inherits all of the exported variables.

On my Debian GNU/Linux machine I log in using GDM.  Because logging in with XDM, GDM, or KDM doesn't start a login shell, I need to tell it this explicitly.  I use an executable ~/.xsession file.  In it I explicitly tell bash that it is a login shell, which causes my .bash_profile to be sourced.  Subsequent shells inherit the environment.  This is what I use in my personal configuration:

    #!/bin/bash --login
    # exec x-session-manager
    # exec gnome-session
    # exec startkde
    exec fvwm2

Red Hat solves this problem in a better way by invoking the user session as a login shell from the system X start up scripts.  By doing it that way the user doesn't need to worry about it.  Last time I checked, SuSE was a problem because it forced sourcing of the $HOME/.bash_profile (or was it .profile?) regardless of the user shell and then redirected all errors to /dev/null, effectively ignoring them.  Other systems will be similarly different.

In any case I believe you have an answer to your question about why your ~/.bash_profile wasn't being sourced.  It wasn't being sourced because your shell isn't invoked as a login shell and therefore shouldn't source it.

Good luck!
Bob
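(A hedged note for the Ubuntu setup described above, beyond Bob's reply: Ubuntu's graphical login normally sources ~/.profile, and a bash login shell only falls back to ~/.profile when ~/.bash_profile and ~/.bash_login don't exist, so one tidy arrangement is to keep the PATH change in ~/.profile:)

    # ~/.profile -- sourced by Ubuntu's graphical login, and by bash login
    # shells when ~/.bash_profile and ~/.bash_login are absent
    PATH=$PATH:/opt/lampp/bin
    export PATH

    # alternatively, keep ~/.bash_profile but have it defer to ~/.profile:
    #   [ -f ~/.profile ] && . ~/.profile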
(enhancement request) Limiting depth of xtrace enabled
(I'm sending this request to bug-bash since that's what the GNU bash documentation said to do.  I'm also CC'ing the lmod-users community since this would be of interest to them.  I don't know how broadly you would want to reply to this.)

I use "#!/bin/bash -eux" in almost all my scripts, so I can quickly figure out if they are doing anything wrong.  Using "modules", though, I hit a problem: the module commands all print out huge amounts of output from the environment settings they make.  The problem is that "regular" linux commands just print out a single trace line, like

    + ls

but the module command is an alias to a bash function, so all of its contents are traced as well.  My current workaround is to redefine module (Lmod version) as this:

    function module {
        set +o xtrace
        eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
        errcode=$?
        set -o xtrace
        return $errcode
    }

which temporarily suspends the xtrace output and only prints out these extras:

    + module load pgi
    + set +o xtrace
    + return 0
    + module list

I can live with this, but it's still a mess to explain to modules/Lmod users.  As far as I can tell, there's no real workaround in the modules/Lmod implementations, since direct shell operations are the only way they can work.

On the bash side, though, a solution would be to limit the "depth" to which the xtrace setting applies, so I could trace my top-level commands but not the contents of function calls, or only function calls up to a certain depth.  A syntax like this would be ok:

    export XTRACE_DEPTH=1

Would you consider it as an enhancement?
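(A hedged aside: on bash 4.4 or newer, 'local -' makes the xtrace toggle self-restoring when the function returns, which avoids the explicit errcode bookkeeping -- though Lmod may not be able to assume a bash that new:)

    # sketch, bash 4.4+ only: 'local -' restores the shell's set options
    # (including xtrace) automatically when the function returns
    function module {
        local -
        set +o xtrace
        eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
    }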
clarify behavior of /dev/fd redirections in docs
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' -DCONF_VENDOR='unknown' -DLOCALEDIR='/scratch/edquist/git/bash/bash/inst/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I.. -I../include -I../lib -g -O2
uname output: Linux monza.cs.wisc.edu 2.6.32-573.7.1.el6.x86_64 #1 SMP Thu Sep 10 13:42:16 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.3
Patch Level: 46
Release Status: release

Description:
    The docs say:

        Bash handles several filenames specially when they are used in
        redirections, as described in the following table:

        /dev/fd/[fd]
            If [fd] is a valid integer, file descriptor [fd] is duplicated.

    But in fact this is only true when HAVE_DEV_FD is undefined (see redir.c lines 522-524).

    Of course, when HAVE_DEV_FD is defined and the /dev/fd mechanism for opening files is available, the idea is _similar_ to fd duplication, but in fact there are important differences in the semantics:

    1. If [fd] refers to a regular file, open(2)ing /dev/fd/[fd] via "command >/dev/fd/[fd]" will truncate the file, and "command >>/dev/fd/[fd]" will open it for append, but duplicating the file descriptor (eg, with "command >&[fd]") will neither truncate nor append, but will leave the file offset untouched.  (And, moreover, the offset for the duplicated file descriptor will continue to be shared with [fd]'s after further updates.)

    2. Relatedly (and not limited to regular files), it is possible to change the read/write mode of a file descriptor (O_WRONLY -> O_RDONLY) by re-open(2)ing /dev/fd/[fd], but this is not possible when duplicating fd.

    3. Regardless of file type, open(2)ing /dev/fd/[fd] requires appropriate permissions for open(2) to succeed, which is not required for duplicating a file descriptor.

Repeat-By:
    Examples that demonstrate each of the above differences in semantics:

    (1)
    $ exec 5>foo
    $ echo hello >/dev/fd/5
    $ echo goodbye >/dev/fd/5
    $ cat foo
    goodbye
    $

    versus:

    $ exec 6>bar
    $ echo hello >&6
    $ echo goodbye >&6
    $ cat bar
    hello
    goodbye
    $

    (2)
    $ cat baz
    $ chmod 400 baz
    $ echo ... >&7

    versus:

    $ echo ... >/dev/fd/7
    bash: /dev/fd/7: Permission denied

Fix:
    Just fix the docs to clarify that /dev/fd/[fd] is only handled specially in redirections "on systems that do *not* support the /dev/fd method of naming open files" (to use the same language as the Process Substitution subsection in the docs).

    Fixing the docs here seems better than changing the code to match the docs, since (1) surely there are people that rely on the existing semantics (whether they realize it or not), despite it being contrary to the docs, and (2) sometimes (as in example 2 above) it is actually useful to open(2) /dev/fd/[fd] instead of duplicating it, and in any case if actual duplication is desired it can still be done with ">&[fd]" style redirections.
Re: clarify behavior of /dev/fd redirections in docs
Hi Chet, thanks for the reply! These semantic differences do exist with the Linux implementation of /dev/fd, but not with other implementations. The BSD and Mac OS X versions, for instance, behave more or less as if they duplicate the file descriptor. Yes, *BSDs have dup semantics for opening /dev/fd/*, so the behavior is implementation-specific. But the point is that bash in fact does not handle /dev/fd/* (or /dev/std{in,out,err}, for that matter) filenames specially at all when they are provided by the OS, while the docs state without qualification that it does handle them specially. You might say that much is a philosophical argument, which can be made even if all OS implementations gave /dev/fd/* dup semantics on open -- the question is whether or not bash itself is doing anything special with them. But on the practical side, the biggest gotcha on Linux is probably that the files /dev/std{out,err} refer to can get truncated ... for example: some_script_or_function () { echo BEGIN STDERR >/dev/stderr command_with_stdout_and_stderr echo END STDERR >/dev/stderr } some_script_or_function >>out.log 2>>err.log The docs imply dup semantics which would append all stderr to err.log, but the behavior on Linux (because bash does not treat them specially and duplicate the fd, as the docs claim it does) is the final redirection truncates err.log and all you get is the "END STDERR" footer. Apart from that, I do in fact personally use this feature of the Linux implementation to re-open fds with /dev/fd/* with different open modes (both for files and pipes, actually). The docs imply that that shouldn't work, but I am glad that it does in fact work. I suppose it's worth saying something about how bash emulates the files on systems that don't provide the feature themselves. Sure! Yeah, that sounds like it would convey both the intention and the actual behavior. Thanks for your time & consideration! Carl On Tue, 12 Jul 2016, Chet Ramey wrote: On 7/11/16 6:02 PM, Carl Edquist wrote: Bash Version: 4.3 Patch Level: 46 Release Status: release Description: The docs say: Bash handles several filenames specially when they are used in redirections, as described in the following table: /dev/fd/[fd] If [fd] is a valid integer, file descriptor [fd] is duplicated. But in fact this is only true when HAVE_DEV_FD is undefined (see redir.c lines 522-524). Of course, when HAVE_DEV_FD is defined and the /dev/fd mechanism for opening files is available, the idea is _similar_ to fd duplication, but in fact there are important differences in the semantics: These semantic differences do exist with the Linux implementation of /dev/fd, but not with other implementations. The BSD and Mac OS X versions, for instance, behave more or less as if they duplicate the file descriptor. For instance, https://github.com/freebsd/freebsd/blob/master/sys/fs/fdescfs/fdesc_vnops.c; in particular fdesc_open(), which sets things up for dupfdopen(), which duplicates a file descriptor. I suppose it's worth saying something about how bash emulates the files on systems that don't provide the feature themselves. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
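(A hedged illustration of the practical fix for the err.log gotcha described above: duplicate the already-open descriptor with >&2, which always has dup semantics, instead of re-opening /dev/stderr by name:)

    # on Linux, ">/dev/stderr" re-opens (and truncates) whatever file fd 2
    # points at; ">&2" duplicates fd 2 and so preserves append mode/offset
    some_script_or_function () {
        echo "BEGIN STDERR" >&2
        command_with_stdout_and_stderr    # placeholder command from the example above
        echo "END STDERR" >&2
    }

    some_script_or_function >>out.log 2>>err.log   # err.log now keeps everything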
Re: Examples of concurrent coproc usage?
    fdtee () {
        local line fd
        while read -r line; do
            for fd; do printf '%s\n' "$line" >&$fd; done
        done
    }

    coproc UP   { stdbuf -oL tr a-z A-Z; }
    coproc DOWN { stdbuf -oL tr A-Z a-z; }

    # make user-managed backup copies of coproc fds
    exec {up_r}<&${UP[0]} {up_w}>&${UP[1]}
    exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]}

    stdout=1
    coproc THREEWAY { fdtee $stdout $up_w $down_w; }

    # save these too, for safe keeping
    exec {tee_r}<&${THREEWAY[0]} {tee_w}>&${THREEWAY[1]}

Then: (actually works)

    $ echo 'Greetings!' >&$tee_w
    $ read -u $tee_r plain
    $ read -u $up_r upped
    $ read -u $down_r downed
    $ echo "[$plain] [$upped] [$downed]"
    [Greetings!] [GREETINGS!] [greetings!]

This is a pretty trivial example just to demonstrate the concept.  But once you have the freedom to play with it, you find more interesting, useful applications.

Of course, for the above technique to be generally useful, external commands need access to these user-managed fds (copied with exec).  (I have no idea why the coproc shell closes them.)  The shell is crippled when limited to builtins.

(I'll try to tidy up some working examples with my coprocess management library this week, for the curious.)

Juicy thread hey?  I can hardly keep up! :)

Carl
Re: Examples of concurrent coproc usage?
On Mon, 15 Apr 2024, Zachary Santer wrote:

> As much as you can just printf the same thing once for each fd, that
> doesn't work super well for binary data.

Well, you _can_ shovel binary data too: (*)

    while IFS= read -rd '' X; do printf '%s\0' "$X"; done

and use that pattern to make a shell-only version of tee(1) (and I suppose paste(1)).

Binary data doesn't work if you're reading newline-terminated records, because you cannot store the NUL character in a shell variable.  But you can delimit your records on NULs, and use printf to reproduce them.

But the shell is pretty slow when you ask it to shovel data around like this.  The 'read' builtin, for instance, cautiously does read(2) calls of a single byte at a time.  And printf will write(2) each null-terminated record unbuffered.  Meanwhile cat(1) and tee(1) (which read and write larger blocks) are much faster.

But as I said before, the main work of the shell is making it easy to set up pipelines for other (very fast) programs to pass their data around.

[(*) It's slow compared to cat(1) or cp(1), but it does work.  I had to resort to this once in a botched glibc system upgrade that had wrecked my libc symlinks.  A broken libc symlink meant none of the external commands worked at all, so I had to fix everything with only the builtin commands of my existing live root shell.  I was able to copy all the (binary) .so files to another directory (/dev/shm) using the above read/printf loop, with the names fixed, then export LD_LIBRARY_PATH to point to the new copies.  That then made it possible to run external commands again (ln(1), in particular) to actually fix the symlinks.]

> I've thought about splitting and recombining pipelines like this, but
> I've never had a reason to.

"If you build it, they will come." ;)

Carl
Re: Examples of concurrent coproc usage?
On Tue, 16 Apr 2024, Andreas Schwab wrote:

> > But you can delimit your records on NULs, and use printf to reproduce
> > them.
>
> Though that will likely add a spurious null at EOF.

On Tue, 16 Apr 2024, Zachary Santer wrote:

> Just wouldn't copy over whatever might have followed the final null
> byte, if we're not talking about null-terminated data.

You guys are right.  Sorry for glossing over that detail.  Yes, if the file does not end in a NUL byte, the last dangling record still needs to be printed.  You can handle it either way with, for example:

    while IFS= read -rd '' X; do printf '%s\0' "$X"; X=; done
    [[ $X ]] && printf '%s' "$X"

> Might've gotten lucky with all those .so files ending in a null byte
> for whatever reason.

Yes, that is exactly what happened :)  Luckily, on linux anyway, .so files and ELF binaries always seem to end in a null byte.

> There's no way to force this to give you the equivalent of sized
> buffers.  'read -N' obviously has the same problem of trying to store
> the null character in a variable.  So, if you're trying to run this on
> a huge text file, you're going to end up trying to shove that entire
> file into a variable.

Right, that is another reason why it's really not a great solution.  Although you can limit the buffer size with, say, 'read -n 4096', and with a bit more handling[1] still get a perfect copy.

But that's not my point.  My point is, it's one thing to use it in an emergency, but I don't consider it a real usable replacement for cat/tee/paste in general use.  Shoveling data around should really be done by an appropriate external program.

So in my multi-coproc example, the shell is really crippled if the close-on-exec flags prevent external programs from accessing manual copies of other coproc fds.

Carl

[1] eg:

    emergency_maxbuf_cat_monster () (
        maxbuf=${1:-4096}
        fmts=('%s' '%s\0')
        while IFS= read -rd '' -n $maxbuf X; do
            printf "${fmts[${#X} < maxbuf]}" "$X"
            X=
        done
        [[ ! $X ]] || printf '%s' "$X"
    )
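(Pulling those corrections together, a hedged sketch of the shell-only tee(1) mentioned earlier -- it reads NUL-delimited records from stdin and copies them to stdout plus any extra fds given as arguments; the function name is mine, not from the thread:)

    # emergency shell-only "tee" for NUL-delimited records
    tee0_monster () {
        local X fd
        while IFS= read -rd '' X; do
            for fd in 1 "$@"; do printf '%s\0' "$X" >&"$fd"; done
            X=
        done
        if [[ $X ]]; then   # final record that wasn't NUL-terminated
            for fd in 1 "$@"; do printf '%s' "$X" >&"$fd"; done
        fi
    }

    # usage sketch: tee0_monster 3 4 < input > copy0 3> copy1 4> copy2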
Re: Examples of concurrent coproc usage?
On Wed, 17 Apr 2024, Chet Ramey wrote: On 4/16/24 2:46 AM, Carl Edquist wrote: But the shell is pretty slow when you ask it to shovel data around like this. The 'read' builtin, for instance, cautiously does read(2) calls of a single byte at a time. It has to do it that way to find the delimiter on a non-seekable file descriptor, since it has to leave everything it didn't consume available on stdin. Understood, and I wouldn't have it any other way. It's entirely appropriate for reading relatively small amounts of data into shell variables. I'm just saying for copying or filtering a substantial amount of data, it's 1000x better to use a suitable external command instead. It makes me cringe a bit and sigh when I see people put something like while read X; do echo "$X"; done in a script, because they somehow imagine it to be more efficient than simply running "cat" . Carl
Re: Examples of concurrent coproc usage?
On Wed, 17 Apr 2024, Chet Ramey wrote: On 4/15/24 1:01 PM, Carl Edquist wrote: Yet another point brought to light by the bcalc example relates to the coproc pid variable. The reset() function first closes the coproc pipe fds, then sleeps for a second to give the BC coproc some time to finish. An alternative might be to 'wait' for the coproc to finish (likely faster than sleeping for a second). If the coproc has some problem and doesn't exit immediately, `wait' without options will hang. That's why I opted for the sleep/kill-as-insurance combo. Yes that much was clear from the script itself. I didn't mean any of that as a critique of the bcalc script. I just meant it brought to light the point that the coproc pid variable is another thing in the current deallocate-on-terminate behavior, that needs to be copied before it can be used reliably. (With the 'kill' or 'wait' builtins.) Though I do suspect that the most common case with coprocs is that closing the shell's read and write fds to the coproc is enough to cause the coproc to finish promptly - as neither read attempts on its stdin nor write attempts on its stdout can block anymore. I think this is _definitely_ true for the BC coproc in the bcalc example. But it's kind of a distraction to get hung up on that detail, because in the general case there may very well be other scenarios where it would be appropriate to, um, _nudge_ the coproc a bit with the kill command. (And before you ask why I didn't use `wait -n', I wrote bcalc in 30 minutes after someone asked me a question about doing floating point math with awk in a shell script, and it worked.) It's fine! It's just an example, after all :) Carl
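(For reference, the copy-the-pid-variable pattern being described might look like this hedged sketch, using a bc coproc as in bcalc; coproc NAME sets NAME_PID:)

    coproc BC { bc -l; }
    bc_pid=$BC_PID                   # copy now; BC_PID can vanish once BC is reaped

    # ... send work to ${BC[1]}, read results from ${BC[0]} ...

    exec {BC[1]}>&- {BC[0]}<&-       # closing both ends normally lets bc exit
    wait "$bc_pid" 2>/dev/null       # or: kill "$bc_pid" as insurance if it lingers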
Re: Examples of concurrent coproc usage?
On Wed, 17 Apr 2024, Chet Ramey wrote: Yes, I agree that coprocs should survive being suspended. The most recent devel branch push has code to prevent the coproc being reaped if it's stopped and not terminated. Oh, nice! :) Carl
Re: Examples of concurrent coproc usage?
On Thu, 18 Apr 2024, Martin D Kealey wrote: On Wed, 17 Apr 2024, Chet Ramey wrote: It has to do it that way to find the delimiter on a non-seekable file descriptor, since it has to leave everything it didn't consume available on stdin. Has anyone tried asking any of the kernel teams (Linux, BSD, or other) to add a new system call such as readln() or readd()? You mean, specifically in order to implement a slightly-more-efficient 'read' builtin in the shell? I envisage this working like stty cooked mode works on a tty, except it would also work on files, pipes, and sockets: you'd get back *at most* as many bytes as you ask for, but you may get fewer if a delimiter is found. The delimiter is consumed (and returned in the buffer), but everything following a delimiter is left available for a subsequent read. One downside is you'd end up with a system call for each token, which is only a little bit better than the 'read' builtin read(2)'ing 1 byte at a time. If your program / shell script is going to be processing a long stream of tokens, it's just going to be more efficient to read(2) a block at a time and do the tokenizing in userspace. And with any luck you can find an appropriate command line utility to do that for you, rather than relying on the shell's 'read' builtin. (Or for your own programs, use getline(3)/getdelim(3), as Chet mentioned.) Carl
Re: Examples of concurrent coproc usage?
ion and never explicitly closed. When the shell session exits, the fds are closed implicitly by the OS, and the coprocess sees EOF and exits on its own.] That's one common model, yes. Another is that the shell process explicitly sends a close or shutdown command to the coproc, so termination is expected. Right, but here also (after sending a quit command) the conclusion is the same as my point just below - that if the user is expecting the coproc to terminate, and expecting the current behavior that as a result the coproc variable will go away automatically, then that variable is as good as forgotten to the user. If a user expects the coproc variable to go away automatically, that user won't be accessing a still-open fd from that variable for anything. I'm more concerned about a pipe with unread data that would potentially cause problems. I suppose we just need more testing. If I understand you right, you are talking about a scenario like this: - a coproc writes to its output pipe - the coproc terminates - the shell leaves its fd for the read end of this pipe open - there is unread data left sitting in this pipe - [theoretical concern here] Is that right? I can't imagine this possibly leading to deadlock. Either (1) the user has forgotten about this pipe, and never attempts to read from it, or (2) the user attempts to read from this pipe, returning some or all of the data, and possibly hitting EOF, but in any case DOES NOT BLOCK. (I'm sorry if this is basically restating what I've already said earlier.) That's more of a "my arm hurts when I do this" situation. If a script opened 500 fds using exec redirection, resource exhaustion would be their own responsibility. Ha, good! [I had a small fear that fd exhaustion might have been your actual concern.] Meanwhile, the bash man page does not specify the shell's behavior for when a coproc terminates, so you might say there's room for interpretation and the new deferring behavior would not break any promises. I could always enable it in the devel branch and see what happens with the folks who use that. It would be three years after any release when distros would put it into production anyway. Oh, fun :) But since you mention it, writing to a broken pipe is still semantically meaningful also. (I would even say valid.) In the typical case it's expected behavior for a process to get killed when it attempts this and shell pipeline programming is designed with this in mind. You'd be surprised at how often I get requests to put in an internal SIGPIPE handler to avoid problems/shell termination with builtins writing to closed pipes. Ah, well, I get it though. It _is_ a bit jarring to see your shell get blown away with something like this - $ exec 9> >(typo) $ ... $ echo >&9 # Boom! So it does not surprise me that you have some users puzzling over it. But FWIW I do think it is the most consistent & correct behavior. Plus, of course, the user can install their own shell handler code for that case, or downgrade the effect to a non-fatal error with $ trap '' SIGPIPE So even for write attempts, you introduce uncertain behavior by automatically closing the fds, when the normal, predictable, valid thing would be to die by SIGPIPE. Again, you might be surprised at how many people view that as a bug in the shell. I'm not terribly surprised, since at first (before reasoning about it) the behavior is admittedly alarming. 
("What happened to my terminal?!?!") But I'd argue the alternative is worse, because then it's an unpredictable race between SIGPIPE (which they're complaining about) and EBADF. I think we're talking about our different interpretations of `invalid' (EBADF as opposed to EPIPE/SIGPIPE). Right - just explaining; I think by now we are on the same page. My original intention for the coprocs (and Korn's from whence they came) was that the shell would be in the middle -- it's another way for the shell to do IPC. And coprocesses are great for this, too! It's just that external commands in a sense are extensions of the shell. The arms and legs, you might say, for doing the heavy lifting. Carl
Re: Examples of concurrent coproc usage?
On Mon, 22 Apr 2024, Chet Ramey wrote: You might be surprised. The OP was sending thousands of calculations to (I think) GNU bc, which had some resource consumption issue that resulted in it eventually hanging, unresponsive. The kill was the solution there. I imagine there are similar scenarios with other tools. Ok, you got me! I take it back. I hadn't considered bc operations being cpu/memory intensive. But that possibility makes sense - given that it's arbitrary precision I guess you can ask for a number to the billionth power and never see the end of it :) Carl
bash 3.2.0 broken ???
Hello folkz ?! please check out if bash 3.2.0 is broken ??? as root after TAB-TAB only 2178 possibilities are shown ??? Should be about 3.300 possibilities, or not ??? Greetings. Carl.
regex in bash expression doesn't work
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fwrapv
uname output: Linux in 2.6.32-220.17.1.el6.x86_64 #1 SMP Wed May 16 00:01:37 BST 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-redhat-linux-gnu

Bash Version: 4.1
Patch Level: 2
Release Status: release

Description:
    [[ 'foobar' =~ 'o.a' ]] && echo FOUND
    should print FOUND but it doesn't.  This has worked in previous versions of bash.  The bash used comes with CentOS 6.2.

Repeat-By:
    No previous commands necessary to generate this problem.

Fix:
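(A hedged note on the likely cause, since no reply appears in this archive: since bash 3.2, any quoted part of the right-hand side of =~ is matched as a literal string, so the quoted 'o.a' here only matches the literal text o.a rather than a regular expression.  Leaving the pattern unquoted, or putting it in a variable, restores regex matching:)

    # quoted pattern is taken literally (bash 3.2+); unquoted is a regex
    [[ foobar =~ o.a ]] && echo FOUND        # prints FOUND

    re='o.a'
    [[ foobar =~ $re ]] && echo FOUND        # also prints FOUND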
Re: Examples of concurrent coproc usage?
On Mon, 1 Apr 2024, Chet Ramey wrote:

> On 4/1/24 3:24 PM, Chet Ramey wrote:
>
>> On 3/14/24 5:58 AM, Carl Edquist wrote:
>>
>>> Well, *without multi-coproc support*, here's a simple wc example; first
>>> with a single coproc:
>>>
>>>     $ coproc WC { wc; }
>>>     $ exec {WC[1]}>&-
>>>     $ read -u ${WC[0]} X
>>>     $ echo $X
>>>     0 0 0
>>>
>>> This works as expected.  But if you try it with a second coproc (again,
>>> without multi-coproc support), the second coproc will inherit copies of
>>> the shell's read and write pipe fds to the first coproc, and the read
>>> will hang (as described above), as the first coproc doesn't see EOF:
>>>
>>>     $ coproc WC { wc; }
>>>     $ coproc CAT { cat; }
>>>     $ exec {WC[1]}>&-
>>>     $ read -u ${WC[0]} X   # HANGS
>>>
>>> But, this can be observed even before attempting the read that hangs.
>>
>> Let's see if we can tackle these one at a time.  This seems like it
>> would be pretty easy to fix if a coproc closed the fds corresponding to
>> an existing coproc in the child after the fork.  That wouldn't really
>> change anything regarding how scripts have to manually manage multiple
>> coprocs, but it will prevent the shell from hanging.
>
> I sent this before I was ready.  This would be equivalent to changing the
> commands to use
>
>     coproc CAT { exec {WC[0]}<&- {WC[1]}>&- ; cat; }
>
> but the script writer wouldn't have to manage it.

Agreed.  And just to note two things (in case it wasn't clear) - (1) the above example that hangs is with the default bash, compiled _without_ multi-coproc support; and (2):

> This seems like it would be pretty easy to fix if a coproc closed the
> fds corresponding to an existing coproc in the child after the fork

the forked coproc has to close its fds to/from _all_ other existing coprocs (as there can be several).

Carl
Re: Examples of concurrent coproc usage?
On Mon, 8 Apr 2024, Chet Ramey wrote: On 4/4/24 8:52 AM, Carl Edquist wrote: Zack illustrated basically the same point with his example: exec {fd}< <( some command ) while IFS='' read -r line <&"${fd}"; do # do stuff done {fd}<&- A process-substitution open to the shell like this is effectively a one-ended coproc (though not in the jobs list), and it behaves reliably here because the user can count on {fd} to remain open even after the child process terminates. That exposes the fundamental difference. The procsub is essentially the same kind of object as a coproc, but it exposes the pipe endpoint(s) as filenames. The shell maintains open file descriptors to the child process whose input or output it exposes as a FIFO or a file in /dev/fd, since you have to have a reader and a writer. The shell closes the file descriptor and, if necessary, removes the FIFO when the command for which that was one of the word expansions (or a redirection) completes. coprocs are designed to be longer-lived, and not associated with a particular command or redirection. But the important piece is that $fd is not the file descriptor the shell keeps open to the procsub -- it's a new file descriptor, dup'd from the original by the redirection. Since it was used with `exec', it persists until the script explicitly closes it. It doesn't matter when the shell reaps the procsub and closes the file descriptor(s) -- the copy in $fd remains until the script explicitly closes it. You might get read returning failure at some point, but the shell won't close $fd for you. Since procsubs expand to filenames, even opening them is sufficient to give you a new file descriptor (with the usual caveats about how different OSs handle the /dev/fd device). You can do this yourself with coprocs right now, with no changes to the shell. So, the user can determine when the coproc fds are no longer needed, whether that's when EOF is hit trying to read from the coproc, or whatever other condition. Duplicating the file descriptor will do that for you. Thanks for the explanation, that all makes sense. One technical difference in my mind is that doing this with a procsub is reliably safe: exec {fd}< <( some command ) since the expanded pathname (/dev/fd/N or the fifo alternative) will stay around for the duration of the exec command, so there is no concern about whether or not the dup redirection will succeed. Where with a coproc coproc X { potentially short lived command with output; } exec {xr}<&${X[0]} {xw}>&${X[1]} there is technically the possibility that the coproc can finish and be reaped before the exec command gets a chance to run and duplicate the fds. But, I also get what you said, that your design intent with coprocs was for them to be longer-lived, so immediate termination was not a concern. Personally I like the idea of 'closing' a coproc explicitly, but if it's a bother to add options to the coproc keyword, then I would say just let the user be responsible for closing the fds. Once the coproc has terminated _and_ the coproc's fds are closed, then the coproc can be deallocated. This is not backwards compatible. coprocs may be a little-used feature, but you're adding a burden on the shell programmer that wasn't there previously. Ok, so, I'm trying to imagine a case where this would cause any problems or extra work for such an existing user. Maybe you can provide an example from your own uses? (Where it would cause trouble or require adding code if the coproc deallocation were deferred until the fds are closed explicitly.) 
My first thought is that in the general case, the user doesn't really need to worry much about closing the fds for a terminated coproc anyway, as they will all be closed implicitly when the shell exits (either an interactive session or a script). [This is a common model for using coprocs, by the way, where an auxiliary coprocess is left open for the lifetime of the shell session and never explicitly closed. When the shell session exits, the fds are closed implicitly by the OS, and the coprocess sees EOF and exits on its own.] If a user expects the coproc variable to go away automatically, that user won't be accessing a still-open fd from that variable for anything. As for the forgotten-about half-closed pipe fds to the reaped coproc, I don't see how they could lead to deadlock, nor do I see how a shell programmer expecting the existing behavior would even attempt to access them at all, apart from programming error. The only potential issue I can imagine is if a script (or a user at an interactive prompt) would start _so_ many of these longer-lived coprocs (more than 500??), one at a time in succession, in a single shell session, that
Re: Examples of concurrent coproc usage?
ds open across exec; though for now I will probably continue setting up pipes in the shell by methods other than the coproc keyword. Cheers, Carl
Re: Examples of concurrent coproc usage?
On Mon, 22 Apr 2024, Martin D Kealey wrote: On Sun, 21 Apr 2024, 10:13 Carl Edquist, wrote: You mean, specifically in order to implement a slightly-more-efficient 'read' builtin in the shell? The read built-in in the shell is only one case that would benefit from such a syscall. The purpose would be to allow multiple processes to read in turn from a consumable (or otherwise non seekable) input stream. In this context doing a large block read() is exactly what we DON'T want to do, so we also can't use a library function such as getline() that is built on top of such a read(). By way of example, another use would be the "head" utility, which by using such a syscall could consume only the bytes it outputs, leaving all other bytes still in the input stream. This would be an improvement over the current situation. Basically any time you have cooperating processes reading delimited input, this would be an improvement. Makes sense! I envisage this working like stty cooked mode works on a tty, … One downside is you'd end up with a system call for each token That's not how stty cooked mode normally works. The typical use case is line-at-a-time, so this would reduce the number of system calls by about 90% on a typical text input stream, more if there are few or no blank lines. However I would not hard code "newline" into the kernel, but rather allow the user code to nominate a list of delimiters. When I say "token" I just mean a record with whatever delimiter you're using. Assuming the reading stops after consuming the first delimiter (which is necessary for the 'read' builtin), then you end up with one system call per line or record or token or whatever you want to call it. A line at a time is an improvement over a byte at a time, but if/when you can find a way to do what you want with the default block buffering & userspace tokenizing, you'll reduce the number of system calls by 99.99%. (Which is one reason I was saying the shell is crippled when limited to builtins; eg, a read/printf loop compared to simply running cat.) ... But yeah currently a pipe with a series of records and multiple cooperating/competing readers perhaps only works if the records have a fixed size. A new readd[elim] system call like you're talking about would allow safely reading a single variable-length record at a time. So by all means, feel free to take the idea to your friendly neighborhood kernel community, if you'd like to pursue it ;) Carl
Re: Examples of concurrent coproc usage?
On Mon, 22 Apr 2024, Martin D Kealey wrote: On Mon, 22 Apr 2024, 09:17 Carl Edquist, wrote: […] I was saying the shell is crippled when limited to builtins; eg, a read/printf loop compared to simply running cat. I would hope that mapfile/readarray could do better, since it's not obligated to leave anything in the input stream. That is an interesting thought, although mapfile seems to read a byte at a time also. [I'm not suggesting this should be changed though. And frankly I have no strong desire for a faster 'read' builtin in the shell, either. A byte at a time is relatively slow, but that's generally fine in most limited contexts where it's actually needed.] But yeah currently a pipe with a series of records and multiple cooperating/competing readers perhaps only works if the records have a fixed size. A new readd[elim] system call like you're talking about would allow safely reading a single variable-length record at a time. There are other options, such as length-prefixed records, or tagged (typed) records, but of course those aren't POSIX text files. That'd work for "cooperating" readers (as you put it) where they are not attempting to read at the same time. Though reading a single byte at a time also works in that case. My thought is that (performance considerations aside), the real functional improvement with a new "readd" call would be with _competing_ readers (more than one read call waiting on the same pipe at the same time). In that case a length-prefixed or type-tagged record wouldn't seem to work with the regular read(2), because a single reader would not be able to read the length/type _and_ the corresponding record together. You can't work around this by reading a byte at a time either. That's why I said it would only seem to work (with read(2)) if the records have a fixed size. (In order to grab a whole record atomically.) But a new "readd" call would allow multiple competing readers to read, say, a stream of filenames from a pipe, without having to pad each one to PATH_MAX bytes. It seems that if there is only one reader at a given time though ("cooperating"), then it's just a matter of performance between read(2)'ing one byte at a time vs using a new readd call. ... I'm not trying to advocate for or against you contacting the kernel folks with your idea; it just seems to me that the scenario with multiple competing readers might be the strongest argument for it. Carl