\W and \w don't substitute $HOME for ~ properly

2006-09-30 Thread carl
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: x86_64-pc-linux-gnu-gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -march=k8 -O2 -pipe
uname output: Linux chicago 2.6.17-gentoo-r8 #1 PREEMPT Fri Sep 15 12:15:25 
CEST 2006 x86_64 AMD Turion(tm) 64 Mobile Technology ML-30 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 3.1
Patch Level: 17
Release Status: release

Description:
If a user's home directory is specified with a trailing
slash in /etc/passwd, such as /home/johndoe/ instead of
/home/johndoe, prompting doesn't work properly.

If PS1 (and perhaps other prompt strings) contains \w or \W,
which should be substituted with a tilde while the user is in
his/her home directory, the substitution doesn't happen.
Instead the prompt shows "johndoe" or equivalent.

Repeat-By:
export PS1='\[\033[01;35m\] \W \[\033[0;35m\]\$ \[\033[00m\]'
and set your home directory to whatever it already is, but
with a trailing slash.
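
(For example, assuming root access, that usermod(8) is available, and using
johndoe as an illustrative user, something like:)

# as root, with johndoe logged out:
usermod -d /home/johndoe/ johndoe
# then log in as johndoe and observe that \W in the prompt shows "johndoe"
# instead of ~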

Fix:
Since there is no definite convention as to whether a directory
should be specified with or without a trailing slash, the best
practice would be to interpret both forms as referring to the
same directory.




'command' builtin doesn't seem to work for external 'printf'

2021-05-17 Thread Carl Thompson
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt 
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin' 
-DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.b
ashrc' -DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS 
uname output: Linux clip 5.10.26-brunch-sebanc #1 SMP PREEMPT Wed Apr 7 
21:32:32 CEST 2021 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 4
Release Status: release

Description:
'command' builtin doesn't seem to work for external 'printf'. It appears to use 
the builtin.

---
[cet@clip ~]$ printf
printf: usage: printf [-v var] format [arguments]

[cet@clip ~]$ type -P printf
/usr/bin/printf

[cet@clip ~]$ /usr/bin/printf
/usr/bin/printf: missing operand
Try '/usr/bin/printf --help' for more information.

[cet@clip ~]$ command printf
printf: usage: printf [-v var] format [arguments]
---

I believe that running 'command printf' should yield the same result as running 
'/usr/bin/printf'

Repeat-By:
(See above)


Re: 'command' builtin doesn't seem to work for external 'printf'

2021-05-17 Thread Carl Thompson
Nevermind! I was misunderstanding what the 'command' builtin does!
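
(For anyone who finds this thread later: 'command' only bypasses shell
functions and aliases, not builtins, so 'command printf' still runs the
builtin.  A few ways that do force the external utility:)

/usr/bin/printf '%s\n' hi     # full path
env printf '%s\n' hi          # env always runs the external command
enable -n printf              # or disable the builtin for this shell;
printf '%s\n' hi              # printf then resolves via $PATH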

Thank you,
Carl



Re: Examples of concurrent coproc usage?

2024-03-14 Thread Carl Edquist
[My apologies up front for the length of this email.  The short story is I 
played around with the multi-coproc support: the fd closing seems to work 
fine to prevent deadlock, but I found one bug apparently introduced with 
multi-coproc support, and one other coproc bug that is not new.]


On Mon, 11 Mar 2024, Zachary Santer wrote:


Was "RFE: enable buffering on null-terminated data"

On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist  wrote:


(Kind of a side-note ... bash's limited coprocess handling was a long 
standing annoyance for me in the past, to the point that I wrote a bash 
coprocess management library to handle multiple active coprocesses and 
give convenient methods for interaction.  Perhaps the trickiest bit 
about multiple coprocesses open at once (which I suspect is the reason 
support was never added to bash) is that you don't want the second and 
subsequent coprocesses to inherit the pipe fds of prior open 
coprocesses.  This can result in deadlock if, for instance, you close 
your write end to coproc1, but coproc1 continues to wait for input 
because coproc2 also has a copy of a write end of the pipe to coproc1's 
input.  So you need to be smart about subsequent coprocesses first 
closing all fds associated with other coprocesses.


https://lists.gnu.org/archive/html/help-bash/2021-03/msg00296.html
https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html


Oh hey!  Look at that.  Thanks for the links to this thread - I gave them 
a read (along with the old thread from 2011-04).  I feel a little bad I 
missed the 2021 discussion.



You're on the money, though there is a preprocessor directive you can 
build bash with that will allow it to handle multiple concurrent 
coprocesses without complaining: MULTIPLE_COPROCS=1.


Who knew!  Thanks for mentioning it.  When I saw that "only one active 
coprocess at a time" was _still_ listed in the bugs section in bash 5, I 
figured multiple coprocess support had just been abandoned.  Chet, that's 
cool that you implemented it.
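
[For anyone who wants to try it, a rough sketch of one way to build a test
bash with that flag, assuming a checkout of the bash git repository:]

git clone https://git.savannah.gnu.org/git/bash.git
cd bash
./configure CFLAGS='-g -O2 -DMULTIPLE_COPROCS=1'
make
# then run the freshly built ./bash and play with multiple coprocs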


I kind of went all-out on my bash coprocess management library though 
(mostly back in 2014-2016) ... It's pretty feature-rich and pleasant to 
use -- to the point that I don't think there is any going-back to bash's 
internal coproc for me, even with multiple coprocess support.  I 
implemented it with shell functions, so it doesn't rely on compiling 
anything or the latest version of bash being present.  (I even added bash3 
support for older systems.)


Chet Ramey's sticking point was that he hadn't seen coprocesses used 
enough in the wild to satisfactorily test that his implementation did in 
fact keep the coproc file descriptors out of subshells.


To be fair coproc is kind of a niche feature.  But I think more people 
would play with it if it were less awkward to use and if they felt free to 
experiment with multiple coprocs.


By the way, I agree with Chet's exact description of the problems 
here:


https://lists.gnu.org/archive/html/help-bash/2021-03/msg00282.html

The issue is separate from the stdio buffering discussion; the issue here 
is with child processes (and I think not foreground subshells, but 
specifically background processes, including coprocesses) inheriting the 
shell's fds that are open to pipes connected to an active coprocess.


Not getting a sigpipe/write failure results in a coprocess sitting around 
longer than it ought to, but it's not obvious (to me) how this leads to 
deadlock, since the shell has at least closed its read end of the pipe to 
that coprocess, so you aren't going to hang trying to read from it.


On the other hand, a coprocess not seeing EOF will cause deadlock pretty 
readily, especially if it processes all its input before producing output 
(as with wc, sort, sha1sum).  Trying to read from the coprocess will hang 
indefinitely if the coprocess is still waiting for input, which is the 
case if there is another copy of the write end of its read pipe open 
somewhere.




If you've got examples you can direct him to, I'd really appreciate it.


[My original use cases for multiple coprocesses were (1) for 
programmatically interacting with multiple command-line database clients 
together, and (2) for talking to multiple interactive command-line game 
engines (othello) to play each other.


Perl's IPC::Open2 works, too, but it's easier to experiment on the fly in 
bash.


And in general having the freedom to play with multiple coprocesses helps 
mock up more complicated pipelines, or even webs of interconnected 
processes.]


But you can create a deadlock without doing anything fancy.


Well, *without multi-coproc support*, here's a simple wc example; first 
with a single coproc:


$ coproc WC { wc; }
$ exec {WC[1]}>&-
$ read -u ${WC[0]} X
$ echo $X
0 0 0

This works as expected.

But 

Re: Examples of concurrent coproc usage?

2024-04-04 Thread Carl Edquist

Hi Chet, thanks for taking the time to review this  :D

[My apologies again upfront for another lengthy (comprehensive?) email.]


On Wed, 3 Apr 2024, Chet Ramey wrote:


On 4/2/24 12:22 PM, Carl Edquist wrote:


 the forked coproc has to close its fds to/from _all_ other existing
 coprocs (as there can be several).


And there is the issue. Without multi-coproc support, the shell only 
keeps track of one coproc at a time, so there's only one set of pipe 
file descriptors to close.


Right, exactly.  The example with the default build (showing the essential 
case that causes deadlock) was to highlight that your multi-coproc support 
code apparently does indeed correctly track and close all these fds, and 
thus prevents the deadlock issue.



On Wed, 3 Apr 2024, Chet Ramey wrote:

It's straightforward: the coproc process terminates, the shell reaps it, 
marks it as dead, notifies the user that the process exited, and reaps 
it before printing the next prompt. I don't observe any different 
behavior between the default and when compiled for multiple coprocs.


It depends on when the process terminates as to whether you get a prompt 
back and need to run an additional command before reaping the coproc 
(macOS, RHEL), which gives you the opportunity to run the `read' 
command:


Ah, my mistake then - thanks for explaining.  I must have been thrown off 
by the timing, running it with and without an intervening interactive 
prompt before the read command.  When run interactively, an extra 'Enter' 
(or not) before the read command changes the behavior.


So in that case, this issue (that the shell closes its read-end of the 
pipe from a reaped coproc, potentially before being able to read the final 
output) was already there and is not specific to the multi-coproc code.


But in any case, it seems like this is a race then?  That is, whether the 
child process terminates before or after the prompt in question.



$ coproc WC { wc; }
[1] 48057
$ exec {WC[1]}>&-
$ read -u ${WC[0]} X
[1]+  Donecoproc WC { wc; }
bash: DEBUG warning: cpl_reap: deleting 48057
$ echo $X
0 0 0

(I put in a trace statement to show exactly when the coproc gets reaped and
deallocated.)


Thanks! (for taking the time to play with this)

Though apparently it's still a race here.  If you diagram the shell and 
coproc (child) processes, I think you'll see that your DEBUG statement can 
also happen _before_ the read command, which would then fail.  You can 
contrive this by adding a small sleep (eg, 0.1s) at the end of 
execute_builtin_or_function (in execute_cmd.c), just before it returns.


Eg:

diff --git a/execute_cmd.c b/execute_cmd.c
index ed1063e..c72f322 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -5535,6 +5535,7 @@ execute_builtin_or_function (words, builtin, var, redirects,
   discard_unwind_frame ("saved_fifos");
 #endif
 
+  usleep (100*1000);   /* ~0.1s */
   return (result);
 }


If I do this, I consistently see "read: X: invalid file descriptor 
specification" running the above 4-line "coproc WC" example in a script, 
demonstrating that there is no guarantee that the read command will start 
before the WC coproc is reaped and {WC[0]} is closed, even though it's the 
next statement after 'exec {WC[1]}>&-'.


But (as I'll try to show) you can trip up on this race even without 
slowing down bash itself artificially.



I can't reproduce your results with non-interactive shells, either, with 
job control enabled or disabled.


That's fair; let's try it with a script:

$ cat cope.sh
#!/bin/bash

coproc WC { wc; }
jobs
exec {WC[1]}>&-
[[ $1 ]] && sleep "$1"
jobs
read -u ${WC[0]} X
echo $X


Run without sleep, the wc output is seen:

$ ./cope.sh
[1]+  Running coproc WC { wc; } &
[1]+  Running coproc WC { wc; } &
0 0 0


Run with a brief sleep after closing the write end, and it breaks:

$ ./cope.sh .1
[1]+  Running coproc WC { wc; } &
[1]+  Donecoproc WC { wc; }
./cope.sh: line 8: read: X: invalid file descriptor specification


And, if I run with "0" for a sleep time, it intermittently behaves like 
either of the above.  Racy!



This is a bug.  The shell should not automatically close its read pipe 
to a coprocess that has terminated -- it should stay open to read the 
final output, and the user should be responsible for closing the read 
end explicitly.


How long should the shell defer deallocating the coproc after the 
process terminates?


I only offer my opinion here, but it strikes me that it definitely should 
_not_ be based on an amount of _time_.  That's inhere

Re: PATH value doesn't get updated

2008-05-12 Thread Carl Wenrich
Yes. I (1) made the change to .bash_profile, then (2) restarted the machine, 
then (3) logged in again. When I echo $PATH the /opt/lampp/bin is not included.

Bob Proulx <[EMAIL PROTECTED]> wrote: carlwenrich wrote:
> I put this in my .bash_profile:
> 
> PATH=$PATH:/opt/lampp/bin
> export PATH
> 
> but when I "echo $PATH" it doesn't include the /opt/lampp/bin.

The .bash_profile is sourced by bash when it is invoked as an
interactive login shell.  Therefore you would need to log in after
having made that change in order for bash to read the file and the
effect to be seen.  Did you log in after having made that change?  Was
bash invoked as an interactive login shell so that it would read that
file?

  echo $0
  echo $-

Bob



Re: PATH value doesn't get updated

2008-05-12 Thread Carl Wenrich
echo $0 gives me "bash"
echo $- gives me "himBH"

Bob Proulx <[EMAIL PROTECTED]> wrote: Carl Wenrich wrote:
> Bob Proulx wrote:
> > Did you log in after having made that change?  Was bash invoked as
> > an interactive login shell so that it would read that file?
> > 
> >   echo $0
> >   echo $-
>
> Yes. I (1) made the change to .bash_profile, then (2) restarted the
> machine, then (3) logged in again. When I echo $PATH the
> /opt/lampp/bin is not included.

That second question was also very important.  Without that
information I don't know if I should suggest one action or a different
action.

  Was bash invoked as an interactive login shell so that it would read
  that file?

The way I tell is by running those two commands that I suggested
running.  What do you get for the following two commands?

  echo $0
  echo $-

The $0 is the name used to invoke the shell.  If it starts with a '-'
then this is used to instruct the shell that it is a login shell.  The
second variable $- is the flags set to the shell.  The 'i' for
interactive should be in there.

Example: NOT a login shell, will NOT source .bash_profile:

  $ echo $0
  bash
  $ echo $-
  himBHP
  $

Example: Yes, a login shell:

  $ echo $0
  -bash
  $ echo $-
  himBHP
  $

And of course the bash "-l" and "--login" options will override this
default behavior.

If it is not a login shell then to suggest improvements it would be
necessary to know the type of system you are using and how you are
logging into it.  There are many possibilities and I can't guess which
one you might be using.  You didn't say in your messages.

By the way... It is not necessary to restart your system.  That is way
too much.  Simply log in again to have profile file changes take
effect.

Bob



Re: PATH value doesn't get updated

2008-05-12 Thread Carl Wenrich
I just log into the box that appears on the standard ubuntu startup. I enter my 
username and password, then the desktop comes up.

I see now that the .bash_profile isn't being sourced (I thought it was 
according to what I've been able to pick up on the web). If I source it 
manually, the $PATH gets updated.

Why does ubuntu provide the .bash_profile when a user is created, and then not 
source it when the system starts up? And since it doesn't, what do I change to 
make it happen?

Bob Proulx <[EMAIL PROTECTED]> wrote: Carl Wenrich wrote:
> echo $0 gives me "bash"
> echo $- gives me "himBH"

Then bash hasn't been invoked as a login shell and therefore isn't
instructed to source the .bash_profile.

> If it is not a login shell then to suggest improvements it would be
> necessary to know the type of system you are using and how you are
> logging into it.  There are many possibilities and I can't guess which
> one you might be using.  You didn't say in your messages.

You have yet to say how are you are logging into your machine.  There
are many possibilities and without information it is impossible to
guess.

In the hope that it is helpful I will take a shot in the dark...

On my HP-UX machine I log into the text console.  This gives me a
login shell that sources my .bash_profile.  I then start X11 using
'xinit' which inherits all of the exported variables.  On my Debian
GNU/Linux machine I log in using GDM.  Because logging in with XDM, or
GDM, or KDM doesn't start a login shell I need to tell it this
explicitly.  I use an executable ~/.xsession file.  In it I explicitly
tell bash that it is a login shell which causes my .bash_profile to be
sourced.  Subsequent shells inherit the environment.  This is what I
use in my personal configuration:

  #!/bin/bash --login
  # exec x-session-manager
  # exec gnome-session
  # exec startkde
  exec fvwm2

Red Hat solves this problem in a better way by invoking the user
session as a login shell from the system X start up scripts.  By doing
it that way the user doesn't need to worry about it.  Last time I
checked SuSE was a problem because it forced sourcing of the
$HOME/.bash_profile (or was it .profile?)  regardless of the user
shell and then redirected all errors to /dev/null effectively ignoring
them.  Other systems will be similarly different.

In any case I believe you have an answer to your question about why
your ~/.bash_profile wasn't being sourced.  It wasn't being sourced
because your shell isn't invoked as a login shell and therefore
shouldn't source it.

Good luck!

Bob
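
[A footnote for the Ubuntu case above: the graphical login doesn't start a
login bash, but its session scripts normally source ~/.profile, so another
option, assuming the default Ubuntu setup, is to put the PATH addition in
~/.profile and log in again:]

# ~/.profile
PATH=$PATH:/opt/lampp/bin
export PATH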



(enhancement request) Limiting depth of xtrace enabled

2018-05-05 Thread Carl Ponder
(I'm sending this request to bug-bash since that's what the GNU/bash 
documentation said to do.
I'm also CC'ing the lmod-users community since this would be of interest 
to them.

I don't know how broadly you would want to reply to this.)

I use #!/bin/bash -eux in almost all my scripts, so I can quickly figure out 
if they're doing anything wrong.
Using "modules", though, I hit a problem that the module commands all 
print out huge amounts of output from the environment settings they make.

The problem is that "regular" linux commands just print out like

   + ls

but the module command is an alias to a bash function, so all its 
contents are traced as well.

My current workaround is to redefine module (Lmod version) as this

   function module
   {
       set +o xtrace
       eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
       errcode=$?
       set -o xtrace
       return $errcode
   }

which temporarily suspends the xtrace output and only prints out these 
extras


   + module load pgi
   + set +o xtrace
   + return 0
   + module list

I can live with this, but it's still a mess to explain to modules/Lmod 
users.
As far as I can tell, there's no real workaround in the modules/Lmod 
implementations since direct shell operations are the only way they can 
work.
On the bash side, though, a solution would be to limit the "depth" to 
which the xtrace setting would apply, so I could trace my top-level 
commands but not the contents of function calls, or only function calls 
up to a certain depth.

A syntax like this would be ok

export XTRACE_DEPTH=1

Would you consider it as an enhancement?
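
[A footnote for newer shells: since bash 4.4, 'local -' inside a function
makes the set options local to that function, so they are restored
automatically on return.  An untested sketch of a slightly tidier variant of
the workaround above:]

   function module
   {
       local -       # bash 4.4+: set options restored automatically on return
       set +o xtrace
       eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
   }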



clarify behavior of /dev/fd redirections in docs

2016-07-12 Thread Carl Edquist

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' 
-DCONF_VENDOR='unknown' 
-DLOCALEDIR='/scratch/edquist/git/bash/bash/inst/share/locale' -DPACKAGE='bash' 
-DSHELL -DHAVE_CONFIG_H   -I.  -I.. -I../include -I../lib   -g -O2
uname output: Linux monza.cs.wisc.edu 2.6.32-573.7.1.el6.x86_64 #1 SMP Thu Sep 
10 13:42:16 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.3
Patch Level: 46
Release Status: release

Description:
The docs say:

Bash handles several filenames specially when they are
used in redirections, as described in the following table:

/dev/fd/[fd]
If [fd] is a valid integer, file descriptor [fd]
is duplicated.

But in fact this is only true when HAVE_DEV_FD is undefined (see
redir.c lines 522-524).

Of course, when HAVE_DEV_FD is defined and the /dev/fd mechanism
for opening files is available, the idea is _similar_ to fd
duplication, but in fact there are important differences in the
semantics:

1. if [fd] refers to a regular file, open(2)ing /dev/fd/[fd] via
"command >/dev/fd/[fd]" will truncate the file, and
"command >>/dev/fd/[fd]" will open it for append, but duplicating
the file descriptor (eg, with "command >&[fd]") will neither
truncate nor append, but will leave the file offset untouched.
(And, moreover, the offset for the duplicated file descriptor will
continue to be shared with [fd]'s after further updates.)

2. relatedly, (and not limited to regular files), it is possible
to change the read/write mode of a file descriptor (O_WRONLY ->
O_RDONLY) by re-open(2)ing /dev/fd/[fd], but this is not possible
when duplicating fd.

3. regardless of file type, open(2)ing /dev/fd/[fd] requires
appropriate permissions for open(2) to succeed, which is not
required for duplicating a file descriptor.

Repeat-By:
Examples that demonstrate each of the above differences in
semantics:

(1)
$ exec 5>foo
$ echo hello >/dev/fd/5
$ echo goodbye >/dev/fd/5
$ cat foo
goodbye
$

versus:

$ exec 6>bar
$ echo hello >&6
$ echo goodbye >&6
$ cat bar
hello
goodbye
$

(2)
$ cat baz
$ exec 7>baz
$ chmod 400 baz
$ echo ... >&7

versus:

$ echo ... >/dev/fd/7
bash: /dev/fd/7: Permission denied
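
And a demonstration of difference 2, which works on Linux ('quux' here
is just an illustrative file):

$ printf 'hello\n' >quux
$ exec 8>>quux            # fd 8 is open write-only (append)
$ exec 9</dev/fd/8        # re-opens the same file, this time read-only
$ read -u 9 line && echo "$line"
hello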

Fix:
Just fix the docs to clarify that /dev/fd/[fd] is only handled
specially in redirections "on systems that do *not* support the
/dev/fd method of naming open files" (to use the same language as
the Process Substitution subsection in the docs).

Fixing the docs here seems better than changing the code to match
the docs, since (1) surely there are people that rely on the
existing semantics (whether they realize it or not), despite it
being contrary to the docs, and (2) sometimes (as in example 2
above) it is actually useful to open(2) /dev/fd/[fd] instead of
duplicating it, and in any case if actual duplication is desired
it can still be done with ">&[fd]" style redirections.



Re: clarify behavior of /dev/fd redirections in docs

2016-07-12 Thread Carl Edquist

Hi Chet, thanks for the reply!


These semantic differences do exist with the Linux implementation of
/dev/fd, but not with other implementations.  The BSD and Mac OS X
versions, for instance, behave more or less as if they duplicate the
file descriptor.


Yes, *BSDs have dup semantics for opening /dev/fd/*, so the behavior is 
implementation-specific.  But the point is that bash in fact does not 
handle /dev/fd/* (or /dev/std{in,out,err}, for that matter) filenames 
specially at all when they are provided by the OS, while the docs state 
without qualification that it does handle them specially.


You might say that much is a philosophical argument, which can be made 
even if all OS implementations gave /dev/fd/* dup semantics on open -- the 
question is whether or not bash itself is doing anything special with 
them.


But on the practical side, the biggest gotcha on Linux is probably that 
the files /dev/std{out,err} refer to can get truncated ... for example:


some_script_or_function () {
    echo BEGIN STDERR >/dev/stderr

    command_with_stdout_and_stderr

    echo END STDERR >/dev/stderr
}

some_script_or_function >>out.log 2>>err.log

The docs imply dup semantics, which would append all stderr to err.log, but 
the behavior on Linux (because bash does not treat these names specially and 
duplicate the fd, as the docs claim it does) is that the final redirection 
truncates err.log and all you get is the "END STDERR" footer.
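
(The usual way to avoid that particular gotcha, for what it's worth, is to
write to the already-open descriptor instead of re-opening the special file:)

some_script_or_function () {
    echo BEGIN STDERR >&2     # dup semantics: no re-open, no truncation

    command_with_stdout_and_stderr

    echo END STDERR >&2
}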



Apart from that, I do in fact personally use this feature of the Linux 
implementation to re-open fds with /dev/fd/* with different open modes 
(both for files and pipes, actually).  The docs imply that that shouldn't 
work, but I am glad that it does in fact work.



I suppose it's worth saying something about how bash emulates the files 
on systems that don't provide the feature themselves.


Sure!  Yeah, that sounds like it would convey both the intention and the 
actual behavior.


Thanks for your time & consideration!

Carl

On Tue, 12 Jul 2016, Chet Ramey wrote:


On 7/11/16 6:02 PM, Carl Edquist wrote:


Bash Version: 4.3
Patch Level: 46
Release Status: release

Description:
The docs say:

Bash handles several filenames specially when they are
used in redirections, as described in the following table:

/dev/fd/[fd]
If [fd] is a valid integer, file descriptor [fd]
is duplicated.

But in fact this is only true when HAVE_DEV_FD is undefined (see
redir.c lines 522-524).

Of course, when HAVE_DEV_FD is defined and the /dev/fd mechanism
for opening files is available, the idea is _similar_ to fd
duplication, but in fact there are important differences in the
semantics:


These semantic differences do exist with the Linux implementation of
/dev/fd, but not with other implementations.  The BSD and Mac OS X
versions, for instance, behave more or less as if they duplicate the
file descriptor.  For instance,

https://github.com/freebsd/freebsd/blob/master/sys/fs/fdescfs/fdesc_vnops.c;
in particular fdesc_open(), which sets things up for dupfdopen(),
which duplicates a file descriptor.

I suppose it's worth saying something about how bash emulates the
files on systems that don't provide the feature themselves.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/





Re: Examples of concurrent coproc usage?

2024-04-09 Thread Carl Edquist
fdtee () {
  local line fd
  while read -r line; do
    for fd; do
      printf '%s\n' "$line" >&$fd
    done
  done
}

coproc UP   { stdbuf -oL tr a-z A-Z; }
coproc DOWN { stdbuf -oL tr A-Z a-z; }

# make user-managed backup copies of coproc fds
exec {up_r}<&${UP[0]} {up_w}>&${UP[1]}
exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]}

stdout=1
coproc THREEWAY { fdtee $stdout $up_w $down_w; }

# save these too, for safe keeping
exec {tee_r}<&${THREEWAY[0]} {tee_w}>&${THREEWAY[1]}


Then:  (actually works)

$ echo 'Greetings!' >&$tee_w
$ read -u $tee_r  plain
$ read -u $up_r   upped
$ read -u $down_r downed
$ echo "[$plain] [$upped] [$downed]"
[Greetings!] [GREETINGS!] [greetings!]


This is a pretty trivial example just to demonstrate the concept.  But 
once you have the freedom to play with it, you find more interesting, 
useful applications.


Of course, for the above technique to be generally useful, external 
commands need access to these user-managed fds (copied with exec).  (I 
have no idea why the coproc shell closes them.)  The shell is crippled 
when limited to builtins.


(I'll try to tidy up some working examples with my coprocess management 
library this week, for the curious.)



Juicy thread hey?  I can hardly keep up!  :)

Carl



Re: Examples of concurrent coproc usage?

2024-04-15 Thread Carl Edquist

On Mon, 15 Apr 2024, Zachary Santer wrote:

As much as you can just printf the same thing once for each fd, that 
doesn't work super well for binary data.


Well, you _can_ shovel binary data too: (*)

while IFS= read -rd '' X; do printf '%s\0' "$X"; done

and use that pattern to make a shell-only version of tee(1) (and I suppose 
paste(1)).  Binary data doesn't work if you're reading newline-terminated 
records, because you cannot store the NUL character in a shell variable. 
But you can delimit your records on NULs, and use printf to reproduce 
them.
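
(A minimal sketch of such a shell-only tee, copying NUL-delimited records to
stdout and to each fd passed as an argument; it ignores, for the moment, the
trailing-record detail discussed further down the thread:)

shtee () {
    local X fd
    while IFS= read -rd '' X; do
        printf '%s\0' "$X"                            # copy record to stdout
        for fd; do printf '%s\0' "$X" >&"$fd"; done   # and to each extra fd
    done
}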


But the shell is pretty slow when you ask it to shovel data around like 
this.  The 'read' builtin, for instance, cautiously does read(2) calls of 
a single byte at a time.  And printf will write(2) each null-terminated 
record unbuffered.  Meanwhile cat(1) and tee(1) (which read and write 
larger blocks) are much faster.


But as I said before, the main work of the shell is making it easy to set 
up pipelines for other (very fast) programs to pass their data around.



[(*) It's slow compared to cat(1) or cp(1), but it does work.  I had to 
resort to this once in a botched glibc system upgrade that had wrecked my 
libc symlinks.  A broken libc symlink meant none of the external commands 
worked at all, so I had to fix everything with only the builtin commands 
of my existing live root shell.  I was able to copy all the (binary) .so 
files to another directory (/dev/shm) using the above read/printf loop, 
with the names fixed, then export LD_LIBRARY_PATH to point to the new 
copies.  That then made it possible to run external commands again (ln(1), 
in particular) to actually fix the symlinks.]



I've thought about splitting and recombining pipelines like this, but 
I've never had a reason to.


"If you build, it they will come."  ;)


Carl



Re: Examples of concurrent coproc usage?

2024-04-16 Thread Carl Edquist

On Tue, 16 Apr 2024, Andreas Schwab wrote:

But you can delimit your records on NULs, and use printf to reproduce 
them.


Though that will likely add a spurious null at EOF.



On Tue, 16 Apr 2024, Zachary Santer wrote:


Just wouldn't copy over whatever might have followed the final null
byte, if we're not talking about null-terminated data.


You guys are right.  Sorry for glossing over that detail.

Yes if the file does not end in a NUL byte, the last dangling record still 
needs to be printed. You can handle it either way with, for example:


while IFS= read -rd '' X; do printf '%s\0' "$X"; X=; done
[[ $X ]] && printf '%s' "$X"


Might've gotten lucky with all those .so files ending in a null byte for 
whatever reason.


Yes that is exactly what happened :)

Luckily, on Linux anyway, .so files and ELF binaries always seem to end in 
a null byte.



There's no way to force this to give you the equivalent of sized 
buffers.  'read -N' obviously has the same problem of trying to store 
the null character in a variable. So, if you're trying to run this on a 
huge text file, you're going to end up trying to shove that entire file 
into a variable.


Right, that is another reason why it's really not a great solution.

Although you can limit the buffer size with, say, 'read -n 4096', and with 
a bit more handling[1] still get a perfect copy.  But that's not my point.


My point is, it's one thing to use it in an emergency, but I don't 
consider it a real usable replacement for cat/tee/paste in general use.


Shoveling data around should really be done by an appropriate external 
program.  So in my multi-coproc example, the shell is really crippled if 
the close-on-exec flags prevent external programs from accessing manual 
copies of other coproc fds.



Carl



[1] eg:

emergency_maxbuf_cat_monster () (
    maxbuf=${1:-4096}
    fmts=('%s' '%s\0')
    while IFS= read -rd '' -n $maxbuf X; do
        printf "${fmts[${#X} < maxbuf]}" "$X"
        X=
    done
    [[ ! $X ]] || printf '%s' "$X"
)




Re: Examples of concurrent coproc usage?

2024-04-20 Thread Carl Edquist

On Wed, 17 Apr 2024, Chet Ramey wrote:


On 4/16/24 2:46 AM, Carl Edquist wrote:

But the shell is pretty slow when you ask it to shovel data around like 
this.  The 'read' builtin, for instance, cautiously does read(2) calls 
of a single byte at a time.


It has to do it that way to find the delimiter on a non-seekable file 
descriptor, since it has to leave everything it didn't consume available 
on stdin.


Understood, and I wouldn't have it any other way.  It's entirely 
appropriate for reading relatively small amounts of data into shell 
variables.  I'm just saying for copying or filtering a substantial amount 
of data, it's 1000x better to use a suitable external command instead.


It makes me cringe a bit and sigh when I see people put something like

while read X; do echo "$X"; done

in a script, because they somehow imagine it to be more efficient than 
simply running "cat".


Carl




Re: Examples of concurrent coproc usage?

2024-04-20 Thread Carl Edquist

On Wed, 17 Apr 2024, Chet Ramey wrote:


On 4/15/24 1:01 PM, Carl Edquist wrote:


 Yet another point brought to light by the bcalc example relates to the
 coproc pid variable.  The reset() function first closes the coproc
 pipe fds, then sleeps for a second to give the BC coproc some time to
 finish.

 An alternative might be to 'wait' for the coproc to finish (likely
 faster than sleeping for a second).


If the coproc has some problem and doesn't exit immediately, `wait' 
without options will hang. That's why I opted for the 
sleep/kill-as-insurance combo.


Yes that much was clear from the script itself.

I didn't mean any of that as a critique of the bcalc script.  I just meant 
it brought to light the point that the coproc pid variable is another thing 
that, under the current deallocate-on-terminate behavior, needs to be copied 
before it can be used reliably (with the 'kill' or 'wait' builtins).
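
(Concretely, the kind of copying I mean, as a rough sketch with BC as the
coproc name and assuming bc(1) is installed; whether the wait succeeds still
depends on the reaping race discussed in this thread:)

coproc BC { bc -l; }
bc_pid=$BC_PID                               # save the pid while $BC_PID still exists
exec {bc_r}<&"${BC[0]}" {bc_w}>&"${BC[1]}"   # and save copies of the fds

echo '2^100' >&"$bc_w"
read -u "$bc_r" answer

exec {bc_w}>&- {bc_r}<&-                     # close our ends; bc exits on EOF
wait "$bc_pid" 2>/dev/null                   # uses the saved copy, not $BC_PID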


Though I do suspect that the most common case with coprocs is that closing 
the shell's read and write fds to the coproc is enough to cause the coproc 
to finish promptly - as neither read attempts on its stdin nor write 
attempts on its stdout can block anymore.


I think this is _definitely_ true for the BC coproc in the bcalc example. 
But it's kind of a distraction to get hung up on that detail, because in 
the general case there may very well be other scenarios where it would be 
appropriate to, um, _nudge_ the coproc a bit with the kill command.



(And before you ask why I didn't use `wait -n', I wrote bcalc in 30 
minutes after someone asked me a question about doing floating point 
math with awk in a shell script, and it worked.)


It's fine!  It's just an example, after all  :)


Carl



Re: Examples of concurrent coproc usage?

2024-04-20 Thread Carl Edquist

On Wed, 17 Apr 2024, Chet Ramey wrote:

Yes, I agree that coprocs should survive being suspended. The most 
recent devel branch push has code to prevent the coproc being reaped if 
it's stopped and not terminated.


Oh, nice!  :)


Carl



Re: Examples of concurrent coproc usage?

2024-04-20 Thread Carl Edquist

On Thu, 18 Apr 2024, Martin D Kealey wrote:


On Wed, 17 Apr 2024, Chet Ramey wrote:

It has to do it that way to find the delimiter on a non-seekable file 
descriptor, since it has to leave everything it didn't consume 
available on stdin.


Has anyone tried asking any of the kernel teams (Linux, BSD, or other) 
to add a new system call such as readln() or readd()?


You mean, specifically in order to implement a slightly-more-efficient 
'read' builtin in the shell?



I envisage this working like stty cooked mode works on a tty, except it 
would also work on files, pipes, and sockets: you'd get back *at most* 
as many bytes as you ask for, but you may get fewer if a delimiter is 
found. The delimiter is consumed (and returned in the buffer), but 
everything following a delimiter is left available for a subsequent 
read.


One downside is you'd end up with a system call for each token, which is 
only a little bit better than the 'read' builtin read(2)'ing 1 byte at a 
time.  If your program / shell script is going to be processing a long 
stream of tokens, it's just going to be more efficient to read(2) a block 
at a time and do the tokenizing in userspace.  And with any luck you can 
find an appropriate command line utility to do that for you, rather than 
relying on the shell's 'read' builtin.
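
(For instance, a trivial token-counting task done both ways; awk here just
stands in for whatever block-reading utility fits the job, and input.txt is
a hypothetical file:)

# byte-at-a-time: the 'read' builtin does one read(2) per byte
while read -r line; do
    set -- $line
    echo "$#"
done < input.txt

# block reads plus userspace tokenizing; typically orders of magnitude faster
awk '{ print NF }' input.txt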


(Or for your own programs, use getline(3)/getdelim(3), as Chet mentioned.)

Carl



Re: Examples of concurrent coproc usage?

2024-04-20 Thread Carl Edquist
[This is a common model for using coprocs, by the way, where an auxiliary 
coprocess is left open for the lifetime of the shell session 
and never explicitly closed.  When the shell session exits, the fds are 
closed implicitly by the OS, and the coprocess sees EOF and exits on 
its own.]


That's one common model, yes. Another is that the shell process 
explicitly sends a close or shutdown command to the coproc, so 
termination is expected.


Right, but here also (after sending a quit command) the conclusion is the 
same as my point just below - that if the user is expecting the coproc to 
terminate, and expecting the current behavior that as a result the coproc 
variable will go away automatically, then that variable is as good as 
forgotten to the user.



If a user expects the coproc variable to go away automatically, that 
user won't be accessing a still-open fd from that variable for 
anything.


I'm more concerned about a pipe with unread data that would potentially 
cause problems. I suppose we just need more testing.


If I understand you right, you are talking about a scenario like 
this:


- a coproc writes to its output pipe
- the coproc terminates
- the shell leaves its fd for the read end of this pipe open
- there is unread data left sitting in this pipe
- [theoretical concern here]

Is that right?

I can't imagine this possibly leading to deadlock.  Either (1) the user 
has forgotten about this pipe, and never attempts to read from it, or (2) 
the user attempts to read from this pipe, returning some or all of the 
data, and possibly hitting EOF, but in any case DOES NOT BLOCK.


(I'm sorry if this is basically restating what I've already said earlier.)


That's more of a "my arm hurts when I do this" situation. If a script 
opened 500 fds using exec redirection, resource exhaustion would be 
their own responsibility.


Ha, good!

[I had a small fear that fd exhaustion might have been your actual 
concern.]



Meanwhile, the bash man page does not specify the shell's behavior for 
when a coproc terminates, so you might say there's room for 
interpretation and the new deferring behavior would not break any 
promises.


I could always enable it in the devel branch and see what happens with 
the folks who use that. It would be three years after any release when 
distros would put it into production anyway.


Oh, fun  :)


But since you mention it, writing to a broken pipe is still 
semantically meaningful also.  (I would even say valid.)  In the 
typical case it's expected behavior for a process to get killed when it 
attempts this and shell pipeline programming is designed with this in 
mind.


You'd be surprised at how often I get requests to put in an internal 
SIGPIPE handler to avoid problems/shell termination with builtins 
writing to closed pipes.


Ah, well, I get it though.  It _is_ a bit jarring to see your shell get 
blown away with something like this -


$ exec 9> >(typo)
$ ...
$ echo >&9  # Boom!


So it does not surprise me that you have some users puzzling over it.

But FWIW I do think it is the most consistent & correct behavior.

Plus, of course, the user can install their own shell handler code for 
that case, or downgrade the effect to a non-fatal error with


$ trap '' SIGPIPE


So even for write attempts, you introduce uncertain behavior by 
automatically closing the fds, when the normal, predictable, valid 
thing would be to die by SIGPIPE.


Again, you might be surprised at how many people view that as a bug in 
the shell.


I'm not terribly surprised, since at first (before reasoning about it) the 
behavior is admittedly alarming.  ("What happened to my terminal?!?!")


But I'd argue the alternative is worse, because then it's an unpredictable 
race between SIGPIPE (which they're complaining about) and EBADF.



I think we're talking about our different interpretations of `invalid' 
(EBADF as opposed to EPIPE/SIGPIPE).


Right - just explaining; I think by now we are on the same page.


My original intention for the coprocs (and Korn's from whence they came) 
was that the shell would be in the middle -- it's another way for the 
shell to do IPC.


And coprocesses are great for this, too!

It's just that external commands in a sense are extensions of the shell. 
The arms and legs, you might say, for doing the heavy lifting.



Carl



Re: Examples of concurrent coproc usage?

2024-04-27 Thread Carl Edquist



On Mon, 22 Apr 2024, Chet Ramey wrote:

You might be surprised. The OP was sending thousands of calculations to 
(I think) GNU bc, which had some resource consumption issue that 
resulted in it eventually hanging, unresponsive. The kill was the 
solution there. I imagine there are similar scenarios with other tools.


Ok, you got me!  I take it back.

I hadn't considered bc operations being cpu/memory intensive.  But that 
possibility makes sense - given that it's arbitrary precision I guess you 
can ask for a number to the billionth power and never see the end of it  :)


Carl



bash 3.2.0 broken ???

2008-05-21 Thread Carl-Valentin Schmitt
Hello folkz ?!

Please check whether bash 3.2.0 is broken: as root, after TAB-TAB only 2178 
completion possibilities are shown.

Shouldn't there be about 3,300 possibilities?

Greetings.
Carl.


regex in bash expression doesn't work

2012-06-25 Thread Niels Carl W. Hansen
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' 
-DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib  -D_GNU_SOURCE -DRECYCLES_PIDS  
-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic -fwrapv
uname output: Linux in 2.6.32-220.17.1.el6.x86_64 #1 SMP Wed May 16 00:01:37 
BST 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-redhat-linux-gnu

Bash Version: 4.1
Patch Level: 2
Release Status: release

Description:
[[ 'foobar'  =~ 'o.a' ]] && echo FOUND
should print  FOUND  but it doesn't.
This has worked in previous versions of bash.
The bash used comes with CentOS 6.2

Repeat-By:
No previous commands necessary to generate this problem.
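
[Note for the archives: since bash 3.2, any quoted portion of the right-hand
side of =~ is matched as a literal string rather than as a regular
expression, so the quoted 'o.a' above only matches the literal characters
o.a.  Removing the quotes restores regex matching:]

[[ foobar =~ o.a ]] && echo FOUND    # prints FOUND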

Fix:




Re: Examples of concurrent coproc usage?

2024-04-02 Thread Carl Edquist via Bug reports for the GNU Bourne Again SHell

On Mon, 1 Apr 2024, Chet Ramey wrote:


On 4/1/24 3:24 PM, Chet Ramey wrote:

 On 3/14/24 5:58 AM, Carl Edquist wrote:


 Well, *without multi-coproc support*, here's a simple wc example; first
 with a single coproc:

  $ coproc WC { wc; }
  $ exec {WC[1]}>&-
  $ read -u ${WC[0]} X
  $ echo $X
  0 0 0

 This works as expected.

 But if you try it with a second coproc (again, without multi-coproc
 support), the second coproc will inherit copies of the shell's read and
 write pipe fds to the first coproc, and the read will hang (as described
 above), as the first coproc doesn't see EOF:

  $ coproc WC { wc; }
  $ coproc CAT { cat; }
  $ exec {WC[1]}>&-
  $ read -u ${WC[0]} X

  # HANGS


 But, this can be observed even before attempting the read that hangs.


 Let's see if we can tackle these one at a time. This seems like it
 would be pretty easy to fix if a coproc closed the fds corresponding
 to an existing coproc in the child after the fork. That wouldn't
 really change anything regarding how scripts have to manually manage
 multiple coprocs, but it will prevent the shell from hanging.



I sent this before I was ready. This would be equivalent to changing the 
commands to use


coproc CAT { exec {WC[0]}<&- {WC[1]}>&- ; cat; }

but the script writer wouldn't have to manage it.



Agreed.

And just to note two things (in case it wasn't clear) - (1) the above 
example that hangs is with the default bash, compiled _without_ 
multi-coproc support; and (2):


This seems like it would be pretty easy to fix if a coproc closed the 
fds corresponding to an existing coproc in the child after the fork


the forked coproc has to close its fds to/from _all_ other existing 
coprocs (as there can be several).



Carl


Re: Examples of concurrent coproc usage?

2024-04-12 Thread Carl Edquist via Bug reports for the GNU Bourne Again SHell

On Mon, 8 Apr 2024, Chet Ramey wrote:


On 4/4/24 8:52 AM, Carl Edquist wrote:


 Zack illustrated basically the same point with his example:


 exec {fd}< <( some command )
 while IFS='' read -r line <&"${fd}"; do
   # do stuff
 done
 exec {fd}<&-


A process-substitution open to the shell like this is effectively a 
one-ended coproc (though not in the jobs list), and it behaves reliably 
here because the user can count on {fd} to remain open even after the 
child process terminates.


That exposes the fundamental difference. The procsub is essentially the 
same kind of object as a coproc, but it exposes the pipe endpoint(s) as 
filenames. The shell maintains open file descriptors to the child 
process whose input or output it exposes as a FIFO or a file in /dev/fd, 
since you have to have a reader and a writer. The shell closes the file 
descriptor and, if necessary, removes the FIFO when the command for 
which that was one of the word expansions (or a redirection) completes. 
coprocs are designed to be longer-lived, and not associated with a 
particular command or redirection.


But the important piece is that $fd is not the file descriptor the shell 
keeps open to the procsub -- it's a new file descriptor, dup'd from the 
original by the redirection. Since it was used with `exec', it persists 
until the script explicitly closes it. It doesn't matter when the shell 
reaps the procsub and closes the file descriptor(s) -- the copy in $fd 
remains until the script explicitly closes it. You might get read 
returning failure at some point, but the shell won't close $fd for you.


Since procsubs expand to filenames, even opening them is sufficient to 
give you a new file descriptor (with the usual caveats about how 
different OSs handle the /dev/fd device).


You can do this yourself with coprocs right now, with no changes to the 
shell.




 So, the user can determine when the coproc fds are no longer needed,
 whether that's when EOF is hit trying to read from the coproc, or
 whatever other condition.


Duplicating the file descriptor will do that for you.


Thanks for the explanation, that all makes sense.

One technical difference in my mind is that doing this with a procsub is 
reliably safe:


exec {fd}< <( some command )

since the expanded pathname (/dev/fd/N or the fifo alternative) will stay 
around for the duration of the exec command, so there is no concern about 
whether or not the dup redirection will succeed.


Where with a coproc

coproc X { potentially short lived command with output; }
exec {xr}<&${X[0]} {xw}>&${X[1]}

there is technically the possibility that the coproc can finish and be 
reaped before the exec command gets a chance to run and duplicate the fds.


But, I also get what you said, that your design intent with coprocs was 
for them to be longer-lived, so immediate termination was not a concern.



Personally I like the idea of 'closing' a coproc explicitly, but if 
it's a bother to add options to the coproc keyword, then I would say 
just let the user be responsible for closing the fds.  Once the coproc 
has terminated _and_ the coproc's fds are closed, then the coproc can 
be deallocated.


This is not backwards compatible. coprocs may be a little-used feature, 
but you're adding a burden on the shell programmer that wasn't there 
previously.


Ok, so, I'm trying to imagine a case where this would cause any problems 
or extra work for such an existing user.  Maybe you can provide an example 
from your own uses?  (Where it would cause trouble or require adding code 
if the coproc deallocation were deferred until the fds are closed 
explicitly.)


My first thought is that in the general case, the user doesn't really need 
to worry much about closing the fds for a terminated coproc anyway, as 
they will all be closed implicitly when the shell exits (either an 
interactive session or a script).


[This is a common model for using coprocs, by the way, where an auxiliary 
coprocess is left open for the lifetime of the shell session and never 
explicitly closed.  When the shell session exits, the fds are closed 
implicitly by the OS, and the coprocess sees EOF and exits on its own.]


If a user expects the coproc variable to go away automatically, that user 
won't be accessing a still-open fd from that variable for anything.


As for the forgotten-about half-closed pipe fds to the reaped coproc, I 
don't see how they could lead to deadlock, nor do I see how a shell 
programmer expecting the existing behavior would even attempt to access 
them at all, apart from programming error.


The only potential issue I can imagine is if a script (or a user at an 
interactive prompt) would start _so_ many of these longer-lived coprocs 
(more than 500??), one at a time in succession, in a single shell session, 
that

Re: Examples of concurrent coproc usage?

2024-04-15 Thread Carl Edquist via Bug reports for the GNU Bourne Again SHell
ds open across exec; though for now I will probably 
continue setting up pipes in the shell by methods other than the coproc 
keyword.



Cheers,

Carl


Re: Examples of concurrent coproc usage?

2024-04-21 Thread Carl Edquist via Bug reports for the GNU Bourne Again SHell

On Mon, 22 Apr 2024, Martin D Kealey wrote:


On Sun, 21 Apr 2024, 10:13 Carl Edquist,  wrote:

You mean, specifically in order to implement a slightly-more-efficient 
'read' builtin in the shell?


The read built-in in the shell is only one case that would benefit from 
such a syscall.


The purpose would be to allow multiple processes to read in turn from a 
consumable (or otherwise non seekable) input stream. In this context 
doing a large block read() is exactly what we DON'T want to do, so we 
also can't use a library function such as getline() that is built on top 
of such a read().


By way of example, another use would be the "head" utility, which by 
using such a syscall could consume only the bytes it outputs, leaving 
all other bytes still in the input stream. This would be an improvement 
over the current situation.


Basically any time you have cooperating processes reading delimited 
input, this would be an improvement.


Makes sense!



I envisage this working like stty cooked mode works on a tty,


…


One downside is you'd end up with a system call for each token



That's not how stty cooked mode normally works.

The typical use case is line-at-a-time, so this would reduce the number 
of system calls by about 90% on a typical text input stream, more if 
there are few or no blank lines.


However I would not hard code "newline" into the kernel, but rather 
allow the user code to nominate a list of delimiters.


When I say "token" I just mean a record with whatever delimiter you're 
using.  Assuming the reading stops after consuming the first delimiter 
(which is necessary for the 'read' builtin), then you end up with one 
system call per line or record or token or whatever you want to call it.



A line at a time is an improvement over a byte at a time, but if/when you 
can find a way to do what you want with the default block buffering & 
userspace tokenizing, you'll reduce the number of system calls by 99.99%. 
(Which is one reason I was saying the shell is crippled when limited to 
builtins; eg, a read/printf loop compared to simply running cat.)


...

But yeah currently a pipe with a series of records and multiple 
cooperating/competing readers perhaps only works if the records have a 
fixed size.  A new readd[elim] system call like you're talking about would 
allow safely reading a single variable-length record at a time.


So by all means, feel free to take the idea to your friendly neighborhood 
kernel community, if you'd like to pursue it  ;)



Carl


Re: Examples of concurrent coproc usage?

2024-04-27 Thread Carl Edquist via Bug reports for the GNU Bourne Again SHell

On Mon, 22 Apr 2024, Martin D Kealey wrote:


On Mon, 22 Apr 2024, 09:17 Carl Edquist,  wrote:

[…] I was saying the shell is crippled when limited to builtins; eg, a 
read/printf loop compared to simply running cat.


I would hope that mapfile/readarray could do better, since it's not 
obligated to leave anything in the input stream.


That is an interesting thought, although mapfile seems to read a byte at a 
time also.


[I'm not suggesting this should be changed though.  And frankly I have no 
strong desire for a faster 'read' builtin in the shell, either.  A byte at 
a time is relatively slow, but that's generally fine in most limited 
contexts where it's actually needed.]



But yeah currently a pipe with a series of records and multiple 
cooperating/competing readers perhaps only works if the records have a 
fixed size. A new readd[elim] system call like you're talking about 
would allow safely reading a single variable-length record at a time.


There are other options, such as length-prefixed records, or tagged 
(typed) records, but of course those aren't POSIX text files.


That'd work for "cooperating" readers (as you put it) where they are not 
attempting to read at the same time.  Though reading a single byte at a 
time also works in that case.


My thought is that (performance considerations aside), the real functional 
improvement with a new "readd" call would be with _competing_ readers 
(more than one read call waiting on the same pipe at the same time).


In that case a length-prefixed or type-tagged record wouldn't seem to work 
with the regular read(2), because a single reader would not be able to 
read the length/type _and_ the corresponding record together.  You can't 
work around this by reading a byte at a time either.  That's why I said it 
would only seem to work (with read(2)) if the records have a fixed size. 
(In order to grab a whole record atomically.)


But a new "readd" call would allow multiple competing readers to read, 
say, a stream of filenames from a pipe, without having to pad each one to 
PATH_MAX bytes.


It seems that if there is only one reader at a given time though 
("cooperating"), then it's just a matter of performance between 
read(2)'ing one byte at a time vs using a new readd call.


...

I'm not trying to advocate for or against you contacting the kernel folks 
with your idea; it just seems to me that the scenario with multiple 
competing readers might be the strongest argument for it.



Carl