Re: "here strings" and tmpfiles

Robert Elz Mon, 08 Apr 2019 22:54:13 -0700

    Date:        Mon, 8 Apr 2019 22:03:25 -0700
    From:        pepa65 <solu...@passchier.net>
    Message-ID:  <cf504d03-3531-316b-c91f-6c58c05c5...@passchier.net>


  | What is nice about here docs/strings is that there are no subshells
  | involved.

When they use files, that's correct, but when a pipe is used, unless
the data size is both known in advance, and limited, a sub-shell is
needed just to write the here doc text through the pipe, so it is
able to stall when needed without affecting anything else.

That's why Chet suggested the possibility of using a pipe for small
here docs, and a file for big ones (though to me, a pipe for all makes
more sense).  The difference between small and large is whether the
shell simply writes into the kernel pipe buffer, and then exec's the
process to read it (or reads itself for a builtin), or whether it
forks a subshell whose job is merely to feed the here doc data
into the pipe (allowing stalls when the pipe buffer fills).
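
The stall in the "large" case can be seen with an ordinary pipeline. This
is only a sketch, not how any shell implements here docs: it assumes that
1 MiB is more than the kernel pipe buffer holds (true of the common
defaults, but system dependent), and the scratch file "$log" exists purely
to record the order of events.

```shell
#!/bin/sh
# Scratch file recording the order of events (illustration only).
log=$(mktemp) || exit 1

{
    # 1 MiB of data: assumed to exceed the kernel pipe buffer, so dd
    # blocks in write() until something reads from the pipe.
    dd if=/dev/zero bs=1024 count=1024 2>/dev/null
    echo 'writer finished' >>"$log"
} | {
    sleep 2
    echo 'reader starting' >>"$log"
    cat >/dev/null
}

order=$(cat "$log")
rm -f "$log"
printf '%s\n' "$order"
```

Under that assumption "reader starting" is logged before "writer finished":
the writer cannot complete until the reader drains the pipe, which is why
something other than the main shell has to do the writing.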

[Aside: when the destination is a builtin, another strategy is to
 simply write the here doc into mem, and have the builtin read directly
 from the mem buffer - no actual I/O of any kind involved at all.]

Using a sub-shell also helps with the POSIX semantics which require
that the redirects be evaluated in a sub-shell context (generally that
of the command they're being used with).   There are ways to fake that,
but the simple way is much easier, and more reliable.

  | I think Linda's main drive is to seek improvement in how bash works.

That's a fine objective - but remember that the shell's primary purpose
is to run other commands (interactively or via a script) - the real
work should normally be done by the other commands, not by the shell.
So, optimising how commands get located, exec'd, ... is all a great idea
(avoiding forking wherever possible) but spending lots of time and
adding all kinds of trash to allow complete programs to just be written
as sh script is probably the wrong approach - there are much better
programming languages around for general purpose programming than sh.
(Which to use depends upon the nature of the program.)

Also, in my earlier e-mail I gave this example:

  |     printf %s\\n hello | { sleep 5; cat; } 

I realised after that a better test case is ...

   { date +1:%c; date +2:%c;} | { sleep 20; date +3:%c; cat;}; date +4:%c

When I ran it (using bash, though that shouldn't matter), the output was:

3:Tue Apr  9 11:54:20 2019
1:Tue Apr  9 11:54:00 2019
2:Tue Apr  9 11:54:00 2019
4:Tue Apr  9 11:54:20 2019

(and yes, I contrived to start the sequence just as the time reached the
start of a minute, so that is real output, but required me to hit "return"
at just the right time to generate it...)

In that output, the '3' line (the first) shows the time just before cat
starts; the '1' and '2' lines are output from cat, read through the pipe.
The point of the second one (the '2' line) is to show that the first
date command did not stall waiting for the cat to start - it wrote
its output and exited, allowing the 2nd date command to run at the
same apparent time - 20 seconds before cat started reading anything.
That second date command finished just as quickly.   The write side
of the pipe would be closed after that, with the sleep still running.
Then come date '3', the cat which reads the output from date 1 & 2, and
a final date ('4') just to round things off.   Reading from a pipe
after the writer has finished works just fine...
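
For anyone who wants to repeat this without timing a keypress, here is a
sketch of the same experiment that records the order of events in a
scratch file instead of relying on timestamps. The file "$log" and the
variable names are made up for the illustration, and the 2 second sleep is
simply assumed to be long enough for the writer to finish first.

```shell
#!/bin/sh
# Scratch file recording the order of events (illustration only).
log=$(mktemp) || exit 1

received=$(
    {
        # Two short lines fit easily in the pipe buffer, so the
        # writer completes without waiting for any reader.
        echo 'line 1'
        echo 'line 2'
        echo 'writer finished' >>"$log"
    } | {
        sleep 2
        echo 'reader starting' >>"$log"
        cat
    }
)

order=$(cat "$log")
rm -f "$log"
printf '%s\n' "$order" "$received"
```

This should log "writer finished" before "reader starting", and cat still
delivers both lines: the data sat in the pipe after the write side closed.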

kre
