Re: wait inside subshell waits for sibling

Emanuele Torre Mon, 24 Oct 2022 10:46:17 -0700

On 24/10/2022, Robert Elz <k...@munnari.oz.au> wrote:
>   | The problem that was described is caused by that
>   | optimisation (not the version that is applied to simple commands that
>   | run external program, but to subshell compound command as mentioned).
>
> I understand that.
>
>   | In case you did not understand, the issue is not that "wait inside
>   | subshell waits for sibling", because the process that runs `cat' IS NOT
>   | a sibling of the process that runs `wait', it is a CHILD.
>
> I understand that, but you are missing the point.   This is where the
> "see below" that appeared earlier applies.   You're thinking of how it
> all exists at the unix process level.   And you're probably right (I don't
> know bash internals at all) that at that level the process substitution is
> a child of the shell environment that is running wait.


It does not make any sense that you keep insisting that this is about
bash internals, or low-level system calls (especially `wait', `fork',
etc.), and I keep having to repeat that this is only caused by

    (CMDS) REDIRS

being semantically equivalent, *in bash*, to

    (exec REDIRS; CMDS)
    # N.B. this is actually exactly the same change in behaviour applied
    #      to simple commands that run external programs, but with
    #      subshell compound commands it makes sense to say that it is
    #      equivalent to (exec REDIRS; CMDS) since subshell compound
    #      commands themselves only expand redirections, and not regular
    #      words, and temporary environment assignment words.

which is totally intended.
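
A minimal sketch of that equivalence, assuming a bash that behaves as
described in this thread. The temporary directory and the variable names
(`tmp`, `subshell_i`, `exec_i`, `group_i`) are only illustrative. It uses
a counter in the redirection word to show in which shell the word gets
expanded:

```bash
#!/usr/bin/env bash
# Sketch: in which shell is the redirection word of a subshell expanded?
tmp=$(mktemp -d) || exit 1

i=0
(:) > "$tmp/out$(( ++i ))"          # (CMDS) REDIRS
subshell_i=$i                       # value seen by the main shell afterwards

i=0
(exec > "$tmp/out$(( ++i ))"; :)    # (exec REDIRS; CMDS)
exec_i=$i

i=0
{ (:); } > "$tmp/out$(( ++i ))"     # redirection moved onto a brace group
group_i=$i

printf 'subshell=%s exec=%s group=%s\n' "$subshell_i" "$exec_i" "$group_i"
rm -rf -- "$tmp"
```

If the first two forms really are equivalent, the main shell should see
the same value of `i` after both of them (the increment is lost with the
subshell), and only the brace group form should leave `i` incremented.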

>   | > That is an entirely different issue, and is working as is supposed
>   | > to work.   That it isn't what some people might expect is irrelevant.
>   |
>   | What do you mean?
>
> The ineffectiveness of var modifications in redirects is a direct
> result of how they work in general, they've always been (when a
> fork is required) evaluated in the subshell (though it is only barely
> that, just a forked process between the fork and the exec).

That is a modification of how the code works.

> But that's not the way we (or the shell) should be looking at this.  At
> shell script level, the command substitution is not a child of the
> subshell,
> and a wait command in the subshell environment should only see processes
> that were created inside that environment, at the shell script level.
>
> Any optimisation should be exactly that - retain the precise effects of
> the original code, but run faster (or smaller, or whatever).

I don't get why you accept that:

  'externalcmd' 'a' 'b' > "$(( ++i ))"

is not equivalent to:

  { 'externalcmd' 'a' 'b' ;} > "$(( ++i ))"

but can't accept that, *in bash*, this exact same change in behaviour
is also applied to subshell compound commands.

This behaviour probably originated in ksh (irrelevant note: and was
probably implemented because applying redirection after fork() is easier
and safer, than having to dup() the original file descriptors, applying
the redirections, and finally having to restore the original file
descriptors with dup2() and close(), and the final behaviour is
equivalent "in most cases"), and bash copied it.

But note that that behaviour is not even implemented by every sh; for
example, dash doesn't have it, and always runs redirections in the main
shell and then reverts them.

In bash, this same behaviour is also applied to subshell compound
commands, which makes sense (if you think that the behaviour for simple
commands that run external programs makes sense, which I don't
necessarily do personally), since they also have a "mandatory" fork()
that can be exploited to prevent the work needed to restore file
descriptors at the end, and is what inspired the simple command
behaviour in the first place.

> If an optimisation alters the way the code works, then it is broken, and
> needs to be fixed, or removed.
>
> That's as true of a shell as it is of a compiler, or anything else.

Perhaps you are misled by the use of the word "optimisation" into
thinking that the shell is making an attempt at making the code not
change behaviour. It is not making any attempt at that; it is
emulating an old (probably) ksh behaviour that allows it to do less work
while achieving a similar behaviour most of the time.

    'external' > "$(( ++i ))"
is not equivalent to
    { 'external' ;} > "$(( ++i ))"

nor is
    'external' > "$(printf %s\\n "$PPID")"
equivalent to
    { 'external' ;} > "$(printf %s\\n "$PPID")"

or
    (builtin echo hi) > "$BASHPID"
to
    { (builtin echo hi) ;} > "$BASHPID"
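
A small sketch of the last pair, assuming bash (for `$BASHPID`). It
writes the pid seen by the subshell into a file named after the pid seen
by whichever shell expands the redirection. The temporary directory and
the variable names are made up for the illustration:

```bash
#!/usr/bin/env bash
# Sketch: compare the pid in the file name with the pid in the file content.
tmp=$(mktemp -d) || exit 1

# Subshell with a trailing redirection.
(builtin echo "$BASHPID") > "$tmp/$BASHPID"
set -- "$tmp"/*
subshell_name=${1##*/}
subshell_body=$(cat "$1")
rm -f -- "$1"

# The same subshell wrapped in a brace group that carries the redirection.
{ (builtin echo "$BASHPID") ;} > "$tmp/$BASHPID"
set -- "$tmp"/*
group_name=${1##*/}
group_body=$(cat "$1")

printf 'subshell: name=%s body=%s\n' "$subshell_name" "$subshell_body"
printf 'group:    name=%s body=%s\n' "$group_name" "$group_body"
rm -rf -- "$tmp"
```

With the behaviour described above, name and body should match in the
first case (both come from the forked subshell) and differ in the second
(the name comes from the main shell).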

>   | bash will always evaluate  (CMDS) REDIRS  as  (exec REDIRS; CMDS).
>
> That's just fine, but it needs to avoid having anything in REDIRS
> affect the execution environment of the subshell.

I can agree that I would also like it if bash expanded everything in the
main shell, and only applied the redirection of the expanded values in
the subshell.
That way there would be no change in behaviour, but retain the easier
handling of redirections. I am not sure why it is not implemented that
way, but probably simply because that is not how ksh does it. I would
not be opposed to this behaviour changing to that, since I do not like
that simple command behaviour to begin with, but I do not care much.
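
To be clear about what I mean, this is roughly how one can emulate that
by hand today. It is only a sketch, not something bash does for you: the
`target` variable and the temporary directory are made up for the
example, and it does not cover process substitutions, which cannot be
pre-expanded into a variable like this.

```bash
#!/usr/bin/env bash
# Sketch: expand the redirection word in the main shell, then apply the
# already-expanded value to the subshell.
tmp=$(mktemp -d) || exit 1

i=0
target="$tmp/out$(( ++i ))"         # expansion (and its side effect) happens here
(echo "written by a subshell") > "$target"   # only the open happens after the fork

content=$(cat "$target")
printf 'i=%s content=%s\n' "$i" "$content"
rm -rf -- "$tmp"
```

The side effect on `i` is kept by the main shell, while the file is
still opened only in the subshell, so nothing needs to be restored
afterwards.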

I do not agree at all, though, with your idea that variable
modifications *should* not be evaluated in the main shell, but
everything else should: all the expansions, and/or redirections,
regardless of what they are, should be evaluated by the same shell; if
they are not, it simply does not make sense.

>   | To show that this optimisation also affects subshell compound commands,
>
> Once again, examples showing variable modifications made in redirections
> are 100% irrelevant, and have no bearing on this.

I have used a variable modification in the example simply because it is
easier to show.

The change in behaviour is that all redirections (and expansions) are
evaluated after the fork, so I don't understand why you want to restrict
what is allowed to be affected by this change to just variable
modifications, and not processes spawned by > "$(cmd)" or < <(cmd), and,
perhaps, also not expansions that get the current pid like
e.g. > "$BASHPID".

How does that make any sense? And what justifies your reasoning, since
other shells that apply this change in behaviour (ksh) also run

    external < <(cmd1) > "$(cmd2)"

with `cmd1' and `cmd2' as child processes of the subshell that ends up
exec()-ing `external'.

Are you just saying that because you are convinced that this is an issue
with the `wait' builtin's handling of processes?

> Let me give you an example
>
>       bash -c 'sleep 3 & (wait)'
>
> how long should that run before you get the next prompt back?
> 3 seconds, or as quickly as the commands involved can be started?
>
> Note that there is absolutely no reason to actually fork to run the
> wait command, it is the final command, once wait is done, its subshell
> exits, and the script exits - that subshell can be (and in many shells
> is) optimised away.   I don't know about bash.
>
> But any shell that takes 3 seconds to run that script is broken, as
> (wait) as a shell command is identical to (:) as the subshell cannot
> possibly have any children, and so that wait cannot possibly have
> anything to wait upon.
>
> That the background sleep might have happened to be started in the
> same shell environment as the wait command is run, such that its
> process is a child of the shell running the wait command is irrelevant.
>
> wait(2) would wait for it, wait(1) must not.   That's what I meant
> in the previous message about them not being the same thing.

This is all irrelevant. There is absolutely no problem with the `wait'
builtin, the handling of jobs, or any underlying syscall that is
affecting the "different" behaviour of the command in question.

Your example does not show any redirection that would be affected by the
behaviour in question, and it implies that the "problem" is equivalent to
`(wait)' incorrectly waiting for `sleep 3', or for any process "That the
shell happens to have forked other processes for its own reasons", in
that example, which is absolutely not the case.

The only reason why

    (: & wait) > >(cat)

behaves like it does *in bash* is that, *in bash*, it is semantically
equivalent to:

    (
        exec > >(cat)
        : &
        wait
    )

and you are surprised because you did not know about that behaviour
(and, maybe, also that process substitution processes count as
background jobs).

Just like one would be surprised to see that

    i=0
    while [ "$i" -lt 10 ]; do
        sleep 10
        program get_currentstate > "/tmp/file$(( ++i ))"
    done

is an infinite loop, if they did not know of that simple command quirk.
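
Here is a bounded sketch of that loop, assuming bash. `date` stands in
for the external program, and the `guard` counter is only there so that
the demonstration terminates; neither is part of the original example.
The second loop moves the redirection onto a brace group, so that it is
evaluated by the main shell:

```bash
#!/usr/bin/env bash
# Sketch: the counter in the redirection of an external command is lost.
tmp=$(mktemp -d) || exit 1

i=0 guard=0
while [ "$i" -lt 3 ] && [ "$guard" -lt 5 ]; do
    guard=$(( guard + 1 ))
    date > "$tmp/file$(( ++i ))"        # ++i is evaluated after the fork
done
quirk_i=$i quirk_rounds=$guard

i=0 guard=0
while [ "$i" -lt 3 ] && [ "$guard" -lt 5 ]; do
    guard=$(( guard + 1 ))
    { date ;} > "$tmp/file$(( ++i ))"   # ++i is evaluated in the main shell
done
fixed_i=$i fixed_rounds=$guard

printf 'simple command: i=%s after %s rounds\n' "$quirk_i" "$quirk_rounds"
printf 'brace group:    i=%s after %s rounds\n' "$fixed_i" "$fixed_rounds"
rm -rf -- "$tmp"
```

In the first loop only the guard stops the iteration, since `i` never
changes in the main shell; in the second loop `i` reaches 3 on its own.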

>   | (CMDS) REDIRS  =>  (exec REDIRS; CMDS)  is arguably an incorrect
>   | optimisation (afaik, only bash performs it; ksh93 performs the simple
>   | command optimisation, but not this one),
>
> I would agree, but the optimisation could be fixed, it doesn't need
> to be removed.

>   | but that is what is causing the issue,
>
> I don't care what is causing the issue, that's not my problem, my
> only concern is with correct behaviour.

That is the correct, intended, behaviour. The fact that we both don't
like it is another question.

>   | This is not a problem with the `wait' builtin waiting for a sibling
>   | since `cat' is not its sibling;
>
> It is.   You cannot optimise away a relationship that exists.  Anything
> that's doing that is reinterpreting the code to mean something different
> than what was written, and consequently is broken, and should be fixed.

Again, that is the exact same for simple commands.

    bash-5.1$ { pstree -p -- "$$" ;} > >(cat)
    bash(1774906)-+-bash(1775048)---cat(1775050)
                  `-pstree(1775049)
    bash-5.1$ pstree -p -- "$$" > >(cat)
    bash(1774906)---pstree(1775092)---bash(1775093)---cat(1775094)

    bash-5.2$ { pstree -p -- "$$"; } > >(cat)
    bash(1811496)-+-cat(1811526)
                  `-pstree(1811527)
    bash-5.2$ pstree -p -- "$$" > >(cat)
    bash(1811496)---pstree(1811534)---cat(1811535)

And it is simply not correct to think that the subshell running `cat'
should be a sibling of the subshell running CMDS (in the command below)
if you know how that is supposed to work *in bash*.

    (CMDS) > >(cat)

If we want to discuss changing the behaviour of this "optimisation" to
make it an actual optimisation that doesn't change the behaviour of the
code (e.g. by making it evaluate all the expansions in the main shell,
and only the redirections of the expanded values in the subshell, as I
proposed earlier), let's please not pretend that bash is not working as
intended, and let's not say and repeat over and over nonsense like: that
there is an issue with the handling of background jobs, or system calls,
or with the `wait' builtin in general; that the "optimisation" works
just fine for simple commands, but not for subshell compound commands
(they are the exact same "optimisation", and they have the exact same
problems!); nonsensical justifications that variable modifications
should not be evaluated in the main shell, but everything else should;
etc.

emanuele6
