Thanks for that thoughtful response.

* I understand that the design decision is for variable file descriptors to stay open after per-command redirection
* I understand that implementation constraints make it impossible to do this uniformly (for external command redirection)
* I understand that it is difficult for the script author to detect which case his code will be in
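
To make the three cases above concrete, here is a minimal sketch of what I mean. It assumes bash 4.1 or later with the varredir_close shell option (added in bash 5.2) left unset; fd_is_open, wrapped_true and the mktemp scratch file are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: does a {var} redirection outlive the command it is attached to?
# Assumption: bash >= 4.1, varredir_close not enabled.

log=$(mktemp)                      # hypothetical scratch file

fd_is_open() {                     # hypothetical helper: is fd $1 open in THIS shell?
  [[ -n $1 ]] && { true >&"$1"; } 2>/dev/null
}

wrapped_true() { env true "$@"; }  # hypothetical function wrapping an external command

# Case 1: builtin. The redirection is performed by the shell itself.
unset x
builtin true {x}>"$log"
fd_is_open "$x" && builtin_case=open || builtin_case=closed
fd_is_open "$x" && exec {x}>&-     # tidy up before the next case

# Case 2: external command. The redirection (and the assignment to x)
# happens in the forked child, so the parent never sees either.
unset x
env true {x}>"$log"
fd_is_open "$x" && external_case=open || external_case=closed

# Case 3: the same external command, but reached through a function.
unset x
wrapped_true {x}>"$log"
fd_is_open "$x" && function_case=open || function_case=closed
fd_is_open "$x" && exec {x}>&-

printf 'builtin:  %s\nexternal: %s\nfunction: %s\n' \
  "$builtin_case" "$external_case" "$function_case"

rm -f "$log"
```

On such a shell I would expect the builtin and function cases to report "open" and the external case to report "closed", because in the external case both the redirection and the assignment to x take place in the child process.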

I'm trying to make bash better and more usable.

The shell normally does a great job of hiding the difference between internal and external commands, so even though the difference is very well documented, most of the time the user doesn't need to be aware of it. This is great for the user, and in keeping with the principle of least surprise.

The syntactic sugar of having bash select a free fd (which is necessary for good composability of operations in complex script pipelines) is a great benefit, especially when mixing with older pipelines that use fixed numeric fds.

You say that there are technical reasons why the syntactic sugar of also keeping the fd open can't be implemented uniformly. I wonder if this puts an unnecessary cognitive burden on the user, leading to reluctance to get the benefits, or to the introduction of latent bugs.

There is a case I explain below which can lead to a leaked fd being held on to by subsequently invoked external processes. Of course it will technically be the user's fault, but I'm looking at reducing the cognitive burdens that make such a fault ultimately inevitable.

The cognitive burdens of leaving the fd open are:

1. It breaks the normal expectation that per-command redirects are limited to the scope of the command.

A naked exec already works to hold open a variable fd in a wider scope if that's what the scripter actually wants:

    exec {fd}>... ;

2. As syntactic sugar it moves, not removes, the boiler-plate burden.

The naked exec (see above) saved by the syntactic sugar in the case where the fd should remain open is offset by the naked exec now required to close the fd in the traditional case where the fd should not be left open beyond the scope of the command.

3. The unmeetable cognitive burden is that in order to safely manage the previous two items, the user needs to know whether the command will be external, internal, or a function.
This makes it hard for the user to depend on this feature, because it is not possible to be sure at script-authoring time whether a command is external. It may have become a function (due to export -f, source, etc., which affect the execution environment).

4. The inevitable propagation of leaked fd's

The knowing user can remember to always use an identity wrapper function to force treatment of external commands as internal functions in order to get uniform behaviour, and also explicitly close the fd afterwards. (I hope this doesn't break exec optimisations or signal propagation over a different process tree topology, though I doubt it.)

But other users may not know to close the fd, which was never apparent (due to invoking an external command) but which becomes an fd leak when they combine it with other bash features (function wrapping of external commands, or an export -f environment that does this unawares). Those leaked fd's may then be inherited by other invoked external processes, which may hold on to them for some time.

This contrived example minimises the pipeline fd contortions in order to show that when what was an external command becomes an internal command, it can as a consequence result in an fd leak to external processes (bash+lsof+grep here) which may be long lived.

    stty {x}>/tmp/log
    bash -c 'lsof -p $$ | grep log ; :'

    stty() {
      command stty $EXTRA_STTY "$@"
    }

    stty {x}>/tmp/log
    bash -c 'lsof -p $$ | grep log ; :'

Leading to questions like: "Why does wrapping one command in a function cause a different background process to hang on to a private handle not even used there?"

The future:

I recognise what you say about past design decisions, but for the future, as it is hard to safely get the benefit of leaving the handle open for variable per-command redirects, even for users who know about it, I wonder if the syntactic sugar might be redefined to reduce the cognitive burden and widen the benefit of the most valued variable fd feature.

If the variable fd syntactic sugar were re-designed so that variable handles were also limited to the scope of the command, the same as for external commands, the same as for numeric handles, then:

* the behaviour would be uniform,
* the cognitive burden would be reduced,
* there would be no behaviour dependent on the runtime environment (export -f to wrap external commands),
* and there would be no risk of unexpected or hard-to-control fd leaks to subsequent (long-lived) external commands.

This would allow users to have the full and safe benefit of bash-selected fd's, which I am sure is what is intended.

I have done my best to be clear in a reasonable manner, but you are the man, it is your project, we stand or fall by your decisions, not mine.

Sam

On Wed, 24 Jul 2019 at 01:20, Chet Ramey <chet.ra...@case.edu> wrote:

> On 7/23/19 5:15 PM, Sam Liddicott wrote:
> > I'm very surprised that you continue to insist that it should be a *design* decision that it should be hard for a script writer to be able to tell if a handle will be left open or not.
>
> What? The design decision is that a file descriptor opened with {var} will remain open after the command completes.
>
> > What could be the rationale for such a design decision?
>
> To make the redirection operator a little more useful than simple syntactic sugar.
>
> > The vague justification you provide "there are plenty of things that depend on whether or not a command is builtin, or whether it's run in the parent shell" is true but more relevant to an implementation constraint than a design decision.
>
> An implementation constraint? That doesn't make any sense.
>
> The bash documentation makes it pretty clear which commands are builtin and the circumstances under which commands are run in child processes and which are run by the shell itself.
>
> > I'm confident that most of these things you hint at are to *avoid* the scripter needing to be aware of the difference between internal and external commands.
>
> Bash doesn't make it particularly obscure about which commands are builtin, and, as I said, the man page documents all of them.
>
> The builtin commands all provide functionality that can't be duplicated outside the shell itself, even the builtins that duplicate external commands (e.g., printf -v). Someone who writes shell scripts should be aware of what's builtin and what's not.
>
> But that's not the problem here.
>
> > A design decision may well be to leave a variable handle open, but what *design* decision would add the proviso that it not be an external command?
>
> This makes me believe that you have a fundamental misunderstanding about how the shell operates.
>
> The design decision is to leave the file descriptor open, as I said above. It's left open in all cases. The difference is that commands that are run from the file system perform redirections in the child process, and child processes cannot affect their parent's environment. That means, among other things, that a file descriptor that a child process opens does not affect the parent's descriptor set. That has nothing to do with the behavior of {var} per se; it's a consequence of the relationship between Unix processes.
>
> Chet
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
> ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/