    Date:        Sat, 20 Nov 2021 15:19:33 -0500
    From:        Chet Ramey <chet.ra...@case.edu>
    Message-ID:  <c804ce20-5b65-14e2-9601-616abedae...@case.edu>

  | Right. Purposeful.

There's a difference between done intentionally for pragmatic reasons,
and done intentionally because it is the right thing to do and people
should depend upon it remaining that way.

  | How about this. You show me examples where bash (devel bash) does what you
  | think is the wrong thing, and we agree it's a bug, I'll fix it.

I'll run our tests against the newest (released) bash (5.1.12(1)-release)
[what does the (1) represent?? It always seems to be (1) in versions I see.]

  | The devel bash already does this.

What the devel one does is unknown to me, I don't think I even have the
means to obtain it (I have nothing at all git related, and no interest in
changing that state of affairs).

  | > and a newline token in the middle of
  | > a command substitution counts for a here doc operator that occurred before
  | > it,
  |
  | What does `counts' mean? You're not really reading the lines as shell
  | words,

"counts" means "is the one that matters" (ie: do not ignore this one).

But, no, not this...

  | cat << EOF
  | echo $(echo this EOF is
  | not the end of
  | the command substitution
  | EOF
  | but it is the end of the
  | here-document
  | )

though that is a mildly interesting case, and I agree on how that gets
parsed (the contents of the here doc are not examined until it is expanded
when used for a redirection). That should result in a redirection error
for cat, then (probably) "but: not found" (if the shell didn't already exit),
"here-document: not found" and a syntax error on the ')'. (The "not found"
errors are, naturally, assuming that commands of those names aren't found
in a PATH search).

What I meant was this one:

    cat <<EOF && grep $(
    foobar
    EOF
    echo barfoo) *.c

where bash just sits at a PS2 prompt.

Or this one

    cat $( cat <<FILES ) >/dev/null
    abc
    def
    FILES

which doesn't get beyond the first line...

jinx$ cat $( cat <<FILES ) >/dev/null
bash: warning: here-document at line 13 delimited by end-of-file (wanted `FILES')
bash: warning: here-document at line 1 delimited by end-of-file (wanted `FILES')

2 warnings?? There's just one heredoc redirection present. From the
line numbers, I assume the first is when scanning the outer cat command,
and detecting its cmdsub arg, and the 2nd is from rescanning the command
substitution. The first one clearly knows there is a heredoc, it also
knows it is yet to encounter a newline token (or any newline in this
example) hence the heredoc data cannot possibly be expected yet, it must
wait until after that newline - eventually it gets past >/dev/null, finds
the newline (token), and should start reading the heredoc text. At that
point it looks to see where the << redirection occurred (the first on the
line since this is the first heredoc read) and associates the data with
that redirection operator. When the cmdsub is ready to be executed it
finds the heredoc data already read and available.

I never got to enter the lines starting "abc" ... (I could have, but I
know I would have just seen 3 command not found errors, one for each line,
so I didn't bother.)

In both of those, the first newline token following the << operator
(and its word) is the one at the end of the first line (of each). The
heredoc data for each therefore starts on the 2nd line.

What should happen:

[jinx]{3}$ cat <<EOF && grep $(
> foobar
> EOF
> echo barfoo) *.c
foobar
[jinx]{3}$ cat $( cat <<FILES ) >/dev/null
> abc
> def
> FILES
cat: abc: No such file or directory
cat: def: No such file or directory

For the first there are a couple of .c files in $PWD but they don't contain
"barfoo". Neither "abc" nor "def" exists in $PWD.

  | > and a here doc operator in a command substitution might not encounter
  | > a newline until after the cmdsub text has ended - the next following newline
  | > token provides the here doc text.
  |
  | I can't imagine a useful example of this that isn't an error.

That's the 2nd example above, and a very normal thing to want to do,
very short command substitutions (most of them) prefer to be complete
within 1 line.

Note that neither in POSIX, nor anywhere else, has there ever been any
requirement on the heredoc data other than that it comes after the next
newline (which should, we agree, be newline token, not newline character).
Since heredocs are a lexical object, this processing is totally unaffected
by whatever semantics the grammar is extracting from the tokens the lexer
is returning to it, the grammar just increments the "number of heredocs
needed" counter, supplies the end words for each, and the lexer takes care
of the rest.

And then there is of course the combination of the two of those examples:

    cat <<EOF && grep xyx $( cat <<END

I'll stop it there, probably what follows is ')' on the same line, but
whatever happens next (assumed syntactically correct), if your requirement
is that END precedes EOF in what follows you're clearly wrong, as POSIX
is quite clear that the order in which the heredocs are to be read is
left to right across the line (regardless of which commands they're
attached to), so the EOF ending one *must* appear first, and the END
ending one second. And the two of them follow one newline token.

kre

ps: none of this stops people writing, if they prefer

    cat <<EOF &&
    foobar
    EOF
    grep $( echo barfoo ) *.c

but in that form it is much harder to see immediately what is the command
that comes after the "&&" (particularly if the heredoc is a long one -
perhaps hundreds of lines). It could of course be

    cat <<EOF && grep $( echo barfoo ) *.c
    foobar
    EOF

which is probably what I'd use in a simple case like that where the
grep all fits in the initial line, but if that gets ugly, for example
if the cmdsub gets to be a long one, then the earlier form is nicer.

Similarly

    cat $( cat <<FILES
    abc
    def
    FILES
    ) >/dev/null

but that's just plain ugly (even if the > redirect is moved before the cmdsub).
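
pps: the left to right ordering of heredocs on one line can be checked
without involving a cmdsub at all. What follows is only a minimal sketch:
the function name heredoc_order is made up for the illustration, and the
form is one that should behave the same way in any POSIX shell.

```shell
# Two here-documents are started on one line. The data for the first
# operator (ending at EOF) is read first, then the data for the second
# (ending at END); both follow the single newline token ending that line.
heredoc_order() {
    cat <<EOF; cat <<END
first
EOF
second
END
}

heredoc_order
```

That prints "first" and then "second", the EOF data coming before the
END data, in the same order as the operators appear on the line.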