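A small comment on the "/bin in PATH" code: it is not quite right. The pattern *:/bin:* needs a ':' on both sides of /bin, so it misses /bin when it is the first entry of PATH (no ':' before it) or the last entry (no ':' after it). Writing case :$PATH: would fix it.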
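To make that concrete, here is a minimal sketch of the colon-wrapping idea. The function name in_path and the sample PATH strings are made up for illustration; only the case :$PATH: trick itself comes from the comment above.

```shell
# in_path DIR LIST: hypothetical helper (not from the thread).
# Succeeds if DIR is an element of the colon-separated LIST.
# Wrapping LIST in ':' gives the first and last entries a colon on
# both sides, so one pattern covers every position.
in_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

in_path /bin "/bin:/usr/bin"           && echo "first entry is found"
in_path /bin "/usr/bin:/bin"           && echo "last entry is found"
in_path /bin "/usr/bin:/usr/local/bin" || echo "/usr/bin alone does not count as /bin"
```

Without the added colons, the first two calls would fail against a bare *:/bin:* pattern.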
On Sun, Nov 28, 2021, 20:31 Robert Elz <k...@munnari.oz.au> wrote:

>     Date:        Sat, 27 Nov 2021 13:57:57 -0500
>     From:        Chet Ramey <chet.ra...@case.edu>
>     Message-ID:  <5217c48e-c989-a163-5673-38995e35a...@case.edu>
>
> Warning: long message follows, give yourself time to digest it.
>
> | OK, if you do end up building the devel branch, I'd be interested
> | in these results.
>
> Assuming that happens, I shall certainly let you know.
>
> | > Once, of course ... why would I ever build it again?
> |
> | Patches exist. There are vendors who take the original release, apply
> | their own special-sauce patches, then apply the patches I release as
> | they come out, as part of their own distribution release process.
>
> Of course, NetBSD pkgsrc (used on other systems as well) does that too.
> But your patches appear about every 5-6 months, so I end up doing one
> build every 5-6 months. Keeping the object files (even the unpacked
> sources) sitting around waiting for the next patches, in order to save
> perhaps 2-3 minutes of build time isn't worth the bother. Once built
> and installed it all gets trashed.
>         [I have also contemplated doing builds in an MFS (or tmpfs)
>         which would vanish on a reboot (or just umount) and I do tend
>         to reboot more often than bash patches are released ... but I've
>         yet to actually do that, for bash, the build time saved wouldn't
>         be worth the bother - for some other apps, it might be].
>
> pkgsrc doesn't encourage attempting to retain anything in any case - it
> probably isn't a problem for bash (at least I've never seen it, not that
> I ever looked either) but other applications have a habit of deleting files
> from their distributions - and unless one starts from an empty directory,
> unpacking a tarball doesn't cause those files to be removed ... further,
> some build systems don't pay attention to what is supposed to be there,
> and manage to link all the .o files they can find.
>
> It is easier, and more reliable, to simply start clean every time.
>
> But of course that doesn't apply when you're developing and building
> several times a day (or sometimes, dozens of times an hour). That just
> doesn't apply to me with bash.
>
> | Usually, that's ok. In this instance, where we're discussing a feature
> | whose implementation is substantially different between the released
> | and development versions, it's more relevant.
>
> Sure, though I didn't know this part was changed so much in the
> devel version until you told me just recently (I do not watch what happens
> there).
>
> | So the ultimate question is whether or not the act of reading a command
> | substitution should reset this requirement. That's where we disagree.
> | The grammar is, at that point, reading a different command.
>
> "command" is a loaded word in sh terminology, it is used for all kinds of
> things, but in general it is not at all unusual for here document text to
> appear while a command other than the one with the redirection operator is
> being processed (no command substitutions necessarily involved). What the
> grammar is doing after a here doc redirection operator has been processed,
> until the next newline (token) is encountered is irrelevant - the spec
> imposes no requirements upon that at all.
>
> | > Then we get to whether heredoc data is part of a valid shell script
> | > in that sense - when there is yet to be a newline token to introduce it.
> |
> | What does this mean? In all cases, the here-documents are not read until
> | after a newline token. That's not the issue.
>
> Sure, but that's not what I meant. I treat heredoc data as much the same
> as a \newline - something that the lexer deals with, and the grammar never
> knows happened. Heredoc data doesn't appear at all in the sh grammar,
> as nothing in the grammar cares in the slightest about them (once they're
> queued).
> What I meant was that from that perspective, whether a sh script
> (or sh script fragment) is valid or not, is determined by the grammar, and
> given that here doc data does not appear there, it cannot have any impact
> upon the decision whether some particular part of the sh input is valid or
> not. Of course, if the script ends (completely) without a newline token
> after the last redirect operator then that's an error - but of a subtly
> different kind (more like an unterminated string (mismatched quotes) or
> here doc data without its required terminating word -- all lexical
> constructs).
>
> So, if one does
>
>         $( cmd <<END )
>
> there's nothing invalid about that, unless EOF follows that ')' before
> a newline token appears. And if that happens, it isn't the grammar that
> complains, but something beyond that. The syntax "word redirect" is
> perfectly valid, and "<< word" is a perfectly valid redirect. The data
> doesn't need to appear there, if no newline has yet appeared, any more
> than it does in
>
>         cmd << EOF ; ...
>
> where the data doesn't need to appear there, when a newline has not yet
> appeared.
>
> You seem to be hung up on the way you have chosen to implement $( )
> (which of itself is OK, but it is not required to be done that way)
> where (it seems) you parse the command inside the $() as if there was no
> world at all outside it. As far as getting the grammar correct that's
> fine, but it doesn't work with here doc data.
>
> | > | The netbsd shell appears to be the outlier here. The parser reads the
> | > | command substitution so it can parse the entire and-or list before
> | > | trying to gather any here-documents.
> | >
> | > You cannot possibly really mean that I hope. That is, in
> | >
> | >         cmd1 <<EOF &&
> | >         data
> | >         EOF
> | >         cmd2
> | >
> | > you do agree that "data" is stdin to cmd1, that is, the heredoc data
> | > appears splat in the middle of the and-or list.
> | > That's certainly the way it appears to work (in bash) to me.
> |
> | There is no command substitution in this example.
>
> I know. But go back and read the quote from you (still here, above, in
> this message) again: "The parser reads the command substitution so it can
> parse the entire and-or list before trying to gather any here-documents"
>
> ** parse the entire and-or list before trying to gather any here documents **
>
> I don't believe that you really meant that, it isn't the way bash behaves
> (unless this is something different in the devel version, but I doubt that)
> and I was just pointing out that poor phraseology.
>
> | So, again, the question is whether or not input data that is logically
> | part of the command substitution (it appears between the opening and
> | closing parentheses) should affect the `outer' command. That's the
> | question. We have different answers.
>
> We do, because I don't view here doc data as affecting anything except the
> command for which it is input. As far as the script goes, it is just a
> rather weird method (kind of like the original implementation) of creating
> an anonymous file and then passing that file as input (usually stdin, but
> not required to be) to a command.
>
> Consider this alternative, which is (one possibility for) what would be
> needed if here-docs did not exist:
>
>         printf '%s\n' 'data' >/tmp/hidden.data.$$
>         cmd </tmp/hidden.data.$$
>         rm /tmp/hidden.data.$$
>
> whereas with here-docs, we do instead
>
>         cmd <<'END'
>         data
>         END
>
> That's all fine, and either of those would (more or less) work
> with any shell.
>
> Now consider instead that cmd is to be run in a command substitution.
>
> One can certainly do
>
>         ... $(
>                 printf "%s\n" 'data' >/tmp/hidden.data.$$
>                 cmd </tmp/hidden.data.$$
>                 rm /tmp/hidden.data.$$
>         ) ...
>
> which is the rough equivalent of
>
>         ... $( cmd <<END
>         data
>         END
>         ) ...
>
> and that should work. No question.
>
> But one can also do
>
>         printf "%s\n" 'data' >/tmp/hidden.data.$$
>         ... $( cmd </tmp/hidden.data.$$ ) ...
>         rm /tmp/hidden.data.$$
>
> and that would also work everywhere, right? That is, the data for the
> command in the command substitution is created (and removed, but that bit
> of it is generally irrelevant here) outside the command substitution.
>
> This is the rough equivalent of
>
>         ... $( cmd << \END ) ...
>         data
>         END
>
> And then once you allow that to work (which you're apparently now doing
> in the devel version), there cannot really be any objection to
>
>         cmd <<END $( cmd1 &&
>         data
>         END
>         cmd2 )
>
> as that's really just the same principle being applied in the other
> direction. Furthermore that means that in
>
>         cmd <<END1 $( cmd1 <<END2 &&
>
> (with a newline after the "&&") the data that follows is
>
>         data1
>         END1
>         data2
>         END2
>
> keeping the left to right across the input line is the order
> that the standard requires here document data to appear in.
>
> Here "input line" is really a logical line, rather than a physical
> one, as we have already agreed that here docs don't appear in the
> middle of quoted strings, and nor do they appear after elided newlines
> (\newline pairs) which are removed, neither of which generates a newline
> token. But it is "line" not "command", or anything else related to the
> grammar which is specified:
>
>         The redirection operators "<<" and "<<-" both allow redirection
>         of subsequent lines
>
> "subsequent lines" ie: "lines after the current line"
>
>         If more than one "<<" or "<<-" operator is specified on a line,
>         the here-document associated with the first operator shall be
>         supplied first by the application and shall be read first by the
>         shell.
>
> Note: "line", not grammatical command, or script, or and-or list, or
> anything related to the grammar at all.
> (The grammar generally ignores
> lines, a newline token is almost just a ';' - except we're allowed as
> many newlines as we like, where just one ';' (sometimes none) is
> permitted).
>
> Another example (no cmdsubs again) that is kind of weird, and unlikely,
> but should be permitted, and should work:
>
>         cat << END; case $PATH
>         data
>         END
>         in
>         *:/bin:*) echo /bin is in PATH! ;;
>         esac
>
> Bash (5.1.xx) allows that, so does everything else (aside from some old,
> and not even all that old, ash derived shells which had a bug not relevant
> here). The heredoc data for cat appears splat in the middle of the
> unrelated case statement. No problems, it all works, as it should - but
> probably would not if here-doc data was something known to the grammar.
> But it isn't, the lexer removes it, as far as the grammar & its parser are
> concerned the "data" and "END" lines are not there at all.
>
> kre
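
For anyone who wants to try the "first operator, first here-document" rule quoted above without involving command substitution (where shells currently differ), here is a small sketch. The function name two_docs is invented for illustration; the delimiters and data lines follow the ones used in the quoted message.

```shell
# two_docs: hypothetical helper (not from the thread).
# Two here-documents are started on one line. Their bodies follow the
# line in the same left-to-right order as the "<<" operators on it.
two_docs() {
    cat <<END1; cat <<END2
data1
END1
data2
END2
}

two_docs
```

The first cat reads the body ending at END1 and the second reads the body ending at END2, so the output is data1 followed by data2.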