Date: Wed, 17 Nov 2021 15:47:37 -0500
From: Chet Ramey <chet.ra...@case.edu>
Message-ID: <420281e7-f3c4-8054-d390-9378080c2...@case.edu>

  | Every modern shell uses `$PATH' as the here-document delimiter

Depends what you call modern shells - some ash derived shells (at least)
don't, because they parse the $PATH into an internal form (in all words
where that makes sense, before knowing what the word is to be used for)
and then cannot match that properly. While that isn't actually expanding
the word, it still makes things fail badly. But:

    [D] sh-current $ cat foo <<$PATH
    sh: 80: Syntax error: Illegal eof marker for << redirection

at least we error out when the user tries, not just fail to ever find
the end of the here doc.

  | and checks for the delimiter before any part of the process that expands
  | the lines in the here-document body.

That yes, I agree, everyone does that.

  | > First, the EOF should not work, that's a bash bug (IMO) - that should
  | > generate an error, not just a warning.
  |
  | It's not. The historical shells used for the basis of the POSIX standard

I didn't say it was a standards violation, I said it was a bug. That
the same bug exists in some other ancient shells isn't a justification.

Blindly taking the whole remainder of the script as a here document, and
processing it as if that were the author's intent, just because they made
a typo somewhere, is simply irrational. Further, no-one (not anyone I
have ever seen) deliberately relies upon the here doc ending at EOF (not
even if a here doc is in a -c command string or similar).

  | Bash at least warns you about it.

Yes, better than some, but not as good as it should be.

  | > OK, here we have another of the oddities of shell syntax. The spec
  | > says that a here document starts at the next newline after the << operator,
  | > but that's not what it really means.
  |
  | I think the intent there is that the here document starts at the next
  | newline after the delimiter.

You mean at the newline after the ola" in the example given? Really?
Surely it must mean newline token, not newline character, mustn't it?
(Even then, there are more, messier, issues, which I know you're aware of;
if we could make it as simple as "after the lexically next newline token"
it would make everything much simpler - that's what it should be.)

  | > Being able to do that (include embedded newline characters
  | > do in some other shells).
  |
  | I couldn't find one where it does.

They work in (at least) the NetBSD shell, FreeBSD too I expect, since the
two use essentially the same mechanism for recognising the end of the
here doc -- (effectively) after a newline, read chars (from a buffer) one
at a time, comparing them with the end delimiter, until either there is a
match failure, or until the end of the end delimiter (after which one more
char from the buffer is compared to \n). (Add tab stripping as required.)
On no match, reset the buffer pointer back to where all this started, and
continue reading lines into the here doc.

When the end delim is recognised, the here doc is complete after the last
\n that was added to it, and regular shell input continues after the \n
from the buffer which matched after the end delimiter.

What the chars are that match (including more newlines, etc) is irrelevant,
anything works (but no tab stripping occurs after any intermediate newlines).

  | > Since bash doesn't allow end delimiter words that contain newlines to
  | > work, it should probably generate an error when you try to use one, that
  | > would have made things clear.
  |
  | See above.

Again, behaving irrationally when it would be trivial to detect the error
(even if a rare one) is poor design, and should be fixed.

kre
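
For anyone who wants to play with the matching loop described above, here
is a rough sketch of it in Python. It is not the NetBSD (or FreeBSD) sh
code, just an illustration under a few assumptions: the input buffer is a
plain string, the function name read_heredoc and its parameters are made
up for this example, and raising an error when input runs out before the
delimiter is found is a choice that follows the position argued above,
not the behaviour of any particular shell.

```python
def read_heredoc(buf, pos, delim, strip_tabs=False):
    """Collect a here-document body from buf, starting at a line start.

    Returns (body, next_pos), where next_pos is the index just past the
    newline that follows the end delimiter. Hypothetical helper, for
    illustration only.
    """
    body = []
    n = len(buf)
    while pos < n:
        # We are at the start of a line. For <<-, drop leading tabs here
        # (and only here, never after newlines inside the delimiter).
        if strip_tabs:
            while pos < n and buf[pos] == "\t":
                pos += 1
        start = pos  # where to rewind to if the comparison fails

        # Compare buffer chars with the delimiter one at a time. The
        # delimiter is treated as opaque: "$", newlines, anything goes.
        i = 0
        while i < len(delim) and pos < n and buf[pos] == delim[i]:
            pos += 1
            i += 1

        # Whole delimiter matched and one more char is a newline: done.
        if i == len(delim) and pos < n and buf[pos] == "\n":
            return "".join(body), pos + 1

        # No match: reset the buffer pointer and copy this line into
        # the here-document body.
        pos = start
        end = buf.find("\n", pos)
        if end == -1:
            body.append(buf[pos:])
            pos = n
        else:
            body.append(buf[pos:end + 1])
            pos = end + 1

    # Assumption: treat a missing end delimiter as an error.
    raise SyntaxError("here document not terminated before end of input")


if __name__ == "__main__":
    script = "$HOME\n$PATH\necho done\n"
    body, rest = read_heredoc(script, 0, "$PATH")
    print(repr(body), repr(script[rest:]))
```

Because the delimiter is only ever compared character by character, a
delimiter such as "x\ny" (with an embedded newline) is matched the same way
as any other, which is the property described for the NetBSD shell.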