Re: Unclosed quotes on heredoc mode

Robert Elz Wed, 17 Nov 2021 16:02:12 -0800

    Date:        Wed, 17 Nov 2021 15:47:37 -0500
    From:        Chet Ramey <chet.ra...@case.edu>
    Message-ID:  <420281e7-f3c4-8054-d390-9378080c2...@case.edu>


  | Every modern shell uses `$PATH' as the here-document delimiter

Depends what you call modern shells - some ash derived shells (at least)
don't, because they parse the $PATH into an internal form (in all words
where that makes sense, before knowing what the word is to be used for)
and then cannot match that properly.   While that isn't actually expanding
the word, it still makes things fail badly.

But:

[D] sh-current $ cat foo <<$PATH
sh: 80: Syntax error: Illegal eof marker for << redirection

at least we error out when the user tries, not just fail to ever
find the end of the here doc.


  | and checks for the delimiter before any part of the process that expands
  | the lines in the here-document body.

That yes, I agree, everyone does that.

  | > First, the EOF should not work, that's a bash bug (IMO) - that should
  | > generate an error, not just a warning.
  |
  | It's not. The historical shells used for the basis of the POSIX standard

I didn't say it was a standards violation, I said it was a bug.
That the same bug exists in some other ancient shells isn't a justification.

Blindly taking the whole remainder of the script as a here document, and
processing it as if that were the author's intent, just because they made
a typo somewhere, is simply irrational.   Further, no-one (not anyone I
have ever seen) deliberately relies upon the here doc ending at EOF (not
even if a here doc is in a -c command string or similar).

  | Bash at least warns you about it.

Yes, better than some, but not as good as it should be.

  | > OK, here we have another of the oddities of shell syntax.   The spec
  | > says that a here document starts at the next newline after the <<
  | > operator, but that's not what it really means.
  |
  | I think the intent there is that the here document starts at the next
  | newline after the delimiter.

You mean at the newline after the ola" in the example given?   Really?
Surely it must mean newline token, not newline character, mustn't it?
(Even then, there are more, messier, issues, which I know you're aware of;
if we could make it as simple as "after the lexically next newline token"
it would make everything much simpler - that's what it should be.)

  | > Being able to do that (include embedded newline characters
  | > do in some other shells).
  |
  | I couldn't find one where it does.

They work in (at least) the NetBSD shell, FreeBSD too I expect, since the
two use essentially the same mechanism for recognising the end of the
here doc -- (effectively) after a newline, read chars (from a buffer) one
at a time, comparing them with the end delimiter, until either there is a
match failure, or until the end of the end delimiter (after which one more
char from the buffer is compared to \n).   (Add tab stripping as required).

On no match, reset the buffer pointer back to where all this started,
and continue reading lines into the here doc.  When the end delim is
recognised, the here doc is complete after the last \n that was added to
it, and regular shell input continues after the \n from the buffer which
matched after the end delimiter.   What the chars are that match (including
more newlines, etc) is irrelevant, anything works (but no tab stripping
occurs after any intermediate newlines).
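
The matching procedure described above can be sketched roughly as follows.
This is a minimal Python illustration, not the NetBSD or FreeBSD source: the
function name read_heredoc, its parameters, and the choice to raise an error
when the input runs out before the delimiter is seen are all assumptions
made for the example.

```python
def read_heredoc(buf, pos, delim, strip_tabs=False):
    """Collect a here doc body from buf, starting at index pos.

    Hypothetical helper: pos is the index just after the newline that
    starts the here doc, delim is the end delimiter (it may itself contain
    newlines), strip_tabs models the <<- form.
    Returns (body, index where regular shell input resumes).
    """
    body = []
    while pos < len(buf):
        start = pos                      # remember where this attempt began
        i = pos
        if strip_tabs:                   # tabs are stripped only at line start
            while i < len(buf) and buf[i] == "\t":
                i += 1
        line_start = i

        # Compare one char at a time with the end delimiter.  What the
        # chars are (including newlines) does not matter.
        j = 0
        while j < len(delim) and i < len(buf) and buf[i] == delim[j]:
            i += 1
            j += 1

        # After the whole delimiter, one more char must be a newline.
        if j == len(delim) and i < len(buf) and buf[i] == "\n":
            return "".join(body), i + 1

        # No match: go back to where the attempt started and copy the
        # line into the here doc instead.
        assert line_start >= start
        end = buf.find("\n", line_start)
        if end == -1:
            break                        # input ended in mid line
        body.append(buf[line_start:end + 1])
        pos = end + 1

    # Assumption for this sketch: running out of input is an error,
    # rather than silently ending the here doc at EOF.
    raise SyntaxError("here document not terminated before end of input")
```

With buf = "x\na\nc\na\nb\nrest\n" and the two line delimiter "a\nb", the
first "a" line fails to match at the "c" and is copied into the body; the
later "a", "b" pair ends the here doc, and input resumes at "rest".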

  | > Since bash doesn't allow end delimiter words that contain newlines to
  | > work, it should probably generate an error when you try to use one, that
  | > would have made things clear.
  |
  | See above.

Again, behaving irrationally when it would be trivial to detect the error
(even if a rare one) is poor design, and should be fixed.

kre
