Re: Unclosed quotes on heredoc mode

Chet Ramey Sat, 20 Nov 2021 08:33:55 -0800

On 11/19/21 9:18 AM, Robert Elz wrote:

illusory compat issues.  I have no idea what inspired this initially, but
my guess would be a code bug no-one noticed.


I'm skeptical, but willing to be convinced. Bourne's shell allowed EOF to
terminate all sorts of things (quoted strings, command substitutions, here
documents) -- enough to make it purposeful.

So just how many complaints do you get about the warning message?
"ksh doesn't complain about this, why does bash?"


It's usually people who have misplaced or mistyped the ending delimiter.
It took only a few seconds to find this:

https://unix.stackexchange.com/questions/657488/warning-here-document-at-line-2-delimited-by-end-of-file-wanted-eof

I don't have time right now to look for other reports that might have
tested it against other shells.
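
For anyone who wants to see the warning in question, here is a minimal sketch. It assumes bash is on PATH, and it sets LC_ALL=C so the message is not translated. The exact prefix and line numbers vary between bash versions, so the sketch only captures the text rather than asserting a full message.

```shell
# A two-line script whose here-document is never closed:
# the input ends before any line consisting of just EOF.
script='cat <<EOF
hello'

# stdout: the here-document body is still delivered to cat.
body=$(printf '%s\n' "$script" | LC_ALL=C bash 2>/dev/null)

# stderr: bash warns that the here-document was delimited by end-of-file.
warning=$(printf '%s\n' "$script" | LC_ALL=C bash 2>&1 >/dev/null)

printf 'body:    %s\n' "$body"
printf 'warning: %s\n' "$warning"
```

The body still reaches cat, so the script "works"; the warning is the only hint that the closing delimiter was misplaced or mistyped.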

   | Which instance of `ola"'? The first or the second?

The first.

   | This cannot be a serious question unless you mean the second.

It is a very serious question, but not as to what should happen
but how the standard needs to describe it.


That's why I suggested what I did.

Some variant of the existing

"When an io_here token has been recognized by the grammar (see Shell
Grammar), one or more of the subsequent lines immediately following the
next NEWLINE token form the body of one or more here-documents and shall be
parsed according to the rules of Here-Document."

could probably work as a basis. That implies that the shell goes off and
reads lines before parsing the rest of the current line as a list.
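
A small sketch of the "one or more here-documents" case in that wording: two io_here redirections on one line, with both bodies following the next newline in the order the redirections appeared. The function name two_docs and the delimiters A and B are only illustrative.

```shell
# Two here-documents introduced on the same line. After the newline that
# ends that line, the shell collects the body for <<A first, then the
# body for <<B, before it goes back to parsing commands.
two_docs() {
    cat <<A ; cat <<B
first body
A
second body
B
}

result=$(two_docs)
printf '%s\n' "$result"
```

Each cat receives only its own body, so the two bodies come out in order, one per line.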


   | The delimiter is a `word', and we both know what a shell word is.

yes, but that's irrelevant, it is merely a coincidence here that
the newline in question occurs in the delimiter.
Another example
        cat <<EOF > file ; echo "abc
        def
        EOF
        ghi" \
        EOF
        EOF
What is the here doc, and what does echo say?


That's a good example. The here-doc is empty (the delimiter is the third
EOF) and the echo prints the rest of the text, with the backslash-newline
disappearing.

I'd say that this is somewhat deceptive, and is a decent illustration of my
point. The shell -- bash, at least -- always reads complete lines from the
input before parsing any here documents, so it's going to keep reading
through the second EOF to read the `complete' first line, due to the quoted
string and the quoted newline. The `current' token is going to be the
newline that follows the second EOF even before it starts figuring out that
it has a here-document and goes off to collect the body.

So, the shell reads the here-document body and creates the here document
after it reads an unquoted newline token -- the first newline token after
finding the here-document delimiter.
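
The behavior described above can be checked with a sketch along these lines. It assumes bash is on PATH, that the example is meant to be unindented (the indentation in the message is only for display), and it runs in a scratch directory so the redirection to `file' does not clobber anything. The names tmpdir and example.sh are illustrative. Other shells may answer differently, which is the point under discussion.

```shell
# Scratch directory for the example script and the file it creates.
tmpdir=$(mktemp -d)

# Save the example verbatim; the quoted delimiter keeps it unexpanded.
cat > "$tmpdir/example.sh" <<'SCRIPT'
cat <<EOF > file ; echo "abc
def
EOF
ghi" \
EOF
EOF
SCRIPT

# Run it with bash from inside the scratch directory and capture stdout.
output=$(cd "$tmpdir" && bash ./example.sh)
printf '%s\n' "$output"

# Report whether the here-document (redirected into `file') was empty.
if [ -s "$tmpdir/file" ]; then
    echo "here-document body was not empty"
else
    echo "here-document body was empty"
fi
```

With bash, echo receives the quoted string plus the second EOF as a separate argument (the backslash-newline disappears), so the last output line from echo is "ghi EOF", and `file' is created empty because the third EOF is the delimiter.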


The first newline after the << is the one after abc.
Do remember that here doc data collection is entirely a
lexical issue, that's why they don't appear anywhere in
the sh grammar.


Oh, I do.

   | The newline after the delimiter is both, but sure, newline token would
   | probably work better.

The example above shows the issue better.  That includes the \newline
which can only be a \ newline because the 2nd char there is a newline,
and that has to be seen at the lexical level.


Yes. Here-documents are one of those features that requires mutual feedback
between the parser and lexer.


   | So it doesn't read `lines' in the POSIX sense? Huh. Who knew?

For this.  No.   An extension.  One that comes for feee.


I like the Freudian slip there.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
