Date:        Wed, 2 Feb 2022 11:38:30 -0500
From:        Chet Ramey <chet.ra...@case.edu>
Message-ID:  <7c422cbb-bba8-0a57-a565-eeb115120...@case.edu>

  | > How accurately can you reconstitute?  That is, can you maintain the
  | > difference between $(a b) and $( a b ) for example ?  How about $(a b) ?
  |
  | Does it make a semantic difference?

No, but that's not the point.

  | The only thing I can think of that
  | might mess it up is when you have something like
  |
  | alias l='(' r=')'
  | echo $(l 4+4 r)
  |
  | and you intend, for whatever perverse reason, to have this parsed as an
  | arithmetic expansion. I'm not sure that's worth supporting.

Supporting that would be wrong.  Aliases are only expanded in the
command word position (which includes the 'l' there but never the 'r';
it would need a newline before it, but you could add that, it wouldn't
affect the interpretation as arithmetic).  But to be in the command word
position for either of them, the $( must be a command substitution: that
decision must already have been made, and once it is, it is.  It cannot
turn back into arithmetic because the l happens to become '(', leaving
two (( chars together; tokenisation does not go back and re-examine chars
that were already examined.  The $( was already found, the next ( is a
whole new token.

That is, you should never save the alias expanded input as literal text,
to be combined with anything else, and scanned again later.  That's simply
wrong.  When you expand aliases, the right side can combine with what
follows, as that hasn't been read yet (beyond it ending the token which
turns out to be the alias), but there can never be a joining to the left.

So, no, definitely not worth "supporting", that can never be arithmetic.

The case I had in mind with the question about $( a b ) was this one...

    cat <<$( a b )
    hello
    $( a b )

which works in bash 5.1.16, but doesn't in 5.2-(December 16).  In the latter
the end line needs to be $(a b).  The spaces are lost - reconstituting
really doesn't work for this purpose.  (nb: no aliases involved anywhere here).
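
To make the delimiter-matching point concrete, here is a minimal sketch
of a here-doc reader that ends only on a line identical to the delimiter
text as it was typed.  This is an illustration under stated assumptions,
not how bash or any other shell implements it: the function name
read_heredoc and all the variable names are hypothetical.

```sh
# read_heredoc DELIM (hypothetical helper): copy stdin to stdout until a
# line that is exactly equal to DELIM; return 1 if no such line is found.
read_heredoc() {
    delim=$1
    while IFS= read -r line; do
        if [ "$line" = "$delim" ]; then
            return 0
        fi
        printf '%s\n' "$line"
    done
    return 1
}

input='hello
$( a b )'

# Delimiter saved exactly as typed: the body ends where the user expects.
exact_rc=0
exact=$(printf '%s\n' "$input" | read_heredoc '$( a b )') || exact_rc=$?

# Delimiter reconstituted without the spaces: the end line never matches,
# so the intended end line is treated as part of the body.
recon_rc=0
recon=$(printf '%s\n' "$input" | read_heredoc '$(a b)') || recon_rc=$?
```

With the saved text, exact holds only "hello" and the reader returns 0.
With the reconstituted text, the reader runs off the end of the input
and returns 1, which is the same class of failure as described above.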

Reconstituting is fine for -x tracing, and other similar purposes, but
not here, where the exact literal text is needed.

For what it is worth, ksh93 (the version I have anyway) doesn't even
get started...

    ksh93 $ cat <<$( a b )
    /usr/pkg/bin/ksh93: syntax error at line 1: `<<b' here-document not contained within command substitution

And I hate to even imagine what state it got itself into to produce
that diagnostic.

yash and zsh get this example correct, along with released bash.
Nothing else I tested does - everything other than bosh (everything
includes the NetBSD sh) generates an error of some kind on the here doc
redirect.  Bosh generates one at the first extra newline after the
apparent end delimiter (ie: given the above input, it doesn't complain,
just sits there waiting for more - type a \n at it, and then it complains
about a syntax error).

  | You need to be able to reconstruct the text of arbitrary commands for
  | `set -x'. It's a very short step from that to rebuilding a command
  | substitution.

There are some things for which -x typically shows nothing like the
input (in bash, try a case statement for example).  What appears is fine
for -x output, but not even close for reconstitution.

This one works in released bash, but not in 5.2-xxx (and no white space
tricks with this one to mess things up - and I also did not try to guess
what end delimiter might actually work, if anything):

    cat <<$(case $x in a) echo found A;; b) echo found B;& *) echo found $x ;; esac)
    hello
    $(case $x in a) echo found A;; b) echo found B;& *) echo found $x ;; esac)

(Doesn't matter what 'x' is for this, the "command substitution" is
never actually executed - all that is simply text.)

yash cannot parse that one properly; only released bash and zsh get
this right.  ksh93 failed in a similar way to above.  I didn't bother
testing any of the others, none will work.
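
As a small illustration of why -x output is the wrong starting point for
recovering the input: the trace is produced after expansion, so it shows
the words the command was run with, not the text that was typed.  The
following is only a sketch, assuming a POSIX-style sh; the variable names
are made up for the example, and the exact trace format (prefix, quoting,
whether redirections appear at all) varies between shells, so no
particular output is promised.

```sh
x=world

# Capture the xtrace output (written to stderr) of one simple command.
# PS4 is set explicitly so the prefix does not depend on the environment;
# the command's own stdout is discarded so only the trace is captured.
trace=$( { PS4='+ '; set -x; echo hello "$x" >/dev/null; } 2>&1 )

# What was typed:   echo hello "$x" >/dev/null
# What was traced:  the expanded words, as printed below.
printf '%s\n' "$trace"
```

The quotes and the reference to x that were in the input are not in the
trace, so the original text cannot be rebuilt from it.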

Bash's -x output doesn't include redirections either, so unsurprisingly
this one doesn't work in 5.2-xxx

    cat <<$(cat</dev/null)
    hello
    $(cat</dev/null)

It does in zsh, released bash, and perhaps surprisingly, ksh93 (nothing else).

Our shell does include redirections in -x output, but that output is
post-expansion, and with all the redirects appended after the command,
rather than wherever they appeared originally, so useless to reconstitute
the original, but fine for -x purposes (even better than the original,
as the rearrangement can expose bugs hidden by the way it was originally
written).  And no, we do not include here docs in -x output, just (with
-x enabled, and PS4 initially ''):

    sh $ PS4=+
    PS4=+
    sh $ cat <<foo
    hello
    foo
    +cat <<...
    hello
    sh $ <<foo cat
    bye
    foo
    +cat <<...
    bye
    sh $

  | You were in default mode, since you did not take the active step to run in
  | posix mode.

OK, I wasn't sure what was being meant by default mode - wondered about
possible shopt settings, or whatever...  But yes, definitely not in posix mode.

  | Yes, since aliases get expanded while reading the WORD that is the here-doc
  | delimiter. That's what we're talking about changing here.

Yes, that's what I was expecting to happen, after seeing Martijn's example,
and why I asked about (and tested) that version (and why I was anticipating
the alias expanded form to work as the end of the here-doc).

If you abandoned reconstituting, and simply saved the string, however
difficult that might be, none of these issues would arise.  (It must be
possible: the lexer is reading chars from somewhere, all that needs to
happen is for it to be told when to start saving, and when to stop.)

This much I can easily make work in the NetBSD sh - my problem is where
reasonably (and cheaply) to save those bytes - the "usual location" is
building the parse tree, and cannot simply have random text written on
top of it.
This is so little used that an expensive solution isn't worth it (since
it needs to happen for all here doc end words, just in case, but most of
those are simple text and are already being saved, like any other command
word would be).

  | It works in
  | previous versions of bash because those versions don't expand aliases while
  | (ad-hoc) parsing the command substitutions at all. It will work in default
  | (non-posix) mode versions going forward because I'm going to err on the
  | side of backwards compatibility, at least for a while.

This one needs to work in posix mode as well, to be posix compat.
There's very little point being posix compat in default mode, but not
in posix mode, is there?  The question here isn't about whether aliases
are expanded in command substitutions, but that posix requires the exact
text entered by the user to be the end delimiter.

  | But it's good to have the discussion about command substitution
  | parsing and when alias expansion happens nevertheless.

Yes, implementors need to worry about obscure cases like this, even though
no real users will ever encounter them, because there are Martijns around
who keep finding stuff like this!

kre