Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Chet Ramey

On 1/31/22 11:40 PM, Martijn Dekker wrote:

> On the latest code from the devel branch:
> GNU bash, versie 5.2.0(35)-alpha (x86_64-apple-darwin18.7.0)
>
> Reproducer script:
>
> shopt -s expand_aliases
> alias let='let --'
> set -x
> let '1 == 1'
> : $(let '1 == 1')
>
> Output:
>
> + let -- '1 == 1'
> ++ let -- -- '1 == 1'
> foo: line 5: let: --: syntax error: operand expected (error token is "-")
> + :
>
> The alias is incorrectly expanded in the command substitution, duplicating
> the '--' argument and causing a syntax error.


Thanks for the report. This is a consequence of parsing the contents of the
command substitution `early', at parse time, then doing it again at
execution time. It's clearly wrong to expand aliases during both passes.

But which is the `right' time? If you expand on the first pass, you get
constructs like this failing to find command `a':

[ -n "$BASH_VERSION" ] && shopt -s expand_aliases

foo()
{
var=$(a b)
echo $var
}

alias a=echo
foo

If you defer alias expansion until execution, you lose the (posix-
encouraged but officially unspecified according to the approved
interpretation of issue 1342) ability to have aliases affect command
parsing in the command substitution:

[ -n "$BASH_VERSION" ] && shopt -s expand_aliases

alias nest='('

echo $( nest echo ok 1 ) )
echo "$( nest echo ok 2 ) )"

Can't have both.

Historically, bash (and ksh93) has favored the former. Just about all the
other shells claiming some sort of POSIX conformance favor the latter (all
the ash-based shells, yash, mksh).

What are your plans here?

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Chet Ramey

On 2/1/22 10:23 AM, Chet Ramey wrote:


> If you defer alias expansion until execution, you lose the (posix-
> encouraged but officially unspecified according to the approved
> interpretation of issue 1342) ability to have aliases affect command
> parsing in the command substitution:


Well, I went back and read the entire interpretation. The part that's not
specified is whether an alias expansion provides the closing `)', but alias
expansion has to be performed while parsing the contents of the command
substitution:

"existing aliases are required to be expanded when the shell parses the
input that follows the "$(" in order to find the terminating ')'"

and (in the same interpretation):

"Historically some shells used simple parenthesis counting to find the
terminating ')' and therefore did not account for aliases. However, such
shells never conformed to POSIX, which has always required recursive
parsing (see XCU 2.3 item 5)."

So this seems like behavior that should be conditional on posix mode to
preserve backwards compatibility.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Alex fxmbsw7 Ratchev
You'd expand an alias when it is seen, and then reinterpret the result
together with whatever (possibly) preceded the alias on the current command
line.

It's hard to limit this to only one command per chunk.

That is: 1. you expand the alias; 2. it expands to multiple complex commands;
3. bash parses the resulting text and does its usual work for each command
received.



Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Alex fxmbsw7 Ratchev
On Tue, Feb 1, 2022, 19:11 Chet Ramey  wrote:

> On 2/1/22 10:23 AM, Chet Ramey wrote:
>
> > If you defer alias expansion until execution, you lose the (posix-
> > encouraged but officially unspecified according to the approved
> > interpretation of issue 1342) ability to have aliases affect command
> > parsing in the command substitution:
>
> Well, I went back and read the entire interpretation. The part that's not
> specified is whether an alias expansion provides the closing `)', but alias
> expansion has to be performed while parsing the contents of the command
> substitution:
>
> "existing aliases are required to be expanded when the shell parses the
> input that follows the "$(" in order to find the terminating ')'"
>

I see only one possibility here: parsing for the closing ). Not doing so
results in a mess.

In my experience, aliases in bash are in-place text replacements: flat text
is substituted, and then the command line is parsed. So a closing ) works if
the user supplies it explicitly; otherwise the alias is simply broken or
incomplete (badly constructed).

>
> and (in the same interpretation):
>
> "Historically some shells used simple parenthesis counting to find the
> terminating ')' and therefore did not account for aliases. However, such
> shells never conformed to POSIX, which has always required recursive
> parsing (see XCU 2.3 item 5)."
>
> So this seems like behavior that should be conditional on posix mode to
> preserve backwards compatibility.
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
>
>


Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Alex fxmbsw7 Ratchev
On Tue, Feb 1, 2022, 19:16 Alex fxmbsw7 Ratchev  wrote:

>
>
> On Tue, Feb 1, 2022, 19:11 Chet Ramey  wrote:
>
>> On 2/1/22 10:23 AM, Chet Ramey wrote:
>>
>> > If you defer alias expansion until execution, you lose the (posix-
>> > encouraged but officially unspecified according to the approved
>> > interpretation of issue 1342) ability to have aliases affect command
>> > parsing in the command substitution:
>>
>> Well, I went back and read the entire interpretation. The part that's not
>> specified is whether an alias expansion provides the closing `)', but
>> alias
>> expansion has to be performed while parsing the contents of the command
>> substitution:
>>
>> "existing aliases are required to be expanded when the shell parses the
>> input that follows the "$(" in order to find the terminating ')'"
>>
>
> i see here only possible ) closing parsing, not doing so results in a mess
>

If there is a loose closing ) in the alias, I assume it is meant to be used
there. The non-legitimate cases are the invalidly coded ones (the kind that
wouldn't work anyway because they are wrong).

>
> for me, aliases as im bash ive experienced as text inplace replacements,
> flat text, then the cmdline parsing is done, so closing ) if easily
> specified by user yes works, else broken incomplete ( wrong pathed ) aliases
>
>>
>> and (in the same interpretation):
>>
>> "Historically some shells used simple parenthesis counting to find the
>> terminating ')' and therefore did not account for aliases. However, such
>> shells never conformed to POSIX, which has always required recursive
>> parsing (see XCU 2.3 item 5)."
>>
>> So this seems like behavior that should be conditional on posix mode to
>> preserve backwards compatibility.
>>
>> --
>> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>>  ``Ars longa, vita brevis'' - Hippocrates
>> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
>>
>>


Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Alex fxmbsw7 Ratchev
On Tue, Feb 1, 2022, 19:25 Alex fxmbsw7 Ratchev  wrote:

> you'd expand an alias if seen and then reinterpret the whats gotten also
> to ( possible ) current cmdline before alias
>

As with multiple here-documents in complex commands:

{
cmd "$( < /dev/fd/4 )" "$( < /dev/fd/3 )"
} 4<
> its hard to limit this to one cmd per chunk only
>
> you, 1. expand alias 2. it expanded to multiple complex cmds 3. bash
> parses the resulting text and does its per cmd recieved works



Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Robert Elz
Date:Tue, 1 Feb 2022 10:23:53 -0500
From:Chet Ramey 
Message-ID:  <1e33f111-b9ff-2d70-adf8-934906321...@case.edu>

  | Historically, bash (and ksh93) has favored the former. Just about all the
  | other shells claiming some sort of POSIX conformance favor the latter (all
  | the ash-based shells, yash, mksh).

For ash-based shells, at least, that's because the text in the command
substitution is not parsed twice - it is parsed (nb: just parsed, not expanded
or evaluated) once while scanning it to find the closing ')', and the results
of that parse are simply kept for later when it is to be used.

Since it isn't possible for anything to change the command substitution text
once scanning it has started, the results of a parse of its text done later
must be the same as that done while scanning (that is, no commands can 
possibly be executed between when that scan starts, and when the command
substitution is later expanded, assuming it ever is) so doing the parse all
over again is something of a waste of time.   Since the command substitution
might never be expanded however, some bookkeeping is needed to make sure that
the resulting parse tree is discarded if it isn't used - but compared to the
work needed for an additional parse, that's trivial.

But if you're sure that the double parse method is needed, the simple fix
would be to save the original command substitution text, not the alias expanded
form of it, in the word being constructed, and simply continue to expand the
alias both times (as above, since no commands can get executed between the
first and second parse, assuming the 2nd happens, the alias cannot be altered,
and it will expand the 2nd time the same way it did the first).

It seems to me that the way you're doing it now would also break:

alias x=y
cat <<$(x)
whatever
$(x)

which is supposed to work, and indeed, in the bash 5.2 development branch,
it does break (the terminator after the here-doc text needs to be $(y) to work).

kre




Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Robert Elz
Date:Tue, 1 Feb 2022 15:39:06 -0500
From:Chet Ramey 
Message-ID:  <2816cf78-d7be-b9e1-733d-12427b04c...@case.edu>

  | When you say "just parsed," when are aliases expanded?

During lexical analysis, right between when the input is read (if it
is read, and isn't from some internal string) and when it is handed to
the grammar.

  | Are they expanded while scanning the command substitution to find the
  | closing `)' but not part of the text that results?

For the former half, yes.   And I think you're aware that we have no
"text that results" - which is why, among other things, <<$(anything)
doesn't work for us, as the $(anything) is (kind of) a pointer to
a piece of parse tree, which never ends up matching anything even remotely
ascii.

Keeping the original text is something on my todo list, but it is complicated
by our memory management methods - I could just malloc(biggish) and then
realloc(2 * biggish) if biggish isn't enough, but I really would prefer not
to do that if I can avoid it (this is a temporary string, it doesn't need
semi-permanent storage, which is what malloc() is used for in our shell).

  | What form are the `results' kept in (I
  | assume a parse tree similar to a shell function)?

Very similar - shell functions start out identical, but are then converted
(slightly) into a more condensed form, as they hang around for a long time,
unlike command substitutions which last just as long as it takes to prepare
the current command.

  | Do they include expanded aliases or is that deferred?

Aliases (as evil as they are) need to be expanded to be able to generate
the correct parse tree - they cannot be deferred.

Consider the different tree you'd get for

cmd1 arg1; cmd2 arg2; cmd3

if that was parsed exactly as written, compared to what it would
look like if we had

alias cmd1=if
alias cmd2=then
alias cmd3=fi
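
A minimal runnable sketch of that difference, under stated assumptions: the
shell expands aliases while reading each command (bash needs expand_aliases
set in a script), and the variable name `outcome` is made up for the
illustration. Once the three aliases exist, the same line is read as a
single `if` compound command rather than as three simple commands.

```shell
# bash only expands aliases in scripts when this option is set.
[ -n "$BASH_VERSION" ] && shopt -s expand_aliases

alias cmd1=if
alias cmd2=then
alias cmd3=fi

# With the aliases in place, the lexer hands the grammar:
#   if true; then outcome=parsed-as-if; fi
cmd1 true; cmd2 outcome=parsed-as-if; cmd3

echo "$outcome"
```

Without the aliases, the same line would instead try to run commands named
cmd1, cmd2 and cmd3, which is why the expansion cannot be deferred until
after the tree has been built.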

  | That seems like the crux of the issue. If the command substitution is part
  | of a shell function definition, you only want to expand aliases the `first
  | time' -- at the time you parse the shell function.

Yes, aliases are always lexical.   The function is a tree from the parse,
there's nothing in the tree, ever, that has anything whatever to do with
aliases (aside from the "alias" and "unalias" commands themselves of course,
but those are just ordinary commands, and treated that way).

  | You can execute arbitrary commands, including alias definitions,
  | between the time the shell function is defined and the time it's executed.

Yes.   Too late for anything in the function (anything at all).
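
A small sketch of what "too late" means for a plain command in a function
body (not for the command substitution case under discussion, where shells
differ). The alias name `greet` and the function name `show` are invented for
the example; it assumes a shell that expands aliases while parsing the
function definition, as bash documents for function bodies.

```shell
# bash only expands aliases in scripts when this option is set.
[ -n "$BASH_VERSION" ] && shopt -s expand_aliases

alias greet='echo hello'

# The alias is expanded here, while the function definition is parsed.
show() { greet; }

# Redefining the alias afterwards does not touch the stored function body.
alias greet='echo goodbye'

show    # prints: hello
```

The function keeps the expansion that was current when it was defined, so
later `alias` or `unalias` commands have no effect on it.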

  | POSIX requires that aliases in command substitutions be expanded when
  | the function definition is parsed, not when the command substitution
  | is finally executed,

Yes, that's what we do - it is a consequence of the "parse everything when
it is first seen, and never again" philosophy.

  | but bash has not traditionally done it that way. That's where the backwards
  | compatibility issues come in.

Oh.  I see.   Does anyone really care though?   It is kind of hard to imagine
anyone being perverse enough to use an alias in a function anywhere, let
alone depend upon it changing between executions of a function when the alias
definition has been altered.   Anyone doing anything like that deserves to
have their code break IMO.

  | It's not true that `no commands can be executed'. The alias can be altered
  | by commands between a shell function definition and its execution.

Yes, I had forgotten that case.

  | I considered keeping both the original text and the parse tree from the
  | parsed command substitution (well, a chain of them since you can have an
  | arbitrary number of command substitutions in a word).

Yes, that's what we do.

  | It's difficult, given bash's internal structure, to preserve the
  | original text

Same for us, probably for different reasons though.

  | -- as opposed to the reconstituted text --

How accurately can you reconstitute?   That is, can you maintain the
difference between $(a b) and $( a b ) for example ?   How about $(a  b) ?

  | Other shells (bosh, mksh) also recreate the text of a command
  | substitution from the parsed commands.

Interesting, though I am not surprised in a way.   Actually needing the
command sub in textual form (aside from perhaps showing in output from
jobs or something, though I doubt it is ever needed even there - and for
that kind of thing, a good approximation is just fine in any case) is
very very rare in sh - the end word of a here-doc redirection operator
might be the only case.

  |
  | > It seems to me that the way you're doing it now would also break:
  | > 
  | >   alias x=y
  | >   cat <<$(x)
  | >   whatever
  | >   $(x)

  | This works in bash default mode because aliases aren't expanded while the
  | command is parsed.

I don't know whether I was in "default mode" or not (all I did was run 
"bash-5.2" which is the name where I stored the devel

Re: Incorrect alias expansion within command substitution

2022-02-01 Thread Chet Ramey

On 2/1/22 1:47 PM, Robert Elz wrote:

> Date: Tue, 1 Feb 2022 10:23:53 -0500
> From: Chet Ramey
> Message-ID: <1e33f111-b9ff-2d70-adf8-934906321...@case.edu>
>
>   | Historically, bash (and ksh93) has favored the former. Just about all the
>   | other shells claiming some sort of POSIX conformance favor the latter (all
>   | the ash-based shells, yash, mksh).
>
> For ash-based shells, at least, that's because the text in the command
> substitution is not parsed twice - it is parsed (nb: just parsed, not expanded
> or evaluated) once while scanning it to find the closing ')', and the results
> of that parse are simply kept for later when it is to be used.


When you say "just parsed," when are aliases expanded? Are they expanded
while scanning the command substitution to find the closing `)' but not
part of the text that results? What form are the `results' kept in (I
assume a parse tree similar to a shell function)? Do they include expanded
aliases or is that deferred?


> Since it isn't possible for anything to change the command substitution text
> once scanning it has started, the results of a parse of its text done later
> must be the same as that done while scanning (that is, no commands can
> possibly be executed between when that scan starts, and when the command
> substitution is later expanded, assuming it ever is) so doing the parse all
> over again is something of a waste of time.


That seems like the crux of the issue. If the command substitution is part
of a shell function definition, you only want to expand aliases the `first
time' -- at the time you parse the shell function. You can execute
arbitrary commands, including alias definitions, between the time the shell
function is defined and the time it's executed. POSIX requires that aliases
in command substitutions be expanded when the function definition is
parsed, not when the command substitution is finally executed, but bash has
not traditionally done it that way. That's where the backwards
compatibility issues come in.



> Since the command substitution
> might never be expanded however, some bookkeeping is needed to make sure that
> the resulting parse tree is discarded if it isn't used - but compared to the
> work needed for an additional parse, that's trivial.


Yes, you can carry it around as part of a data structure.


> But if you're sure that the double parse method is needed, the simple fix
> would be to save the original command substitution text, not the alias expanded
> form of it, in the word being constructed, and simply continue to expand the
> alias both times (as above, since no commands can get executed between the
> first and second parse, assuming the 2nd happens, the alias cannot be altered,
> and it will expand the 2nd time the same way it did the first).


It's not true that `no commands can be executed'. The alias can be altered
by commands between a shell function definition and its execution.

I considered keeping both the original text and the parse tree from the
parsed command substitution (well, a chain of them since you can have an
arbitrary number of command substitutions in a word). It's difficult,
given bash's internal structure, to preserve the original text -- as
opposed to the reconstituted text -- so I didn't do that. Other shells
(bosh, mksh) also recreate the text of a command substitution from the
parsed commands.


> It seems to me that the way you're doing it now would also break:
>
> alias x=y
> cat <<$(x)
> whatever
> $(x)
>
> which is supposed to work, and indeed, in the bash 5.2 development branch,
> it does break (the terminator after the here-doc text needs to be $(y) to work).


This works in bash default mode because aliases aren't expanded while the
command is parsed. The delimiter ends up being `$(x)'. Since you're
required to check the line read for the terminating delimiter before doing
anything else, the delimiter has to be $(x) to make it work.

It works with ksh93 as well, but every other shell produces an error of
some sort (including the NetBSD sh, unless you've changed something in the
couple of months since I last built it).

I can't see it working in any shell's posix mode if posix requires aliases
to be expanded while reading the WORD containing the command substitution
that is the here-doc delimiter. If that's the case, you have an alias
expansion mismatch, since I don't believe you're permitted to perform
alias expansion on the lines of the here-document as you read them, and
the resolution to bug 1036 makes it clear -- to me, at least -- that you
check for the delimiter before doing anything to the line.

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/