Re: Arithmetic + array allows for code injection
On Fri, May 30, 2014 at 09:28:13PM -0500, Dan Douglas wrote: > The problem is most people don't realize how "variables" are evaluated. > Any time the shell needs to reference a variable, it takes a string > like: "arr[$foo]" and, if there's an index, the string within the index > gets processed for expansions. The arithmetic evaluator is no exception. I'm trying to understand this, but it's not clear to me yet. imadev:~$ x='$(date)' imadev:~$ : $(($x)) bash: $(date): syntax error: operand expected (error token is "$(date)") That looks OK. imadev:~$ : $((a[$x])) bash: Mon Jun 2 08:06:39 EDT 2014: syntax error in expression (error token is "Jun 2 08:06:39 EDT 2014") There's the code-injection problem that started the thread. imadev:~$ : ${a[$x]} bash: $(date): syntax error: operand expected (error token is "$(date)") That also looks OK. Why is there no code injection in the last example? There is an index. According to your paragraph, "... the string within the index gets processed for expansions. The arithmetic evaluator is no exception." If that's true, then I would have expected both the second and third examples to behave the same way. > The correct way to write such a thing is to let the variable evaluation > expand the parameter from the arithmetic evaluator: > > $ bash -c 'typeset -A a; i=\" a[$i]=1+1; echo "$((a[\$i]))"' > 2 > $ ksh -c 'typeset -A a; i=\" a[$i]=1+1; echo "$((a[\$i]))"' > 2 > $ zsh -c 'typeset -A a; i=\" a[$i]=1+1; echo "$((a[\$i]))"' > 2 So you're saying that instead of $((a[$x])) one should use $((a[\$x])) ? imadev:~$ declare -A a imadev:~$ x='$(date >&2)' imadev:~$ a[$x]=5 imadev:~$ echo $((1+a[$x])) Mon Jun 2 08:19:12 EDT 2014 bash: a: bad array subscript 1 imadev:~$ echo $((1+a[\$x])) 6 imadev:~$ echo "1+${a[$x]}" 1+5 imadev:~$ echo "1+${a[\$x]}" 1+ I still don't understand. But it seems clear that putting the indexed parameter expansion inside $((...)) changes things. In some mysterious way.
Re: Arithmetic + array allows for code injection
Greg Wooledge writes: > imadev:~$ : $((a[$x])) > bash: Mon Jun 2 08:06:39 EDT 2014: syntax error in expression (error token is > "Jun 2 08:06:39 EDT 2014") > > There's the code-injection problem that started the thread. Here the index is '$(date)'. *Note (bash) Arithmetic Expansion:: ... All tokens in the expression undergo parameter and variable expansion, command substitution, and quote removal. The result is treated as the arithmetic expression to be evaluated. > imadev:~$ : ${a[$x]} > bash: $(date): syntax error: operand expected (error token is "$(date)") > > That also looks OK. Here the index is '$x'. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: Arithmetic + array allows for code injection
On 6/2/14, 8:21 AM, Greg Wooledge wrote: > On Fri, May 30, 2014 at 09:28:13PM -0500, Dan Douglas wrote: >> The problem is most people don't realize how "variables" are evaluated. >> Any time the shell needs to reference a variable, it takes a string >> like: "arr[$foo]" and, if there's an index, the string within the index >> gets processed for expansions. The arithmetic evaluator is no exception. > > I'm trying to understand this, but it's not clear to me yet. > > imadev:~$ x='$(date)' > imadev:~$ : $(($x)) > bash: $(date): syntax error: operand expected (error token is "$(date)") > > That looks OK. > > imadev:~$ : $((a[$x])) > bash: Mon Jun 2 08:06:39 EDT 2014: syntax error in expression (error token is > "Jun 2 08:06:39 EDT 2014") > > There's the code-injection problem that started the thread. > > imadev:~$ : ${a[$x]} > bash: $(date): syntax error: operand expected (error token is "$(date)") > > That also looks OK. > > Why is there no code injection in the last example? There is an index. > According to your paragraph, "... the string within the index gets > processed for expansions. The arithmetic evaluator is no exception." The arithmetic evaluator is, in fact, an exception. That, combined with the expansions that happen before the arithmetic evaluator gets hold of the expression -- and it is an expression -- leads to the difference. In the first case, the arithmetic evaluator sees `a[$(date)]' as the expression after parameter expansion is performed: "All tokens in the expression undergo parameter and variable expansion, command substitution, and quote removal. The result is treated as the arithmetic expression to be evaluated." Since that expression looks like a variable expansion, the following sentence in the description of arithmetic evaluation is applicable: "Within an expression, shell variables may also be referenced by name without using the parameter expansion syntax. A shell variable that is null or unset evaluates to 0 when referenced by name without using the parameter expansion syntax." The a[$(date)] is identified as an array index, so the $(date) is expanded like any other index, and evaluated as an expression. This is what lets you use things like 'x+1' and 'x[y+1]' in arithmetic expansions. The parameter expansion example (${a[$x]}) doesn't undergo that `extra' expansion. The index that ends up being passed to the evaluator is `$x', which is expanded to `$(date)'. That is treated as an expression and evaluation fails. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: Arithmetic + array allows for code injection
On Mon, Jun 2, 2014 at 4:44 PM, Chet Ramey wrote: > On 6/2/14, 8:21 AM, Greg Wooledge wrote: > > On Fri, May 30, 2014 at 09:28:13PM -0500, Dan Douglas wrote: > >> The problem is most people don't realize how "variables" are evaluated. > >> Any time the shell needs to reference a variable, it takes a string > >> like: "arr[$foo]" and, if there's an index, the string within the index > >> gets processed for expansions. The arithmetic evaluator is no exception. > > > > I'm trying to understand this, but it's not clear to me yet. > > > > imadev:~$ x='$(date)' > > imadev:~$ : $(($x)) > > bash: $(date): syntax error: operand expected (error token is "$(date)") > > > > That looks OK. > > > > imadev:~$ : $((a[$x])) > > bash: Mon Jun 2 08:06:39 EDT 2014: syntax error in expression (error > token is "Jun 2 08:06:39 EDT 2014") > > > > There's the code-injection problem that started the thread. > > > > imadev:~$ : ${a[$x]} > > bash: $(date): syntax error: operand expected (error token is "$(date)") > > > > That also looks OK. > > > > Why is there no code injection in the last example? There is an index. > > According to your paragraph, "... the string within the index gets > > processed for expansions. The arithmetic evaluator is no exception." > > The arithmetic evaluator is, in fact, an exception. That, combined with > the expansions that happen before the arithmetic evaluator gets hold of > the expression -- and it is an expression -- leads to the difference. > > In the first case, the arithmetic evaluator sees `a[$(date)]' as the > expression after parameter expansion is performed: > > "All tokens in the expression undergo parameter and variable expansion, > command substitution, and quote removal. The result is treated as the > arithmetic expression to be evaluated." > > Since that expression looks like a variable expansion, the following > sentence in the description of arithmetic evaluation is applicable: > > "Within an expression, shell variables may also be referenced by name > without using the parameter expansion syntax. A shell variable that > is null or unset evaluates to 0 when referenced by name without using the > parameter expansion syntax." > > The a[$(date)] is identified as an array index, so the $(date) is expanded > like any other index, and evaluated as an expression. > > This is what lets you use things like 'x+1' and 'x[y+1]' in arithmetic > expansions. > > The parameter expansion example (${a[$x]}) doesn't undergo that `extra' > expansion. The index that ends up being passed to the evaluator is `$x', > which is expanded to `$(date)'. That is treated as an expression and > evaluation fails. > > Chet > -- > ``The lyf so short, the craft so long to lerne.'' - Chaucer > ``Ars longa, vita brevis'' - Hippocrates > Chet Ramey, ITS, CWRUc...@case.edu > http://cnswww.cns.cwru.edu/~chet/ > Even if there is a perfectly good justification as to why this works, I still think this is a terribly broken feature of the language. The number of shell scripters, even experimented, that will realize that $((a["$i"])) makes code injection possible is probably very close to 0 while the first thing an script kid would do to create trouble is to embed $( ) in his strings.
Re: Arithmetic + array allows for code injection
On 6/2/14, 9:34 AM, Greg Wooledge wrote: > (One could argue that POSIX's wording doesn't require the command > substitution be done in a second pass AFTER the parameter expansion. > But apparently it has been interpreted this way.) Posix doesn't have arrays, and so doesn't concern itself with how array indices are expanded. It does, however, only require that variables whose value looks like an integer constant be expanded within expressions. Bash has never done that; the variable's value is treated as an expression. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: Arithmetic + array allows for code injection
On Mon, Jun 02, 2014 at 03:08:17PM +0200, Andreas Schwab wrote: > Greg Wooledge writes: > > > imadev:~$ : $((a[$x])) > > bash: Mon Jun 2 08:06:39 EDT 2014: syntax error in expression (error token > > is "Jun 2 08:06:39 EDT 2014") > > > > There's the code-injection problem that started the thread. > > Here the index is '$(date)'. > > *Note (bash) Arithmetic Expansion:: ... All tokens in the expression > undergo parameter and variable expansion, command substitution, and > quote removal. The result is treated as the arithmetic expression to be > evaluated. Ah. And this is copied almost verbatim from POSIX: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04 $((expression)) The expression shall be treated as if it were in double-quotes, except that a double-quote inside the expression is not treated specially. The shell shall expand all tokens in the expression for parameter expansion, command substitution, and quote removal. So the reason it acts this way is because POSIX said so. Now it starts to make sense! (One could argue that POSIX's wording doesn't require the command substitution be done in a second pass AFTER the parameter expansion. But apparently it has been interpreted this way.)
Re: Arithmetic + array allows for code injection
If you want to write robust scripts, don't use shell. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: Arithmetic + array allows for code injection
On Jun 2, 2014, at 9:34 AM, Greg Wooledge wrote: > On Mon, Jun 02, 2014 at 03:08:17PM +0200, Andreas Schwab wrote: >> Greg Wooledge writes: >> >>> imadev:~$ : $((a[$x])) >>> bash: Mon Jun 2 08:06:39 EDT 2014: syntax error in expression (error token >>> is "Jun 2 08:06:39 EDT 2014") >>> >>> There's the code-injection problem that started the thread. >> >> Here the index is '$(date)'. >> >> *Note (bash) Arithmetic Expansion:: ... All tokens in the expression >> undergo parameter and variable expansion, command substitution, and >> quote removal. The result is treated as the arithmetic expression to be >> evaluated. > > Ah. And this is copied almost verbatim from POSIX: > > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04 > > $((expression)) > > The expression shall be treated as if it were in double-quotes, except that a > double-quote inside the expression is not treated specially. The shell shall > expand all tokens in the expression for parameter expansion, command > substitution, and quote removal. > > So the reason it acts this way is because POSIX said so. Now it starts > to make sense! > > (One could argue that POSIX's wording doesn't require the command > substitution be done in a second pass AFTER the parameter expansion. > But apparently it has been interpreted this way.) As such, the bug is that the expansions triggered by $(( )) are IMPLICITLY re-evaluated by [ ]. To summarize, index='$( echo >&2 foo )' # Literal shell code should never be evaluated unless an 'eval' is involved. echo ${array[ $index ]} # [] expands $index, results in a literal that [] does not re-evaluate. echo $(( $index )) # (( )) expands $index, results in a literal that (( )) does not re-evaluate. echo $(( array[ $index ] )) # (( )) expands $index, results in a literal that [] DOES re-evaluate. Whether or not the documentation justifies the above behaviour, I think we can agree on the fact that no user will expect or even desire the behaviour of the latter, never mind that it is dangerous. As such, it certainly is a bug and should be corrected. I think it's fair to trust, as an author, that shell code will only be re-evaluated when there's an explicit eval, source, or similar statement declaring such behaviour. As a side note, expanding literal words as variables such as happens in ${array[ index ]} or the above code with index=foo is probably acceptable so long as the value that results from this expansion is treated purely as data (though bash does allow recursion on the expansion of literal variable names here). The point being that in my opinion, bash should as a matter of principle prohibit concealed code execution wherever possible; failing that it will be nearly impossible to write safe shell code - or at least trust that your shell code is safe. On Jun 2, 2014, at 10:28 AM, Andreas Schwab wrote: > If you want to write robust scripts, don't use shell. > > Andreas. What does this comment justify? Other than stopping all maintenance on bash today.
bug-bash@gnu.org
> From: Chet Ramey > > The two words don't seem me to be used interchangeably in the Bash > > manual page. "evaluate" is used only for obtaining the value of > > "arithmetic evaluations" and "conditional expressions", whereas > > "expand" is used for operations that are strictly textual > > replacements. > > I cribbed the language pretty closely from the Posix standard, which > uses `evaluates' in this fashion. It's your call, of course. But being trained as a mathematician, I pay attention to what the specific meanings of words are. And the Bash manual page uses "expand" and "evaluate" in the ways I've described in almost all places. If you don't care to keep the meanings of the two words distinct, feel free. Dale
bash segfaults on EOF-delimited [^D] 'here-documents' (instead of aborting command?) when PS2 is not a static value (PS3 might have problems too)
The version number of Bash. GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu) The hardware and operating system. Linux localhost 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux The compiler used to compile Bash. A description of the bug behaviour. A short script or ‘recipe’ which exercises the bug and may be used to reproduce it.(see below for bashbug and remaining report) Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALED$ uname output: Linux larnica 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Machine Type: x86_64-pc-linux-gnu Bash Version: 4.3 Patch Level: 11 Release Status: release Description: Seems that the shell crashes when the value of PS2 is set to something that is evaluated when displayed, and the user presses (usually by accident) CTRL+D (EOF) when entering extra lines, especially "here documents". It also seems that the current line must be EMPTY when Ctrl+D is pressed by accident, if there is (usually) keystrokes already entered, then the crash does not happen and it works as expected (which is probably why it has taken so long for me to figure out what was doing it). Now I understand that Ctrl+D is bound to also exit the shell however the two issues are the crashing (segmentation fault), and the continuity of behavior dependent upon the number of characters entered. I said (usually) earlier because there are some times (not sure why) that there are a few characters on the line, but I have not been able to reproduce this only two times, so I'm not sure what that's all about. Repeat-By: bash --norc --noprofile (also tried running from inside /usr/bin/env but it acts the same) ubuntu@localhost:~$ PS2='$(echo "enter something>")' ubuntu@localhost:~$ cat <<-EOF enter something>here's some text enter something>pressing ctrl+d on the next line enter something> bash: warning: here-document at line 30 delimited by end-of-file (wanted `EOF') Segmentation fault (core dumped) That's about it. I think the fact that I use 's AND "s in the evaluation also causes problems because I tried without the 's and had no problems, but it's to early to tell if it really is okay since I haven't tested that extensively. I also wonder if this has anything to do with PS3 not being evaluated at all when it is used for builtins like 'select': PS3="$(echo something)" will print out: ubuntu@localhost:~$ select i in 1 2; do true; done 1. selection1 2. selection2 $(echo something) Fix: The best workaround of course is to use a static value for PS2, like the default.