On Fri, Nov 21, 2014 at 06:21:30AM +0800, konsolebox wrote: > On Thu, Nov 20, 2014 at 4:47 AM, Greg Wooledge <wool...@eeg.ccf.org> wrote: > > Passing the name of a variable has NEVER worked in ANY function for ANY > > purpose in bash. > > I hope you're actually not generalizing but you seriously have to reconsider > that.
Well, I did overstate the problem slightly. There are some scenarios where you can successfully pass the name of a variable as an argument to a function, but they are limited, and you must handle everything precisely right. None of the obvious or elegant approaches will work without extra safety checks and byzantine constructions. > This is just one: > https://sourceforge.net/p/playshell/code/ci/master/tree/source/filter.sh OK, right off the bat, you are using an undocumented HACK: function filter { local REF="$1[@]" X I don't remember whether we ever got official word from Chet on whether this syntax would continue to be supported in the future. It seems to me more of an accidental misfeature that people have latched onto because there is no other way to get things done. (Besides eval.) The fact that one must use this undocumented hack (which may or may not magically vanish the next time Chet tightens up the grammar) in order to retrieve values from an array whose name is passed as an argument is already one strike against it. > "eval" is not synonymous to security holes. There are many ways to sanitize > it > (if necessarily needed). Namerefs (declare -n in 4.3) and indirect variable > references for example can avoid referencing invalid variable names simply by > checking them with a pattern. In my previous message I demonstrated why declare -n is just as insecure as eval. In fact it's worse, because: 1) There's always one name that you cannot pass as an argument (the same name that the ref is planning to use). 2) It gives the ILLUSION that it works like ksh's nameref, and thus the ILLUSION of safety, which lulls the user into thinking he won't have to worry about security. > > This is precisely why I tell people not to try to write "libraries" of > > reusable code in bash. You just can't do it. It's pointless to try. > > Sorry but it's not. Or at least most of the time it's not always. There are some simple things you can write reusably, without doing backflips. For example, I have a function that "returns" a random number: # Pick unbiased random number from 0 to N-1 ($1 = N) # Returns value in global variable r. rand() { local max=$((32768 / $1 * $1)) while (( (r=$RANDOM) >= max )); do :; done r=$(( r % $1 )) } This is right around the complexity limit that I would consider reasonable for a bash function. Anything beyond this level, and you're really getting into hot water. > On Thu, Nov 20, 2014 at 5:14 AM, Greg Wooledge <wool...@eeg.ccf.org> wrote: > > This is because you DO NOT WRITE LIBRARIES IN BASH. You can't. This > > issue is just one of so very many pitfalls. It's an intractable problem. > > Granting that it's always the library writer that should care about how the > functions should be called. The scripter also has the responsibility of > making > sure that the type of argument the script is passing to the function is > correct. There's another concern in shell scripts, though, which is absent in many other languages: arbitrary code execution. It's not just a matter of whether the library function gives the correct results, or a reasonable error message if you pass the "wrong thing" to it. If you try to implement a complex function in bash (or almost any other shell), the odds are the user can BREAK THINGS (cause arbitrary code execution) by passing the "wrong thing" as an argument. There is a whole extra class of issues that one must worry about, and it's REALLY difficult to write a function that not only performs the desired task correctly, but also does so without introducing new attack vectors against the operating system. I gave this example in my previous message, but it bears repeating: blackbox() { declare -n arg=$1 printf '%s\n' "$arg" } This looks completely innocuous, but it's NOT. declare -n is a trap for the poor unwary programmer. imadev:~$ blackbox 'a[b=$(date)]' bash: b=Fri Nov 21 15:55:58 EST 2014: syntax error in expression (error token is "Nov 21 15:55:58 EST 2014") If you want to use declare -n, you have to write several lines of safety checking every time: blackbox() { if [[ $1 = arg ]]; then echo "blackbox: 'arg' is not permitted, because reasons" >&2 return 1 elif [[ $1 != [[:alpha:]_]*([[:alnum:]_]) ]]; then echo "blackbox: argument must be a non-indexed identifier" >&2 return 1 fi declare -n arg=$1 ... } There's simply no other way to be secure. Every feature that bash has which even comes close to this has the same issues. imadev:~$ var='a[b=$(date)]' imadev:~$ printf -v "$var" %s foo bash: b=Fri Nov 21 16:00:51 EST 2014: syntax error in expression (error token is "Nov 21 16:00:51 EST 2014") So, printf -v isn't safe either. You have to sanitize the argument of -v for exactly the same reason you have to sanitize the variable pointed to by declare -n, or any variable that you expand with eval. If you don't, you introduce an opportunity for arbitrary code execution. So, if you want to write a function that populates a (scalar!) variable of the caller's choosing, here's one way to do it: # Add a bunch of integers and store the result # varname may not be __i or __sum addemup() { (($#)) || { echo "usage: addemup varname [int ...]" >&2; return 1; } # optionally check for $1 = __i or __sum if [[ $1 != [[:alpha:]_]*([[:alnum:]_]) ]]; then echo "addemup: varname must be a non-indexed identifier" >&2; return 1 fi local __sum=0 __i for __i in "${@:2}"; do ((__sum+=__i)); done eval "$1=\$__sum" } And sure enough, you mustn't pass __i or __sum to it: imadev:~$ addemup __i 1 3 imadev:~$ echo $__i The problem is bash has no way to refer to a variable in the caller's scope. As soon as you have any local variable, it shadows the caller's variable of the same name. The only way around this limitation that I have EVER heard of is described by Freddy Vulto at http://www.fvue.nl/wiki/Bash:_Passing_variables_by_reference I have read that page a few times, and I STILL don't understand how it works. Nevertheless, here is a solution using Freddy's helper function: # Assign variable one scope above the caller # Usage: local "$1" && upvar $1 "value(s)" # Param: $1 Variable name to assign value to # Param: $* Value(s) to assign. If multiple values, an array is # assigned, otherwise a single value is assigned. # NOTE: For assigning multiple variables, use 'upvars'. Do NOT # use multiple 'upvar' calls, since one 'upvar' call might # reassign a variable to be used by another 'upvar' call. # Example: # # f() { local b; g b; echo $b; } # g() { local "$1" && upvar $1 bar; } # f # Ok: b=bar # upvar() { if unset -v "$1"; then # Unset & validate varname if (( $# == 2 )); then eval $1=\"\$2\" # Return single value else eval $1=\(\"\${@:2}\"\) # Return array fi fi } addemup() { (($#)) || { echo "usage: addemup varname [int ...]" >&2; return 1; } if [[ $1 != [[:alpha:]_]*([[:alnum:]_]) ]]; then echo "addemup: varname must be a non-indexed identifier" >&2; return 1 fi local sum=0 i for i in "${@:2}"; do ((sum+=i)); done local "$1" && upvar "$1" "$sum" } It's pure voodoo! You need a dozen lines of code, including a function that no normal person can understand that comes with its own warning label, just to put a value into a variable safely! Why would you do this to yourself? Just switch to a real programming language.