[issue39692] Subprocess using list vs string

2020-02-19 Thread Niklas Smedemark-Margulies


New submission from Niklas Smedemark-Margulies:

Most (all?) of the functions in subprocess (run, Popen, etc.) are supposed to 
accept either a list or a string, but the behavior when passing a list together 
with shell=True differs (and appears to be wrong).

For example, see below - invoking the command "exit 1" should give a return 
code of 1, but when using a list, the return code is 0.


```
>>> import subprocess


>>> # Example using run
>>> res1 = subprocess.run('exit 1', shell=True)
>>> res1.returncode
1
>>> res2 = subprocess.run('exit 1'.split(), shell=True)
>>> res2.returncode
0


>>> # Example using Popen
>>> p1 = subprocess.Popen('exit 1', shell=True)
>>> p1.communicate()
(None, None)
>>> p1.returncode
1
>>> p2 = subprocess.Popen('exit 1'.split(), shell=True)
>>> p2.communicate()
(None, None)
>>> p2.returncode
0
```

--
messages: 362294
nosy: nik-sm
priority: normal
severity: normal
status: open
title: Subprocess using list vs string
type: behavior
versions: Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue39692>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39692] Subprocess using list vs string

2020-02-20 Thread Niklas Smedemark-Margulies


Niklas Smedemark-Margulies added the comment:

Thanks very much for getting back to me so quickly, and for identifying the 
reason for the difference in behavior.

Sorry to harp on a relatively small behavior, but it cost me a few hours and it 
might cause confusion for others as well.

It still seems like an oversight that the body of a program invoked by `bash 
-c` would not be quoted. Consider the following two examples:

$ bash -c echo my critical data > file.txt
$ cat file.txt 

$ # My data was lost!

Or again in Python:

>>> import subprocess
>>> res1 = subprocess.run(['echo', 'my', 'critical', 'data', '>', 'file.txt'],
...                       shell=True, capture_output=True)
>>> res1.returncode
0
>>> exit()
$ cat file.txt
cat: file.txt: No such file or directory
$ # The file is not even created!



I know that the subsequent args are stored as bash arguments to the first 
executable/quoted program, for example:

$ bash -c 'echo $0' foo
foo

or

>>> res1 = subprocess.run(['echo $0', 'foo'], shell=True, capture_output=True)
>>> res1.stdout
b'foo\n'


However, it seems strange/wrong to invoke an executable via "bash -c executable 
arg1 arg2", rather than just "executable arg1 arg2"! In other words, combining 
`shell=True` with a sequence of args appears to behave surprisingly, if not 
incorrectly.
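
To make the mechanism concrete, here is a minimal sketch of the original "exit 1" case. It assumes a POSIX system where `shell=True` prepends `/bin/sh -c` to the args list (Windows behaves differently), and it spells that argv out by hand for comparison:

```python
import subprocess

# Assumption: POSIX, where shell=True runs ['/bin/sh', '-c'] + args.
# List form: only args[0] ('exit') is the command string; '1' becomes $0.
via_shell_flag = subprocess.run(['exit', '1'], shell=True)
explicit_argv = subprocess.run(['/bin/sh', '-c', 'exit', '1'])

# String form: the whole string 'exit 1' is the command string.
string_form = subprocess.run('exit 1', shell=True)
explicit_string = subprocess.run(['/bin/sh', '-c', 'exit 1'])

print(via_shell_flag.returncode, explicit_argv.returncode)   # 0 0
print(string_form.returncode, explicit_string.returncode)    # 1 1
```

In the list form the shell runs a bare `exit`, which reports the status of the last command (0), so the `1` never reaches it as an argument.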


---


Here's the only part of the docs I could find that discusses the interaction 
between `shell=True` and args:
"""
The shell argument (which defaults to False) specifies whether to use the shell 
as the program to execute. If shell is True, it is recommended to pass args as 
a string rather than as a sequence.
"""



I think there are two cases here:

1) If there exist use cases for setting `shell=True` and doing "bash -c 
my_executable arg2 arg3", then the documentation should say something like the 
following:
"""
Using `shell=True` invokes the sequence of args via `bash -c`. In this case, 
the first argument MUST be an executable, and the subsequent arguments will be 
stored as bash parameters for that executable (`$0`, `$1`, etc).
"""

2) The body of the program invoked with `bash -c` should always be quoted. In 
this case, there should either be a code fix to quote the body, or a 
`ValueError` when `shell=True` and args is a sequence.
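
As a rough illustration of option 2, here is a sketch of what such a guard could look like in user code today. `run_shell` is a hypothetical wrapper, not part of subprocess, and `shlex.join` requires Python 3.8 or later:

```python
import shlex
import subprocess


def run_shell(command, **kwargs):
    """Hypothetical helper: run `command` via the shell, rejecting sequences."""
    if not isinstance(command, str):
        raise ValueError(
            "with shell=True, pass the command as a single string; "
            "got %r" % (command,)
        )
    return subprocess.run(command, shell=True, **kwargs)


# Callers that start from a list of words can quote and join them first.
words = ['echo', 'my critical', 'data']
result = run_shell(shlex.join(words), capture_output=True, text=True)
print(result.stdout)  # my critical data
```

Note that `shlex.join` quotes every word, so shell syntax such as `>` would be passed to the command as a literal argument rather than acting as a redirection.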


How does this sound from your perspective?

--




[issue39692] Subprocess using list vs string

2020-02-20 Thread Niklas Smedemark-Margulies


Niklas Smedemark-Margulies added the comment:

Good point - the phrasing I suggested there is not accurate, and there is more 
complicated behavior available than simply specifying a single executable. 
Here's the bash manual's info about "-c" flag:

"""
If the -c option is present, then commands are read from the first non-option 
argument command_string.   If  there  are arguments  after  the  
command_string, the first argument is assigned to $0 and any remaining 
arguments are assigned to the positional parameters.  The assignment to $0 sets 
the name of the shell, which is used in warning  and  error  messages.
"""

So the command_string provided (the first word or the first quoted expression) 
is interpreted as a shell program, and this program is invoked with the 
remaining words as its arguments. As you point out, this command_string can be 
a terminal expression like `true`, a function definition like you provided, an 
executable, or other possibilities, but in any case it will be executed with 
the remaining args.

(This also matches how the library code assigns `executable`: 
https://github.com/python/cpython/blob/master/Lib/subprocess.py#L1707)

As you say, simply slapping quotes around all the args produces a subtle 
difference: the arg in the position of `$0` is used as an actual positional 
parameter in one case, and as the shell name in the other case:

$ bash -c 'f() { printf "%s\n" "$@"; }; f "$@"' - foo bar baz
foo
bar
baz
$ bash -c 'f() { printf "%s\n" "$@"; }; f "$@" - foo bar baz'
-
foo
bar
baz

(Unless I am misunderstanding the behavior here).
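
The same comparison can be sketched from Python. This assumes a POSIX system where `shell=True` uses `/bin/sh`, and that the system shell accepts the `f() { ...; }` function syntax used above:

```python
import subprocess

# Raw string so the shell's printf receives a literal backslash-n.
program = r'f() { printf "%s\n" "$@"; }; f "$@"'

# List form: '-' lands in $0 (the shell name), so only foo/bar/baz
# become positional parameters.
as_list = subprocess.run([program, '-', 'foo', 'bar', 'baz'],
                         shell=True, capture_output=True, text=True)

# String form: everything is one command string, "$@" expands to nothing,
# and '-' is passed to f as an ordinary word.
as_string = subprocess.run(program + ' - foo bar baz',
                           shell=True, capture_output=True, text=True)

print(as_list.stdout.split())    # ['foo', 'bar', 'baz']
print(as_string.stdout.split())  # ['-', 'foo', 'bar', 'baz']
```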

It's a bit frustrating that this approach would not work to simplify the usage, 
but (assuming my explanation is correct) I concede that code might certainly be 
depending on this behavior and setting the shell name with args[1] (and they 
would not want this to become a positional parameter instead).


Improving on my first attempt, here's another possible phrasing for the docs:
"""
Using `shell=True` invokes the sequence of args via `<shell> -c`, where `<shell>` 
is the chosen system shell (described elsewhere on this page). In this case, 
the item at args[0] is a shell program that will be invoked with the subsequent 
args. The item at args[1] will be stored in the shell variable `$0` and used 
as the name of the shell. The subsequent items at args[2:] will be stored as 
shell parameters (`$1`, `$2`, etc.) and available as positional parameters (e.g. 
using `echo $@`).
"""

I would certainly be happy to defer on giving a precise and thorough statement 
for the docs, but clarifying/highlighting this behavior definitely seems useful.

Thanks again

--
