Documentation issue

2017-10-26 Thread Eli Barzilay
Bash surprised me with the behavior mentioned here:

https://stackoverflow.com/questions/15897473

This can be pretty bad in that it's very unexpected (see the comments).
Also, the surprise can be triggered without nullglob as well:

$ foo=(a b c)
$ touch foo0
$ unset foo[0]
$ echo ${foo[*]}
a b c

The thing is that AFAICT, there is no mention of this pitfall in the man
page...  It would be nice to mention quoting in at least the `unset`
description, and possibly under `nullglob` too, since it makes it easier
to run into this problem.

I grepped through the bash sources, and even there I found a few unsafe
uses:

grep -r 'unset[^a-z"'\'']*\[' examples tests

so this is clearly something that is not well-known enough.
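
For reference, here is a minimal sketch of the difference quoting makes.
It assumes a throwaway directory (created with mktemp), and the variable
names `unquoted` and `quoted` are only there for illustration:

```bash
#!/usr/bin/env bash
# Work in a scratch directory so the stray file cannot affect anything else.
cd "$(mktemp -d)" || exit 1

foo=(a b c)
touch foo0             # a file whose name matches the glob pattern foo[0]

unset foo[0]           # unquoted: the word expands to "foo0", so this runs "unset foo0"
unquoted="${foo[*]}"   # the array is untouched

unset 'foo[0]'         # quoted: unset receives the literal word foo[0]
quoted="${foo[*]}"     # element 0 is gone

echo "after unquoted unset: $unquoted"
echo "after quoted unset:   $quoted"
```

With the file present, only the quoted form removes the element.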

-- 
   ((x=>x(x))(x=>x(x)))  Eli Barzilay:
   http://barzilay.org/  Maze is Life!



Re: Documentation issue

2017-10-26 Thread Pierre Gaston
On Thu, Oct 26, 2017 at 8:18 AM, Eli Barzilay  wrote:

> Bash surprised me with the behavior mentioned here:
>
> https://stackoverflow.com/questions/15897473
>
> This can be pretty bad in that it's very unexpected (see the comments).
> Also, the surprise can be triggered without nullglob as well:
>
> $ foo=(a b c)
> $ touch foo0
> $ unset foo[0]
> $ echo ${foo[*]}
> a b c
>
> The thing is that AFAICT, there is no mention of this pitfall in the man
> page...  It would be nice to mention using quotes in at least the
> `unset` description, and possibly also about `nullglob` too since it
> makes it easier to run into this problem.
>
> I grepped through the bash sources, and even there I found a few unsafe
> uses:
>
> grep -r 'unset[^a-z"'\'']*\[' examples tests
>
> so this is clearly something that is not well-known enough.
>
> --
>    ((x=>x(x))(x=>x(x)))  Eli Barzilay:
>    http://barzilay.org/  Maze is Life!
>
>
I think it's even more likely to happen with, e.g., `read array[i]`.

There is a large number of pitfalls in bash
(http://mywiki.wooledge.org/BashPitfalls) that most people ignore.
I'm not sure where to rank this one in the priorities of the ones that
would need to be mentioned in the manual.
Maybe one could create a separate man/info page that the manual could
reference?
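
A rough sketch of the `read` case, assuming a scratch directory and a
file that happens to be named `arri` (the array, the index variable and
the result variables are all made up for the illustration):

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1

arr=(x y z)
i=1
touch arri                    # matches the glob pattern arr[i]

read arr[i] <<< "new"         # unquoted: expands to "read arri"
after_unquoted="${arr[*]}"    # array unchanged
stray="$arri"                 # the input landed in an unrelated variable

read 'arr[i]' <<< "new"       # quoted: read sees arr[i]; the subscript i evaluates to 1
after_quoted="${arr[*]}"

echo "unquoted: $after_unquoted (stray variable arri=$stray)"
echo "quoted:   $after_quoted"
```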


Re: Documentation issue

2017-10-26 Thread Chet Ramey
On 10/26/17 1:18 AM, Eli Barzilay wrote:
> Bash surprised me with the behavior mentioned here:
> 
> https://stackoverflow.com/questions/15897473
> 
> This can be pretty bad in that it's very unexpected (see the comments).

I'm not sure why this is a surprise. Pathname expansion (globbing) is one
of the word expansions performed before a simple command is executed. The
`unset' builtin is no different.
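
To illustrate, here is a small sketch of what word a command actually
receives in each situation. It uses `set --` only as a convenient way to
capture the expanded words; the directory and variable names are
assumptions for the example:

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1      # start in an empty scratch directory

set -- foo[0]
no_match=$1                      # nothing matches: the pattern is passed through as-is

shopt -s nullglob
set -- foo[0]
nullglob_count=$#                # with nullglob the unmatched word is removed entirely
shopt -u nullglob

touch foo0
set -- foo[0]
match=$1                         # a matching file exists: the word becomes its name

set -- 'foo[0]'
quoted=$1                        # quoting suppresses pathname expansion

echo "no match: $no_match, nullglob args: $nullglob_count, match: $match, quoted: $quoted"
```

So the unquoted form only works when nothing in the current directory
matches and nullglob is off.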


> The thing is that AFAICT, there is no mention of this pitfall in the man
> page...  

"The  unset  builtin  is  used to destroy arrays.  unset name[subscript]
destroys the array element at index subscript.  Negative subscripts  to
indexed  arrays are interpreted as described above.  Care must be taken
to avoid unwanted side effects caused  by  pathname  expansion."

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: ctrl-w oddity on bash-4.4

2017-10-26 Thread Chet Ramey
On 10/26/17 12:21 AM, Robert Elz wrote:
> Date:        Wed, 25 Oct 2017 10:45:11 -0400
> From:        Chet Ramey
> Message-ID:  <6751ad10-cccb-0467-a751-c5be8e745...@case.edu>
> 
>   | If you read the discussion in the thread I pointed to last night, `real'
>   | vi supposedly does this kind of thing. I'm not enough of a vi user to
>   | say one way or the other.
> 
> In real vi, ^W (word kill) only works at all on text you have currently
> typed in insert mode, there is no concept of moving somewhere, entering
> insert mode, and then using ^W to delete backwards, that would be a
> totally foreign concept to a vi user.

OK. Posix doesn't make that distinction. If you're in insert mode, ^W
deletes a `word'. I assume it's more trying to emulate the behavior of
the tty driver than `real' vi. However, the word boundary characters are
(again, I assume) more like `real' vi than the ones the tty driver uses.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Documentation issue

2017-10-26 Thread Eli Barzilay
Pierre Gaston:
> I think it's even more likely to happen with eg: read array[i]

Maybe, but see below.

> There is a large number of pitfalls in bash
> (http://mywiki.wooledge.org/BashPitfalls) that most people ignore.

And it doesn't even mention the unset problem, I think.


On Thu, Oct 26, 2017 at 8:56 AM, Chet Ramey  wrote:
> On 10/26/17 1:18 AM, Eli Barzilay wrote:
>> Bash surprised me with the behavior mentioned here:
>>
>> https://stackoverflow.com/questions/15897473
>>
>> This can be pretty bad in that it's very unexpected (see the
>> comments).
>
> I'm not sure why this is a surprise. Pathname expansion (globbing) is
> one of the word expansions performed before a simple command is
> executed. The `unset' builtin is no different.

The last sentence shows why it's a surprise: it is confusing, since it
is easy to think that unset is special, similar to languages which have
something like `delete foo[1]` where the thing that follows delete is an
lvalue.


>> The thing is that AFAICT, there is no mention of this pitfall in the
>> man page...
>
> "The  unset  builtin  is  used to destroy arrays.  unset name[subscript]
> destroys the array element at index subscript.  Negative subscripts  to
> indexed  arrays are interpreted as described above.  Care must be taken
> to avoid unwanted side effects caused  by  pathname  expansion."

1. This is much more indirect than a simple "always quote array
   references";

2. I completely missed it since it's not in the place which describes
   unset.

(BTW, when I did dare for the first time to use unset on an array I did
go through the unset description, and got a vague impression that it's
kind of doing the special lvalue thing, so possibly the indirect warning
would have been sufficient to slap me back into the bash reality.)

-- 
   ((x=>x(x))(x=>x(x)))  Eli Barzilay:
   http://barzilay.org/  Maze is Life!



Re: Documentation issue

2017-10-26 Thread Chet Ramey
On 10/26/17 11:28 AM, Eli Barzilay wrote:

>> I'm not sure why this is a surprise. Pathname expansion (globbing) is
>> one of the word expansions performed before a simple command is
>> executed. The `unset' builtin is no different.
> 
> The last sentence shows why it's a surprise: it is confusing, since it
> is easy to think that unset is special, similar to languages which have
> something like `delete foo[1]` where the thing that follows delete is an
> lvalue.

I understand. There are plenty of misconceptions out there. But the bash
documentation has never implied that `unset' is special in that way, and
it's not the man page's place to say everything the shell is not.

>>> The thing is that AFAICT, there is no mention of this pitfall in the
>>> man page...
>>
>> "The  unset  builtin  is  used to destroy arrays.  unset name[subscript]
>> destroys the array element at index subscript.  Negative subscripts  to
>> indexed  arrays are interpreted as described above.  Care must be taken
>> to avoid unwanted side effects caused  by  pathname  expansion."
> 
> 1. This is much more indirect than a simple "always quote array
>    references";

Because it's much more general than a blanket statement like that, and the
man page isn't the place for those statements. That's the job for a shell
programming guide, of which there are plenty.

> 
> 2. I completely missed it since it's not in the place which describes
>    unset.
> 
> (BTW, when I did dare for the first time to use unset on an array I did
> go through the unset description, and got a vague impression that it's
> kind of doing the special lvalue thing, so possibly the indirect warning
> would have been sufficient to slap me back into the bash reality.)

It's more of a general statement about arrays, though it appears in the
paragraph that discusses unset, so it's in the man page section on arrays.
You have to be careful about putting the same information in too many
different places -- the man page is big enough already.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Documentation issue

2017-10-26 Thread Eli Barzilay
On Thu, Oct 26, 2017 at 2:02 PM, Chet Ramey  wrote:
>
> It's more of a general statement about arrays, though it appears in
> the paragraph that discusses unset, so it's in the man page section on
> arrays.  You have to be careful about putting the same information in
> too many different places -- the man page is big enough already.

I'm very aware of man page bloat and the fact that the bash page is very
long as is.  But (a) I think that the builtin section for each command
is the more important place for such things; and (b) what I'm suggesting
is just a short reminder sentence about the need for quoting unsets for
array elements.

To make it more concrete, I think that the following change will be
good:

1. Drop the current "Care must be taken ... the entire array." two
   sentences and replace them with some "See the unset builtin
   description below".

2. Add those two sentences to the unset builtin section, as a new
   paragraph.

3. Tweak it slightly to explicitly encourage quoting:

   Care must be taken to avoid unwanted side effects caused by
   pathname expansion, (i.e., prefer unset "name[subscript]" to
   avoid such problems).  unset name, where name is an array, or
   unset name[subscript], where subscript is * or @, removes the
   entire array.

Note that there are two very small additions to the overall text:

1. A "see also" reference (to ensure that people don't miss it from
   either place).

2. The parenthetical comment.

-- 
   ((x=>x(x))(x=>x(x)))  Eli Barzilay:
   http://barzilay.org/  Maze is Life!



Re: ctrl-w oddity on bash-4.4

2017-10-26 Thread Robert Elz
Date:        Thu, 26 Oct 2017 10:04:54 -0400
From:        Chet Ramey
Message-ID:  

  | Posix doesn't make that distinction.

Actually, it does,

Input Mode Commands in vi

In text input mode, the current line shall consist of zero or more
of the following categories, plus the terminating <newline>:

1. Characters preceding the text input entry point

   Characters in this category shall not be modified during text
   input mode.

2. [...]

In vi, input mode is used to input data.  That's it.   The only editing
that is supposed to be available is that which you would have when typing
using the regular tty driver input correction abilities, which apply only
to making changes on characters typed since the last \n - and only to erasing
backwards.

kre




Re: ctrl-w oddity on bash-4.4

2017-10-26 Thread Robert Elz
I should have also said that there's no requirement (that I can see
anyway) that vi mode in readline be the same as vi.

In fact, it obviously cannot be, it is just an editing mode that is
somewhat similar to vi.  There are going to be things missing (clearly there's
no way, and no need, to center the current line in the window, for example)
and there can be extensions - extra abilities that make sense when editing
command lines (when it is known that is what is happening) that make less,
or no, sense when editing a file.

kre




Re: ctrl-w oddity on bash-4.4

2017-10-26 Thread Chet Ramey
On 10/26/17 5:20 PM, Robert Elz wrote:
> Date:        Thu, 26 Oct 2017 10:04:54 -0400
> From:        Chet Ramey
> Message-ID:  
> 
>   | Posix doesn't make that distinction.
> 
> Actually, it does,
> 
>   Input Mode Commands in vi

Interesting. The description of vi editing mode under the description of
`sh' contains no corresponding text and no reference to the `vi'
description defining it.  There is no distinction made depending on when
characters were entered into the "current command line", and the
description of input mode says only:

"While in insert mode, any character typed shall be inserted in the current
command line, unless it is from the following set."

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: ctrl-w oddity on bash-4.4

2017-10-26 Thread Chet Ramey
On 10/26/17 5:30 PM, Robert Elz wrote:
> I should have also said that there's no requirement (that I can see
> anyway) that vi mode in readline be the same as vi.

There isn't, and readline's vi mode doesn't attempt it. It only tries to
implement what Posix defines as part of the `sh' description, with some
additional commands as they behave in vi where they make sense.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Documentation issue

2017-10-26 Thread Clark Wang
On Fri, Oct 27, 2017 at 3:00 AM, Eli Barzilay  wrote:

> On Thu, Oct 26, 2017 at 2:02 PM, Chet Ramey  wrote:
> >
> > It's more of a general statement about arrays, though it appears in
> > the paragraph that discusses unset, so it's in the man page section on
> > arrays.  You have to be careful about putting the same information in
> > too many different places -- the man page is big enough already.
>
> I'm very aware of man page bloat and the fact that the bash page is very
> long as is.  But (a) I think that the builtin section for each command
> is the more important place for such things; and (b) what I'm suggesting
> is just a short reminder sentence about the need for quoting unsets for
> array elements.
>
> To make it more concrete, I think that the following change will be
> good:
>
> 1. Drop the current "Care must be taken ... the entire array." two
>    sentences and replace them with some "See the unset builtin
>    description below".
>

It's not only about unset. You also need to take care with other builtin
commands:

cd: cd a[dir] vs. cd 'a[dir]'
echo: echo foo[bar] vs. echo 'foo[bar]'; echo * vs. echo '*'
eval: eval a[i]=b vs. eval 'a[i]=b'
printf: printf foo[bar] vs. printf 'foo[bar]'
set: set a[b] vs. set 'a[b]'
source: source file[name] vs. source 'file[name]'
test: test a[b] vs. test 'a[b]'
...

and even for external commands:

find /the/dir -name *.txt
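
A minimal sketch of the `find` case, assuming a scratch directory with
one .txt file at the top level and one in a subdirectory (the file names
and result variables are invented for the example):

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
mkdir sub
touch a.txt sub/b.txt

# Unquoted: the shell expands *.txt to a.txt first, so find runs as
# "find . -name a.txt" and never sees the pattern.
unquoted=$(find . -name *.txt | sort)

# Quoted: find receives the pattern itself and matches at every level.
quoted=$(find . -name '*.txt' | sort)

echo "unquoted:"; echo "$unquoted"
echo "quoted:";   echo "$quoted"
```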


Re: Documentation issue

2017-10-26 Thread Eli Barzilay
On Thu, Oct 26, 2017 at 10:50 PM, Clark Wang  wrote:
> On Fri, Oct 27, 2017 at 3:00 AM, Eli Barzilay  wrote:
>>
>> 1. Drop the current "Care must be taken ... the entire array." two
>>    sentences and replace them with some "See the unset builtin
>>    description below".
>
> It's not only about unset. You also need to take care of other builtin
> commands: [...]

I already said why `unset` is different.  If it wasn't clear, a direct
example is the fact that `delete` in javascript is a special syntax
rather than a function.  To make it more confusing, the other obvious
place where an lvalue appears (left of a =), is special in bash for
other reasons, but it does make it easier to assume that `unset` is
special.
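
A small sketch of that asymmetry, assuming a scratch directory that
contains a file named foo0 (the result variables are only illustrative):

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
touch foo0
foo=(a b c)

foo[0]=x                    # recognized as an assignment: the left-hand side is not globbed
after_assign="${foo[*]}"    # element 0 was replaced

unset foo[0]                # ordinary command word: expands to foo0 before unset runs
after_unset="${foo[*]}"     # element 0 is still there

echo "after assignment: $after_assign"
echo "after unset:      $after_unset"
```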

-- 
   ((x=>x(x))(x=>x(x)))  Eli Barzilay:
   http://barzilay.org/  Maze is Life!



Re: Documentation issue

2017-10-26 Thread Clark Wang
On Fri, Oct 27, 2017 at 1:17 PM, Eli Barzilay  wrote:

>
> I already said why `unset` is different.  If it wasn't clear, a direct
> example is the fact that `delete` in javascript is a special syntax
> rather than a function.  To make it more confusing, the other obvious
> place where an lvalue appears (left of a =), is special in bash for
> other reasons, but it does make it easier to assume that `unset` is
> special.
>

What `unset' does is special, but there's nothing special about how the
command is parsed, and bash does not even care whether it's a built-in
command or not.