Re: Question on $@ vs $@$@

2024-08-27 Thread Chet Ramey

On 8/26/24 8:21 PM, Steffen Nurpmeso wrote:

Chet Ramey wrote in
  :
  |On 8/23/24 5:47 PM, Steffen Nurpmeso wrote:
  |>   If IFS has a value other than the default, then sequences of the
  |>   whitespace characters space, tab, and newline are ignored at the
  |>   beginning and end of the word, as long as the whitespace
  |>   character is in the value of IFS (an IFS whitespace
  |>   character).


So an IFS whitespace character is one that is in the value of IFS.


  |>
  |> So IFS whitespace only if part of $IFS.
  |>
  |>   Any character in IFS that is not IFS whitespace, along
  |>   with any adjacent IFS whitespace characters, delimits a field.
  |>
  |> So this "adjacent" even if *not* part of $IFS.
  |
  |I am genuinely curious how you concluded this, given the definition you
  |previously quoted.

It is only skipping ("trimming away") further data without further
delimiting if only IFS whitespace is seen.


The definition of IFS whitespace requires that the characters be part of
the value of IFS, so I'm wondering how you arrived at the "not part of
$IFS" above. Do you mean that IFS whitespace characters are the only ones
where multiple instances of those characters can delimit a single field?

If I can increase that section's clarity, I'm all for it. It's a confusing
topic.



  |>   A sequence of IFS whitespace characters is also treated as
  |>   a delimiter.
  |>
  |> So this means that *regardless* of whatever $IFS is, the three IFS
  |> whitespace characters are $IFS anyway *if* that is set to
  |> a non-empty non-default value.
  |
  |Nonsense.

How you interpret this "also" if not so, that is the question.
My impression was that you had an eye on the standard text and
tried to vaporise it down to the core.  Very well.


You have to look at the definition of IFS whitespace.
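
The definition can be checked directly. What follows is only a minimal sketch: `show_fields` is a helper name made up for this illustration, not something bash provides. It splits one string with two different IFS values, and the space only behaves as IFS whitespace when it is actually part of IFS.

```shell
# show_fields STRING IFS_VALUE
# Hypothetical helper: split STRING using IFS_VALUE and print every
# resulting field in brackets on one line.
show_fields() {
    (
        IFS=$2
        set -f          # disable pathname expansion, keep only splitting
        set -- $1       # unquoted on purpose so field splitting applies
        for f in "$@"; do
            printf '[%s]' "$f"
        done
        printf '\n'
    )
}

# Space is not in IFS here, so it is ordinary field data.
show_fields ' a : b ' ':'      # prints [ a ][ b ]

# Space is in IFS here, so it is IFS whitespace and gets absorbed.
show_fields ' a : b ' ': '     # prints [a][b]
```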

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: eval '<$(;)' causes Segmentation Fault

2024-08-27 Thread Chet Ramey

On 8/26/24 9:52 PM, Dale R. Worley wrote:

 writes:

Repeat-By:
 1. Create a script, i.e. `poc.sh` with the problematic string
 2. Execute `bash poc.sh`


Interestingly, when I run it (bash 5.1.0(1), which is pretty old), I
don't get the seg fault when I enter that string from the keyboard, only
when it's read from a script file.


https://lists.gnu.org/archive/html/bug-bash/2024-08/msg00205.html

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: eval '<$(;)' causes Segmentation Fault

2024-08-27 Thread Chet Ramey

On 8/26/24 6:44 PM, youheng@gmail.com wrote:


Bash Version: 5.3
Patch Level: 0
Release Status: alpha

POC:
$ cat poc.sh
eval '<${;}'


The specific case is an empty command containing only a redirection that 
results in an expansion error read from a script or string.

I can confirm that the error is triggered in the "execute_null_command"
function and later containing a redirection.
Specifically the variable `INPUT_STREAM bash_input.location` is both a char
pointer and an int.


Thanks for the analysis.


At first it is used as a char pointer in the function "parse_and_execute"

BEFORE
```
gdb> p bash_input.location.string
$3 = 0x7fb3dc0db3b0 "<${;}"
```
However at shell.c:1758 in the function unset_bash_input it gets overwritten to
a fd:
```
bash_input.location.buffered_fd = -1;
```


This is an effect of the problem, not the problem itself. The subshell
forked to execute the empty command should not go back and try to read
from the script again after it encounters an expansion error. The fix is
to provide a target for longjmp in the forked subshell.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Out of bounds read in parse.y.

2024-08-27 Thread Chet Ramey

On 8/27/24 12:41 AM, Collin Funk wrote:

Hi,

When compiling with undefined behavior sanitizer and then running:


Which version?



   $ ./bash
   parse.y:1000:93: runtime error: index -1 out of bounds for type 'int [257]'


Please send a reproducer.



The offending section of code:

case_command:   CASE WORD newline_list IN newline_list ESAC
                {
                  $$ = make_case_command ($2, (PATTERN_LIST *)NULL, word_lineno[word_top]);
                  if (word_top >= 0) word_top--;
                }
        |       CASE WORD newline_list IN case_clause_sequence newline_list ESAC
                {
                  /* Access of word_lineno[word_top] causes bad read.  */
                  $$ = make_case_command ($2, $5, word_lineno[word_top]);
                  if (word_top >= 0) word_top--;
                }

And the definition of word_top and word_lineno:

#define MAX_COMPOUND_NEST   256
static int word_lineno[MAX_COMPOUND_NEST+1];
static int word_top = -1;

The value of word_top appears to only be set in 'set_word_top':

static inline int
set_word_top (int t)
{
  switch (t)
    {
    case CASE:
    case SELECT:
    case FOR:
    case IF:
    case WHILE:
    case UNTIL:
      if (word_top < MAX_COMPOUND_NEST)
        word_top++;
      word_lineno[word_top] = line_number;
      break;
    default:
      break;
    }
  return word_top;
}

Shouldn't all the decrements of word_top be protected by:

 if (word_top > 0) word_top--;

instead of:

 if (word_top >= 0) word_top--;


Why? 0 is a valid index. set_word_top increments word_top before assigning
to word_lineno[word_top].


Or is there something more complicated that I am missing here?


I suspect there is a decrement that isn't matched by a call to
set_word_top(). But a reproducer would help, otherwise we're all just
guessing.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: eval '<$(;)' causes Segmentation Fault

2024-08-27 Thread Chet Ramey

On 8/26/24 10:57 PM, Zachary Santer wrote:


Bash Version: 5.3
Patch Level: 0
Release Status: alpha

This is devel, commit 2e01122fe7.

Really don't get what's going on here:


You have two instances of the shell fighting over terminal input.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Out of bounds read in parse.y.

2024-08-27 Thread Collin Funk
Hi Chet,

Chet Ramey  writes:

> Which version?

This was from bash devel branch, commit hash 
2e01122fe78eb5a42c9b9f3ca46b91f895959675.

Built with:

   ./configure CFLAGS='-fsanitize=undefined'

> Why? 0 is a valid index. set_word_top increments word_top before assigning
> to word_lineno[word_top].

Ah, okay. I see what you mean.

> I suspect there is a decrement that isn't matched by a call to
> set_word_top(). But a reproducer would help, otherwise we're all just
> guessing.

Sure, the bad read was happening while reading my .profile and .bashrc
file. I've narrowed it down to a bash completion file installed by my
system packages. I've attached it to this message.

Running:

./bash --norc ovs-vsctl-bashcomp.bash

triggers the out of bounds read.

SAVE_IFS=$IFS
IFS="
"
_OVSDB_SERVER_LOCATION=""

# Run ovs-vsctl and make sure that ovs-vsctl is always called with
# the correct --db argument.
_ovs_vsctl () {
    local _db

    if [ -n "$_OVSDB_SERVER_LOCATION" ]; then
        _db="--db=$_OVSDB_SERVER_LOCATION"
    fi
    ovs-vsctl ${_db} "$@"
}

# ovs-vsctl --commands outputs in this format:
#
# main = ,,
# localopts = ([] )*
# localopt = --[^]]*
# name = [^,]*
# arguments = ((!argument|?argument|*argument|+argument) )*
# argument = ([^ ]*|argument\|argument)
#
# The [] characters in local options are just delimiters.  The
# argument prefixes mean:
#   !argument :: The argument is required
#   ?argument :: The argument is optional
#   *argument :: The argument may appear any number (0 or more) times
#   +argument :: The argument may appear one or more times
# A bar (|) character in an argument means thing before bar OR thing
# after bar; for example, del-port can take a port or an interface.

_OVS_VSCTL_COMMANDS="$(_ovs_vsctl --commands)"

# This doesn't complete on short arguments, so it filters them out.
_OVS_VSCTL_OPTIONS="$(_ovs_vsctl --options | awk '/^--/ { print $0 }' \
  | sed -e 's/\(.*\)=ARG/\1=/')"
IFS=$SAVE_IFS

declare -A _OVS_VSCTL_PARSED_ARGS
declare -A _OVS_VSCTL_NEW_RECORDS

# This is a convenience function to make sure that user input is
# looked at as a fixed string when being compared to something.  $1 is
# the input; this behaves like 'grep "^$1"' but deals with regex
# metacharacters in $1.
_ovs_vsctl_check_startswith_string () {
    awk 'thearg == "" || index($0, thearg)==1' thearg="$1"
}

# $1 = word to complete on.
# Complete on global options.
_ovs_vsctl_bashcomp_globalopt () {
    local options result

    options=""
    result=$(printf "%s\n" "${_OVS_VSCTL_OPTIONS}" \
             | _ovs_vsctl_check_startswith_string "${1%=*}")
    if [[ $result =~ "=" ]]; then
        options="NOSPACE"
    fi
    printf -- "${options}\nEO\n${result}"
}

# $1 = word to complete on.
# Complete on local options.
_ovs_vsctl_bashcomp_localopt () {
    local options result possible_opts

    possible_opts=$(printf "%s\n" "${_OVS_VSCTL_COMMANDS}" | cut -f1 -d',')
    # This finds all options that could go together with the
    # already-seen ones
    for prefix_arg in $1; do
        possible_opts=$(printf "%s\n" "$possible_opts" \
                        | grep -- "\[${prefix_arg%%=*}=\?\]")
    done
    result=$(printf "%s\n" "${possible_opts}" \
             | tr ' ' '\n' | tr -s '\n' | sort | uniq)
    # This removes the already-seen options from the list so that
    # users aren't completed for the same option twice.
    for prefix_arg in $1; do
        result=$(printf "%s\n" "${result}" \
                 | grep -v -- "\[${prefix_arg%%=*}=\?\]")
    done
    result=$(printf "%s\n" "${result}" | sed -ne 's/\[\(.*\)\]/\1/p' \
             | _ovs_vsctl_check_startswith_string "$2")
    if [[ $result =~ "=" ]]; then
        options="NOSPACE"
    fi
    printf -- "${options}\nEO\n${result}"
}

# $1 = given local options.
# $2 = word to complete on.
# Complete on command that could contain the given local options.
_ovs_vsctl_bashcomp_command () {
    local result possible_cmds

    possible_cmds=$(printf "%s\n" "${_OVS_VSCTL_COMMANDS}")
    for prefix_arg in $1; do
        possible_cmds=$(printf "%s\n" "$possible_cmds" \
                        | grep -- "\[$prefix_arg=\?\]")
    done
    result=$(printf "%s\n" "${possible_cmds}" \
             | cut -f2 -d',' \
             | _ovs_vsctl_check_startswith_string "$2")
    printf -- "${result}"
}

# $1 = completion result to check.
# Return 0 if the completion result is non-empty, otherwise return 1.
_ovs_vsctl_detect_nonzero_completions () {
    local tmp newarg

    newarg=${1#*EO}
    readarray tmp <<< "$newarg"
    if [ "${#tmp[@]}" -eq 1 ] && [ "${#newarg}" -eq 0 ]; then
        return 1
    fi
    return 0
}

# $1 = argument format to expand.
# Expand '+ARGUMENT' in argument format to '!ARGUMENT *ARGUMENT'.
_ovs_vsctl_expand_command () {
result=$(printf "%s\n" "${_OVS_VSCTL_COMMANDS}" \
 | grep -- ",$1," | cut -f3 -d',' | tr ' ' '\n' \
 | awk '/\+.*/ { name=substr($0,2);
  

Re: Question on $@ vs $@$@

2024-08-27 Thread Steffen Nurpmeso
Chet Ramey wrote in
 :
 |On 8/26/24 8:21 PM, Steffen Nurpmeso wrote:
 |> Chet Ramey wrote in
 |>   :
 |>|On 8/23/24 5:47 PM, Steffen Nurpmeso wrote:
 |>|>   If IFS has a value other than the default, then sequences of the
 |>|>   whitespace characters space, tab, and newline are ignored at the
 |>|>   beginning and end of the word, as long as the whitespace
 |>|>   character is in the value of IFS (an IFS whitespace
 |>|>   character).
 |
 |So an IFS whitespace character is one that is in the value of IFS.
 |
 |>|>
 |>|> So IFS whitespace only if part of $IFS.
 |>|>
 |>|>   Any character in IFS that is not IFS whitespace, along
 |>|>   with any adjacent IFS whitespace characters, delimits a field.
 |>|>
 |>|> So this "adjacent" even if *not* part of $IFS.
 |>|
 |>|I am genuinely curious how you concluded this, given the definition you
 |>|previously quoted.
 |> 
 |> It is only skipping ("trimming away") further data without further
 |> delimiting if only IFS whitespace is seen.
 |
 |The definition of IFS whitespace requires that the characters be part of
 |the value of IFS, so I'm wondering how you arrived at the "not part of
 |$IFS" above. Do you mean that IFS whitespace characters are the only ones

Hm, it is likely the doubling that confuses some people, me included.
It is redundant unless it read

  Any character in IFS delimits a field, adjacent IFS whitespace
  characters are then ignored.

or so.  (Which i hope -- i have not yet started "working" here --
is what actually happens.)

 |where multiple instances of those characters can delimit a single field?

To the contrary i would now say that with a non-default non-empty
$IFS only IFS characters that are not also IFS whitespace create
empty fields.

 |If I can increase that section's clarity, I'm all for it. It's a confusing
 |topic.

Yes, that is true.  (But nothing beats the standard wording in
that respect.)

 |>|>   A sequence of IFS whitespace characters is also treated as
 |>|>   a delimiter.
 |>|>
 |>|> So this means that *regardless* of whatever $IFS is, the three IFS
 |>|> whitespace characters are $IFS anyway *if* that is set to
 |>|> a non-empty non-default value.
 |>|
 |>|Nonsense.
 |> 
 |> How you interpret this "also" if not so, that is the question.
 |> My impression was that you had an eye on the standard text and
 |> tried to vaporise it down to the core.  Very well.
 |
 |You have to look at the definition of IFS whitespace.

Already forgotten at that point.

 --End of 

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: Question on $@ vs $@$@

2024-08-27 Thread Robert Elz
Date:        Tue, 27 Aug 2024 22:02:34 +0200
From:        Steffen Nurpmeso 
Message-ID:  <20240827200234.95X76_wN@steffen%sdaoden.eu>


  |   Any character in IFS delimits a field, adjacent IFS whitespace
  |   characters are then ignored.

Not quite.   Any sequence of any amount of IFS whitespace, and no
more than one other IFS character delimits a field.

  | To the contrary i would now say that with a non-default non-empty
  | $IFS only IFS characters that are not also IFS whitespace create
  | empty fields.

The preconditions there are irrelevant.
Only IFS chars that are not IFS whitespace create empty fields.

Where IFS is empty, there are no IFS chars, hence nothing delimits
a field.   When IFS is unset (default case) it consists of $' \t\n'
(there is no difference whatever, for field splitting, or "$*" for
that matter, between an unset IFS and IFS=$' \t\n').
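
These rules are easy to try out. The following is a minimal sketch, not a definitive test suite: `show_fields`, `default_ifs` and `sample` are names invented for the illustration. It shows that runs of IFS whitespace never produce empty fields, that a second non-whitespace IFS character does, that an empty IFS splits nothing, and that an unset IFS splits like the default value.

```shell
# show_fields STRING [IFS_VALUE]
# Hypothetical helper: print every field of STRING in brackets.
# Without a second argument IFS is unset for the split.
show_fields() {
    (
        if [ "$#" -ge 2 ]; then IFS=$2; else unset IFS; fi
        set -f          # no pathname expansion
        set -- $1       # unquoted on purpose so field splitting applies
        for f in "$@"; do
            printf '[%s]' "$f"
        done
        printf '\n'
    )
}

# IFS whitespace plus at most one other IFS character is one delimiter.
show_fields 'a  b'    ' :'    # prints [a][b]
show_fields 'a : b'   ' :'    # prints [a][b]

# A second non-whitespace IFS character delimits an empty field.
show_fields 'a::b'    ' :'    # prints [a][][b]
show_fields 'a : : b' ' :'    # prints [a][][b]

# Empty IFS: nothing delimits a field.
show_fields 'a:b' ''          # prints [a:b]

# Unset IFS and the default value split identically.
default_ifs=$(printf ' \t\n.'); default_ifs=${default_ifs%.}
sample=$(printf 'a \t\n b')
show_fields "$sample"                  # prints [a][b]
show_fields "$sample" "$default_ifs"   # prints [a][b]
```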

  |  |You have to look at the definition of IFS whitespace.
  | Already forgotten at that point.

Unfortunately, when reading the standard, you cannot forget a
single word of it.  Everything (normative) applies.

kre




bash passes changed termios to backgrounded process(es) groups?

2024-08-27 Thread Steffen Nurpmeso
Hello.

I got a bug report for my mailer which stated

   $ ( echo blah | Mail root ) &
  [1] 2754649
   $ ^M^M^M^M^C^C

  [1]+  Stopped ( echo blah | Mail root )
   $ fg
  ( echo blah | Mail root )
   $

It turns out i have answered him now:

  The thing is that if i apply the patch (this to [master])

diff --git a/src/mx/termios.c b/src/mx/termios.c
index 733974ebce..08dd045226 100644
--- a/src/mx/termios.c
+++ b/src/mx/termios.c
@@ -152,6 +152,8 @@ a_termios_norm_query(void){
          &a_termios_g.tiosg_normal->tiose_state) == 0);
    /* XXX always set ECHO and ICANON in our "normal" canonical state */
    a_termios_g.tiosg_normal->tiose_state.c_lflag |= ECHO | ICANON;
+   a_termios_g.tiosg_normal->tiose_state.c_iflag |= ICRNL;
+
    /*NYD2_OU;*/
    return rv;
 }

  then everything is working as should in an otherwise unchanged MUA.
  It seems readline does this:

./lib/sh/shtty.c:  ttp->c_iflag |= ICRNL;   /* make sure we get CR->NL on input */
./lib/readline/rltty.c:  tiop->c_iflag &= ~(ICRNL | INLCR);

..and it seems that if bash starts a normal process then ICRNL is
set, but if it starts a (process)& or only process&, then not!
(I was about to send this to bug-readline first.)
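
One way to look at this from the prompt is to read the flag out of `stty -a`. This is only a sketch: `icrnl_state` is a made-up helper name, and it assumes a `stty` whose `-a` listing contains the token `icrnl` when the flag is set and `-icrnl` when it is cleared.

```shell
# icrnl_state
# Hypothetical helper: reads `stty -a` style output on stdin and prints
# on, off, or unknown depending on the ICRNL token it finds.
icrnl_state() {
    # Put every token on its own line, then look for the flag.
    tr -s ' \t;' '\n\n\n' | {
        state=unknown
        while IFS= read -r word; do
            case $word in
                icrnl)  state=on ;;
                -icrnl) state=off ;;
            esac
        done
        printf '%s\n' "$state"
    }
}

# Intended use on a terminal (not run here, since it needs a tty):
#   stty -a | icrnl_state            # foreground
#   ( stty -a | icrnl_state ) &      # background
```

Whether the two invocations report different states is exactly the behaviour in question, and the result may depend on timing relative to the prompt being redrawn.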

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: bash passes changed termios to backgrounded process(es) groups?

2024-08-27 Thread Steffen Nurpmeso
Steffen Nurpmeso wrote in
 <20240827234659.1xfh6CZb@steffen%sdaoden.eu>:
 ...
 |..and it seems that if bash starts a normal process then ICRNL is
 |set, but if it starts a (process)& or only process&, then not!

Yeah, and it seems to me it should not, since programs have to
fetch the terminal defaults in order to be able to properly
restore them, and unfortunately we restored without ICRNL thus.
I am considering doing

  ./lib/sh/shtty.c:  ttp->c_iflag |= ICRNL;   /* make sure we get CR->NL on input */

myself always, but of course then *i* as a secondary (at best)
process modify the terminal settings in a way others could fail to
correct.  (I mean, surely, a job control shell will fix that, but
it is not nice, anyway.)

Ciao!

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: Question on $@ vs $@$@

2024-08-27 Thread Steffen Nurpmeso
Robert Elz wrote in
 <14146.1724799...@jacaranda.noi.kre.to>:
 |Date:        Tue, 27 Aug 2024 22:02:34 +0200
 |From:        Steffen Nurpmeso 
 |Message-ID:  <20240827200234.95X76_wN@steffen%sdaoden.eu>
 |
 |
 ||   Any character in IFS delimits a field, adjacent IFS whitespace
 ||   characters are then ignored.
 |
 |Not quite.   Any sequence of any amount of IFS whitespace, and no
 |more than one other IFS character delimits a field.

That confuses me again; unfortunately i got a bug report and got
distracted.  I mean, i would

1. skip leading whitespace anyhow (IFS or not, which
   is a "documented bug" here i would say),
   for the shell this would be: leading IFS whitespace,

2. pass by none-to-many non-IFS bytes, the "field data", then

3.
   a. if there is a non-IFS-whitespace character:
  - delimit the field, even with empty "field data",

   b. if there is an IFS-whitespace character:
  - delimit the field only with non-empty "field data",

4. skip trailing (new leading) (IFS-) whitespace
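
The steps above can be sketched as a character-by-character loop. This is an illustration under assumptions, not bash's implementation: `split_fields`, `classify` and `skip_ws` are made-up names, and the loop includes one adjustment taken from the rule quoted at the top of this message, namely that after IFS whitespace has ended a field, at most one following non-whitespace IFS character still belongs to the same delimiter.

```shell
# split_fields STRING IFS_VALUE
# Hypothetical pure-shell splitter; prints every field in brackets.
# The body runs in a subshell so the helper variables stay private.
split_fields() (
    s=$1
    ifs=$2
    out=''
    tab=$(printf '\t')
    nl='
'

    # Set $c to the first character of $s and $kind to ws, other or data.
    classify() {
        c=${s%"${s#?}"}
        case $ifs in
            *"$c"*)
                case $c in
                    ' '|"$tab"|"$nl") kind=ws ;;
                    *) kind=other ;;
                esac ;;
            *) kind=data ;;
        esac
    }

    # Drop IFS whitespace from the front of $s.
    skip_ws() {
        while [ -n "$s" ]; do
            classify
            [ "$kind" = ws ] || break
            s=${s#?}
        done
    }

    skip_ws                              # step 1
    while [ -n "$s" ]; do
        field=''
        while [ -n "$s" ]; do            # step 2: collect field data
            classify
            [ "$kind" = data ] || break
            field=$field$c
            s=${s#?}
        done
        skip_ws                          # steps 3b and 4
        if [ -n "$s" ]; then
            classify
            if [ "$kind" = other ]; then # step 3a, at most once per delimiter
                s=${s#?}
                skip_ws
            fi
        fi
        out="${out}[${field}]"
    done
    printf '%s\n' "$out"
)

split_fields ' a : : b ' ' :'    # prints [a][][b]
```

Comparing its output against the shell's own splitting (`set -- $var` with the same IFS) for a handful of strings is a cheap way to check the reading of the rule.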

 || To the contrary i would now say that with a non-default non-empty
 || $IFS only IFS characters that are not also IFS whitespace create
 || empty fields.
 |
 |The preconditions there are irrelevant.
 |Only IFS chars that are not IFS whitespace create empty fields.

Yes.

 |Where IFS is empty, there are no IFS chars, hence nothing delimits
 |a field.   When IFS is unset (default case) it consists of $' \t\n'
 |(there is no difference whatever, for field splitting, or "$*" for
 |that matter, between an unset IFS and IFS=$' \t\n').
 |
 |||You have to look at the definition of IFS whitespace.
 || Already forgotten at that point.
 |
 |Unfortunately, when reading the standard, you cannot forget a
 |single word of it.  Everything (normative) applies.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)