Extreme memory consumption during unicode test with alt-array implementation

2022-10-06 Thread Dr. Werner Fink
Hi,

Just to mention that, due to the extreme memory consumption during the unicode
tests with the alternative array implementation enabled, the speed win is more
than cancelled out.  That is, the build system becomes unusable.

ps aux | grep -E 'USER|^399'

USER   PID %CPU %MEM    VSZ   RSS TTY  STAT START   TIME COMMAND
[...]
399   6641 24.9 87.4 14690520 7073924 pts/0 R+  13:37   0:18 
/home/abuild/rpmbuild/BUILD/bash-5.2/bash ./unicode1.sub

Now building without --enable-alt-array-implementation


Werner

-- 
  "Having a smoking section in a restaurant is like having
  a peeing section in a swimming pool." -- Edward Burr




Re: Extreme memory consumption during unicode test with alt-array implementation

2022-10-06 Thread Chet Ramey

On 10/6/22 8:11 AM, Dr. Werner Fink wrote:

Hi,

Just to mention that, due to the extreme memory consumption during the unicode
tests with the alternative array implementation enabled, the speed win is more
than cancelled out.  That is, the build system becomes unusable.


The unicode test allocates a sparse array with a max index of 1879048270.
The standard implementation handles that just fine. The alternate
implementation just tries to allocate an array and exceeds the data size
limit long before it gets to the max. Depending on your resource limits
and your VM system, the system will keep grinding away trying to satisfy
those malloc requests.
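
As a rough illustration (not the actual unicode1.sub test, just the kind of
assignment involved, using the index quoted above):

    # one element, but a very large maximum index
    a[1879048270]=x
    # the default (sparse) implementation stores only the elements that exist,
    # so this costs almost nothing; an implementation that allocates storage
    # up to the maximum index needs on the order of the index times the
    # per-element size, i.e. many gigabytes, which is consistent with the
    # ~14 GB VSZ in the report above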

That's the tradeoff: space vs speed.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: declare -F incorrect line number

2022-10-06 Thread Chet Ramey

On 10/5/22 6:29 PM, Robert Elz wrote:

 Date:        Wed, 5 Oct 2022 15:36:55 -0400
 From:        Chet Ramey 
 Message-ID:  <3d89acac-4c0a-64c9-e22c-1a3ca6860...@case.edu>

   | Other than that, there's no advantage.

There can be.   I have, on occasion (not in bash - I don't
write bash scripts) had a need to redefine one of the standard
commands, while executing a particular function (which calls other
more standard functions which run the command) - and define the
same command differently when running a different function, which
runs the same standard functions running the command, but in a
different way.


Sure, that's the conditional definition I talked about in my first reply.
The OP indicated that that wasn't his goal.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




bracket needs to be escaped in variable substitution?

2022-10-06 Thread Antoine

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2
uname output: Linux z390 5.10.0-16-amd64 #1 SMP Debian 5.10.127-2 
(2022-07-23) x86_64 GNU/Linux

Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.2
Patch Level: 0
Release Status: release

Description:

Hello,

When using pattern substitution with a replacement string that contains 
variables and an opening bracket, the bracket has to be escaped if there 
is no closing bracket, and only when a variable expansion already appears 
in the string before the bracket.


This was not the behavior of previous bash versions.

Repeat-By:

$ ./bash --norc
bash-5.2$ var1="qwertyuiop asdfghjkl"
bash-5.2$ var2="_"
bash-5.2$ echo "${var1// /${var2}[${var2}}"
bash: bad substitution: no closing `}' in ${var1// /${var2}[${var2}}

# but the following works as expected (escaped bracket):
bash-5.2$ echo "${var1// /${var2}\[${var2}}"
qwertyuiop_[_asdfghjkl

# also working as expected (with closing bracket):
bash-5.2$ echo "${var1// /${var2}[${var2}]}"
qwertyuiop_[_]asdfghjkl

# also working as expected (no variable before the bracket):
bash-5.2$ echo "${var1// /[${var2}}"
qwertyuiop[_asdfghjkl


--
Antoine



Re: declare -F incorrect line number

2022-10-06 Thread Martin D Kealey
I write nested functions quite often, usually with a subsequent `unset -f`
but sometimes (necessarily) without.

Being able to write `local -F funcname { ... }` or `function -L funcname {
... }` would be a nice replacement for the former, but the latter is
usually about different phases of execution, rather than abstractions for
different data.

For example, when writing an explicit shift-reduce parser as part of a tab
completion function, the logic for "match a token" depends on the kind of
token expected, on whether or not we've arrived at the word that's the
target of the tab expansion, and on how many times the tab key has been
pressed. Being able to redefine the various "match a token" functions by
calling a single function makes for a much less cluttered grammar
definition.
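
A minimal sketch of that pattern (the function and token names here are
made up, not the actual completion code):

    # one call switches how every subsequent "match a token" check behaves
    use_strict_matching() {
        match_token() { [[ $2 == "$1" ]]; }     # match_token EXPECTED WORD
    }
    use_completion_matching() {
        # at the word being completed, accept any prefix of the expected token
        match_token() { [[ $1 == "$2"* ]]; }
    }
    # the grammar itself just calls match_token and stays unaware of the phase:
    #   match_token 'then' "$current_word" || return 1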

It would be helpful for the caller function to report the correct function
name, source filename, and line number, though I can see the latter being
somewhat tricky if the function is created inside an eval.

-Martin

On Thu, 6 Oct 2022 at 08:29, Robert Elz  wrote:

> Date:        Wed, 5 Oct 2022 15:36:55 -0400
> From:        Chet Ramey 
> Message-ID:  <3d89acac-4c0a-64c9-e22c-1a3ca6860...@case.edu>
>
>   | Other than that, there's no advantage.
>
> There can be.   I have, on occasion (not in bash - I don't
> write bash scripts) had a need to redefine one of the standard
> commands, while executing a particular function (which calls other
> more standard functions which run the command) - and define the
> same command differently when running a different function, which
> runs the same standard functions running the command, but in a
> different way.
>
> Kind of like
>
> f1() {
>         diff() { command diff -u "$@"; }
>         dostuff
>         unset -f diff
> }
>
> f2() {
>         diff() { command diff -iw -c "$@"; }
>         dostuff
>         unset -f diff
> }
>
> where dostuff() does what ever is needed to make "newversion",
> and then, somewhere does one (or more) of something like
>
> diff origfile newversion
>
> "dostuff" can also just be run to get the default diff format.
>
> or something like that.   Real examples tend to be far more complicated
> (this simple case could be done just by having DIFFARGS or something, but
> that would mean modifying dostuff() to use that as diff $DIFFARGS )
>
> kre
>
>
>


Re: declare -F incorrect line number

2022-10-06 Thread Greg Wooledge
On Fri, Oct 07, 2022 at 01:28:59AM +1000, Martin D Kealey wrote:
> I write nested functions quite often, usually with a subsequent `unset -f`
> but sometimes (necessarily) without.
> 
> Being able to write `local -F funcname { ... }` or `function -L funcname {
> ... }` would be a nice replacement for the former, but the latter is
> usually about different phases of execution, rather than abstractions for
> different data.

You do realize that there are no "nested functions" in bash, right?  All
functions exist in a single, global function namespace.

unicorn:~$ bash
unicorn:~$ f() { g() { echo I am g; }; }
unicorn:~$ f
unicorn:~$ type g
g is a function
g () 
{ 
    echo I am g
}

Functions are never "local".
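
Since every definition lands in that single global namespace, the usual
approximation of a "local" function is to remove it by hand before
returning; a minimal sketch (the helper name is made up):

f() {
    _f_helper() { echo "only meant to exist while f runs"; }
    _f_helper
    unset -f _f_helper   # without this, _f_helper leaks into the global namespace
}

Note that an early return (or a failure under set -e) skips the unset,
which is part of what the rest of this thread is about.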



Re: extglob can be erroneously enabled in bash-5.2 through comsub nesting

2022-10-06 Thread Chet Ramey

On 10/2/22 4:51 AM, Kerin Millar wrote:


$ declare -p BASH_VERSION
declare -- BASH_VERSION="5.2.0(1)-release"
$ BASH_COMPAT=50; shopt extglob; : $(: $(: $(:))); shopt extglob
extglob off
extglob on


Thanks for the report. I've attached the patch I applied to fix this.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
*** ../bash-5.2-patched/parse.y 2022-08-31 11:47:03.0 -0400
--- parse.y 2022-10-05 11:55:18.0 -0400
***************
*** 4230,4234 ****
sh_parser_state_t ps;
sh_input_line_state_t ls;
!   int orig_ind, nc, sflags, start_lineno;
char *ret, *ep, *ostring;
  
--- 4230,4234 ----
sh_parser_state_t ps;
sh_input_line_state_t ls;
!   int orig_ind, nc, sflags, start_lineno, local_extglob;
char *ret, *ep, *ostring;
  
***************
*** 4273,4277 ****
expand_aliases = 0;
  #if defined (EXTENDED_GLOB)
!   global_extglob = extended_glob; /* for reset_parser() */
  #endif
  
--- 4273,4277 ----
expand_aliases = 0;
  #if defined (EXTENDED_GLOB)
!   local_extglob = global_extglob = extended_glob; /* for reset_parser() */
  #endif
  
***************
*** 4291,4294 ****
--- 4291,4297 ----
restore_parser_state (&ps);
  
+ #if defined (EXTENDED_GLOB)
+   extended_glob = local_extglob;
+ #endif
token_to_read = 0;
  
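
For reference, re-running the reporter's check from above on a patched build
should now print "extglob off" both times:

    BASH_COMPAT=50; shopt extglob; : $(: $(: $(:))); shopt extglob
    # expected with the patch applied:
    #   extglob         off
    #   extglob         off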


feature request: new builtin `defer`, scope delayed eval

2022-10-06 Thread Cynthia Coan
Hey all,

I've started working on this idea, and before getting too far I'd like
to get general feedback on the feature. I'd specifically like to propose
a new built-in called `defer`, which acts like `eval` except that it is
not parsed/expanded/run until its scope is being left. Hopefully "scope"
is the correct word; I'm imagining it running at the same time a local
would go out of "scope" and no longer be available (just before the
locals are cleared, so locals can still be used in expansion). The main
purpose of defer is to help with resource management, and more
specifically cleanup.

Today, cleaning up resources in scripts, whether they are files,
virtual machines/containers, or even global state, can be challenging
for a variety of reasons. It is very easy to leave extra state or
processes behind that you did not mean to. Let's look first at handling
cleanup while "error mode", a.k.a. `set -e`, is on (we'll cover error
mode being off later; we start with it on, not only because defer works
better there, but also because many scripts I write want error mode on,
as manually checking every command for failure can be tedious). Today
there exist four main ways of handling cleanup with error mode on:

  1. Introduce another function that "wraps" the previous one and is
capable of cleaning up resources, then hope no one calls the internal
one directly, maybe even by giving it a scary name like:
`__do_not_use_this_unless_you_want_to_do_cleanup_manually_which_you_better_internal_fn_name()`.
  2. Push responsibility onto the caller of the function by requiring
callers to run a cleanup function afterwards. Just calling `my_function`
is then incorrect; callers need to write:
`my_function || { cleanup_function; return 1; }`.
  3. Don't add complexity to the caller or wrap the function in another
one, but push complexity onto the author of the function itself by
manually adding `|| { cleanup; return 1; }` after every command in the
function.
  4. Don't attempt to clean up the resource at all.

If #4 isn't a viable option, or it is but you'd just prefer not to take
it, you're left with three options that each add significant cognitive
complexity, a chance for misuse, or both. This is where defer comes in,
solving the "cleanup" problem without introducing the chance of missing
a cleanup through misuse. A very over-simplified, contrived example is
below:

  ```
  #!/usr/bin/env bash
  set -eo pipefail

  my_function() {
    local -r tmp_dir=$(mktemp -d)
    defer rm -r "${tmp_dir}"

    value=$(command-that-could-fail --save-state "${tmp_dir}/state")
    if [ "$value" = "success" ]; then
      could-fail-two --input "$(< "${tmp_dir}/state")"
      could-fail-three | pipe
      echo "commands succeeded"
    else
      echo "critical failure exiting entire process"
      exit 1
    fi
    return 0
  }
  ```

In this case, no matter how the function exits (a problem with a pipe, a
command failing, exiting the entire process, or a simple successful
return), the resource is guaranteed to be cleaned up, assuming rm itself
doesn't fail; if it did, it would clobber the return status to 1 in this
case, even on a return of 0.

If your script is running with error mode off on purpose, the benefits
shrink to potentially easier readability. Rather than needing to create
a separate cleanup function and validate that its cleanup is correct,
you can co-locate cleanup with the creation of each item. This can make
it much easier to validate multi-step cleanups: you no longer have to
read the cleanup function and the regular function side by side to check
correctness. Take, for example, the error-mode case mentioned earlier:

  ```
  #!/usr/bin/env bash

  scoped_error_mode() {
    if ! echo -n "$SHELLOPTS" | grep 'errexit' >/dev/null 2>&1; then
      echo "error mode off, enabling for this function"
      set -e
      defer set +e
    fi
    if ! echo -n "$SHELLOPTS" | grep 'pipefail' >/dev/null 2>&1; then
      echo "pipefail off, enabling for this function"
      set -o pipefail
      defer set +o pipefail
    fi

    my_commands
    my_other_commands | piped-to
  }
  ```

Here not only can we scope normally-global state to a single function
(allowing us to use error mode just where it might be useful, and not
everywhere), but the defers sit directly next to the code whose effects
they undo, which means we don't have to record in variables whether or
not we need to "turn things back off" again. For most people, I think,
this makes it significantly easier to read.
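
As an aside, the SHELLOPTS checks in the sketch above could also use bash's
own pattern matching instead of piping to grep; one possible variant of the
same checks (a sketch only, still using the proposed defer builtin):

  ```
  if [[ :$SHELLOPTS: != *:errexit:* ]]; then
    echo "error mode off, enabling for this function"
    set -e
    defer set +e
  fi
  if [[ :$SHELLOPTS: != *:pipefail:* ]]; then
    echo "pipefail off, enabling for this function"
    set -o pipefail
    defer set +o pipefail
  fi
  ```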

The help for the built-in I've been working on looks like:

```
defer: defer [-l] or defer [-d offset] or defer [arg ...]
Execute arguments as a shell command when the current scope exits.

Queue up a statement to be eval'd when a scope is left. Runs directly before
locals in the same scope get cleared. Deferred statements are run in a
last-in, first-out order.
```

Re: feature request: new builtin `defer`, scope delayed eval

2022-10-06 Thread Lawrence Velázquez
On Thu, Oct 6, 2022, at 4:08 PM, Cynthia Coan wrote:
> I'd specifically like to propose a new built-in called `defer`, which
> acts like `eval` except that it is not parsed/expanded/run until its
> scope is being left.  Hopefully "scope" is the correct word; I'm
> imagining it running at the same time a local would go out of "scope"
> and no longer be available (just before the locals are cleared, so
> locals can still be used in expansion).

I think it would be more natural to implement function-local RETURN
and EXIT traps than introduce a second command that looks like
'trap' and quacks like 'trap' but is actually not 'trap'.  This
could be done generically by adding the ability to "scope" traps
to functions (possibly via a new option to 'trap' or a shopt à la
zsh's LOCAL_TRAPS) or specifically by creating "local" variants of
RETURN and EXIT.  Usage might look like this:

f() {
# new option
trap -f 'cleaning up' EXIT RETURN
cmd1
cmd2
cmd3
}

or this:

g() {
# new traps
trap 'cleaning up' EXIT_LOCAL RETURN_LOCAL
cmdA
cmdB
cmdC
}

-- 
vq



Re: bracket needs to be escaped in variable substitution?

2022-10-06 Thread Antoine
The issue is not reproduced when using a variable as the pattern, and it is 
not related to the space character in the pattern:


$ ./bash --norc
bash-5.2$ var="abcd efgh ijkl mnop qrst"
bash-5.2$ pattern=" "
bash-5.2$ string="_"

bash-5.2$ echo "${var//${pattern}/${string}[${string}}"
abcd_[_efgh_[_ijkl_[_mnop_[_qrst

bash-5.2$ echo "${var// /${string}[${string}}"
bash: bad substitution: no closing `}' in ${var// /${string}[${string}}

bash-5.2$ echo "${var//a/${string}[${string}}"
bash: bad substitution: no closing `}' in ${var//a/${string}[${string}}


On 06/10/2022 16:52, Antoine wrote:

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2
uname output: Linux z390 5.10.0-16-amd64 #1 SMP Debian 5.10.127-2 
(2022-07-23) x86_64 GNU/Linux

Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.2
Patch Level: 0
Release Status: release

Description:

Hello,

When using pattern substitution with a replacement string that contains 
variables and an opening bracket, the bracket has to be escaped if there 
is no closing bracket, and only when a variable expansion already appears 
in the string before the bracket.


This was not the behavior of previous bash versions.

Repeat-By:

$ ./bash --norc
bash-5.2$ var1="qwertyuiop asdfghjkl"
bash-5.2$ var2="_"
bash-5.2$ echo "${var1// /${var2}[${var2}}"
bash: bad substitution: no closing `}' in ${var1// /${var2}[${var2}}

# but the following works as expected (escaped bracket):
bash-5.2$ echo "${var1// /${var2}\[${var2}}"
qwertyuiop_[_asdfghjkl

# also working as expected (with closing bracket):
bash-5.2$ echo "${var1// /${var2}[${var2}]}"
qwertyuiop_[_]asdfghjkl

# also working as expected (no variable before the bracket):
bash-5.2$ echo "${var1// /[${var2}}"
qwertyuiop[_asdfghjkl


--
Antoine





Re: feature request: new builtin `defer`, scope delayed eval

2022-10-06 Thread Cynthia Coan
I think that's certainly a fair option, and a potential solution. The
reason for introducing a new builtin, as opposed to utilizing a trap, is
that safely appending to a trap is full of pitfalls. Since trap always
overwrites what is already in the trap, you have to be aware of what is
there and ensure you are properly appending to it (and that a previous
error in the trap processing doesn't affect you). This removes the nice
benefit of multi-step setups; for example, if we rewrite the scoped
error mode with traps we get:

  ```
  scoped_error_mode() {
    if ! echo -n "$SHELLOPTS" | grep 'errexit' >/dev/null 2>&1; then
      echo "error mode off, enabling for this function"
      set -e
      trap "set +e" EXIT_LOCAL RETURN_LOCAL
    fi
    if ! echo -n "$SHELLOPTS" | grep 'pipefail' >/dev/null 2>&1; then
      echo "pipefail off, enabling for this function"
      set -o pipefail
      if [ "x$(trap -p EXIT_LOCAL)" != "x" ]; then
        # if dealing with quotes have to sed them out
        trap "$(trap -p EXIT_LOCAL) ; set +o pipefail"
      fi
      if [ "x$(trap -p RETURN_LOCAL)" != "x" ]; then
        trap "$(trap -p RETURN_LOCAL); set +o pipefail"
      fi
    fi

    my_commands
    my_other_commands | piped-to
  }
  ```

This isn't terrible by any means, and is "more in line" with existing
practices. I still think defer might be simpler, but that is just my
opinion! I think both are totally workable.
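
For reference, the usual idiom for appending to an existing trap in current
bash leans on the fact that `trap -p` prints the old command with its quoting
intact; a rough sketch only (`append_trap` is a made-up helper name):

  ```
  append_trap() {
    local cmd=$1 sig=$2
    local -a prev=()
    # trap -p SIG prints e.g.:  trap -- 'old command' SIG
    # eval-ing that output into an array recovers the old command verbatim
    eval "prev=( $(trap -p "$sig") )"
    trap -- "${prev[2]:+${prev[2]}; }$cmd" "$sig"
  }

  # usage:
  #   append_trap 'set +o pipefail' RETURN
  ```

It works, but it is exactly the kind of bookkeeping the message above argues
against having to write by hand.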

- Cynthia

On Thu, Oct 6, 2022 at 4:05 PM Lawrence Velázquez  wrote:
>
> On Thu, Oct 6, 2022, at 4:08 PM, Cynthia Coan wrote:
> > I'd specifically like to propose a new built-in called `defer`, which
> > acts like `eval` except that it is not parsed/expanded/run until its
> > scope is being left.  Hopefully "scope" is the correct word; I'm
> > imagining it running at the same time a local would go out of "scope"
> > and no longer be available (just before the locals are cleared, so
> > locals can still be used in expansion).
>
> I think it would be more natural to implement function-local RETURN
> and EXIT traps than introduce a second command that looks like
> 'trap' and quacks like 'trap' but is actually not 'trap'.  This
> could be done generically by adding the ability to "scope" traps
> to functions (possibly via a new option to 'trap' or a shopt à la
> zsh's LOCAL_TRAPS) or specifically by creating "local" variants of
> RETURN and EXIT.  Usage might look like this:
>
> f() {
> # new option
> trap -f 'cleaning up' EXIT RETURN
> cmd1
> cmd2
> cmd3
> }
>
> or this:
>
> g() {
> # new traps
> trap 'cleaning up' EXIT_LOCAL RETURN_LOCAL
> cmdA
> cmdB
> cmdC
> }
>
> --
> vq



Re: extglob can be erroneously enabled in bash-5.2 through comsub nesting

2022-10-06 Thread Kerin Millar
On Thu, 6 Oct 2022 15:49:26 -0400
Chet Ramey  wrote:

> On 10/2/22 4:51 AM, Kerin Millar wrote:
> 
> > $ declare -p BASH_VERSION
> > declare -- BASH_VERSION="5.2.0(1)-release"
> > $ BASH_COMPAT=50; shopt extglob; : $(: $(: $(:))); shopt extglob
> > extglob off
> > extglob on
> 
> Thanks for the report. I've attached the patch I applied to fix this.

Thanks for the patch. It is probably sufficient for the downstream bug report 
to be closed. Unfortunately, it remains the case that the >=5.2-rc3 parser is 
buggy. Consider the following, as conducted using 5.2.2 with said patch applied.

$ declare -p BASH_VERSION
declare -- BASH_VERSION="5.2.2(1)-release"
$ BASH_COMPAT=50
$ [[ foo = $(: $(shopt extglob >&2)) ]]
extglob off
$ shopt extglob
extglob off
$ [[ foo = $(: $(shopt extglob >&2) ]]
> ^C
$ shopt extglob
extglob on

Note that, within the second test, the comsub parentheses are deliberately 
imbalanced, causing bash to display the PS2 prompt and wait for further input. 
I then interrupt bash with ^C and check on the status of the extglob option, 
only to find that - yet again - it has been unexpectedly enabled.

This is perfectly reproducible, provided that those exact steps are carried out 
as shown. In particular, one cannot skip the first (syntactically correct) 
test, which must nest at least one comsub within another.

-- 
Kerin Millar