Extreme memory consumption during unicode test with alt-array implementation
Hi,

Just to mention: due to the extreme memory consumption during the unicode tests with the alternative array implementation enabled, the speed win is more than offset. That is, the build system becomes unusable:

```
ps aux | grep -E 'USER|^399'
USER   PID  %CPU %MEM      VSZ     RSS TTY   STAT START TIME COMMAND
[...]
399   6641  24.9 87.4 14690520 7073924 pts/0 R+   13:37 0:18 /home/abuild/rpmbuild/BUILD/bash-5.2/bash ./unicode1.sub
```

Now building without --enable-alt-array-implementation.

Werner

--
"Having a smoking section in a restaurant is like having a peeing section in a swimming pool." -- Edward Burr
Re: Extreme memory consumption during unicode test with alt-array implementation
On 10/6/22 8:11 AM, Dr. Werner Fink wrote:
> Hi,
>
> Just to mention: due to the extreme memory consumption during the unicode
> tests with the alternative array implementation enabled, the speed win is
> more than offset. That is, the build system becomes unusable.

The unicode test allocates a sparse array with a max index of 1879048270. The standard implementation handles that just fine. The alternate implementation just tries to allocate an array and exceeds the data size limit long before it gets to the max. Depending on your resource limits and your VM system, the system will keep grinding away trying to satisfy those malloc requests.

That's the tradeoff: space vs. speed.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates

Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
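A minimal sketch of the allocation pattern at issue (a reduction for illustration only; the actual test is unicode1.sub, whose contents are not shown here):

```bash
# Hypothetical reduction: a single element at the huge index Chet mentions.
a=()
a[1879048270]=x
echo "${#a[@]}"   # prints 1; the default implementation stores only the
                  # elements that exist, while the alternate (flat-array)
                  # implementation sizes its backing storage by the max index,
                  # which is what blows past the data size limit here
```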
Re: declare -F incorrect line number
On 10/5/22 6:29 PM, Robert Elz wrote:
>     Date: Wed, 5 Oct 2022 15:36:55 -0400
>     From: Chet Ramey
>     Message-ID: <3d89acac-4c0a-64c9-e22c-1a3ca6860...@case.edu>
>
>   | Other than that, there's no advantage.
>
> There can be. I have, on occasion (not in bash - I don't write bash
> scripts) had a need to redefine one of the standard commands, while
> executing a particular function (which calls other more standard
> functions which run the command) - and define the same command
> differently when running a different function, which runs the same
> standard functions running the command, but in a different way.

Sure, that's the conditional definition I talked about in my first reply. The OP indicated that that wasn't his goal.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates

Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
bracket needs to be escaped in variable substitution?
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2
uname output: Linux z390 5.10.0-16-amd64 #1 SMP Debian 5.10.127-2 (2022-07-23) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.2
Patch Level: 0
Release Status: release

Description:

Hello,

When using substitution with variables and an opening bracket in the replacement string, the bracket has to be escaped if there is no closing bracket, and only when a variable is already used in the string. This wasn't the behavior of previous bash versions.

Repeat-By:

$ ./bash --norc
bash-5.2$ var1="qwertyuiop asdfghjkl"
bash-5.2$ var2="_"
bash-5.2$ echo "${var1// /${var2}[${var2}}"
bash: bad substitution: no closing `}' in ${var1// /${var2}[${var2}}

# but the following works as expected (escaped bracket):
bash-5.2$ echo "${var1// /${var2}\[${var2}}"
qwertyuiop_[_asdfghjkl

# also working as expected (with closing bracket):
bash-5.2$ echo "${var1// /${var2}[${var2}]}"
qwertyuiop_[_]asdfghjkl

# also working as expected (no variable before the bracket):
bash-5.2$ echo "${var1// /[${var2}}"
qwertyuiop[_asdfghjkl

--
Antoine
Re: declare -F incorrect line number
I write nested functions quite often, usually with a subsequent `unset -f` but sometimes (necessarily) without.

Being able to write `local -F funcname { ... }` or `function -L funcname { ... }` would be a nice replacement for the former, but the latter is usually about different phases of execution, rather than abstractions for different data.

For example, when writing an explicit shift-reduce parser as part of a tab completion function, the logic for "match a token" depends on the kind of token expected, whether or not we've arrived at the word that's the target of the tab expansion, and how many times the tab key has been pressed. Being able to redefine the various "match a token" functions by calling a single function makes for a much less cluttered grammar definition.

It would be helpful for the caller function to return the correct function name and source filename and line, though I can see the latter being somewhat tricky if it's created inside an eval.

-Martin

On Thu, 6 Oct 2022 at 08:29, Robert Elz wrote:
>     Date: Wed, 5 Oct 2022 15:36:55 -0400
>     From: Chet Ramey
>     Message-ID: <3d89acac-4c0a-64c9-e22c-1a3ca6860...@case.edu>
>
>   | Other than that, there's no advantage.
>
> There can be. I have, on occasion (not in bash - I don't
> write bash scripts) had a need to redefine one of the standard
> commands, while executing a particular function (which calls other
> more standard functions which run the command) - and define the
> same command differently when running a different function, which
> runs the same standard functions running the command, but in a
> different way.
>
> Kind of like
>
> f1() {
>     diff() { command diff -u "$@"; }
>     dostuff
>     unset -f diff
> }
>
> f2() {
>     diff() { command diff -iw -c "$@"; }
>     dostuff
>     unset -f diff
> }
>
> where dostuff() does whatever is needed to make "newversion",
> and then, somewhere, does one (or more) of something like
>
>     diff origfile newversion
>
> "dostuff" can also just be run to get the default diff format.
>
> or something like that. Real examples tend to be far more complicated
> (this simple case could be done just by having DIFFARGS or something, but
> that would mean modifying dostuff() to use that as diff $DIFFARGS)
>
> kre
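A minimal sketch of the redefinition pattern Martin describes; `select_matcher`, `match_token`, `parse_keyword`, and `$cur` are hypothetical names invented for illustration, not taken from any real completion code:

```bash
# One selector call swaps the behavior of match_token; the grammar
# definitions then stay uncluttered because they only call match_token.
select_matcher() {
    case $1 in
        literal) match_token() { [[ $cur == "$1" ]]; } ;;
        prefix)  match_token() { [[ $1 == "$cur"* ]]; } ;;
    esac
}

parse_keyword() {
    # a stand-in for one rule of the grammar
    match_token "commit"
}

cur="com"
select_matcher prefix
parse_keyword && echo "prefix matched"
```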
Re: declare -F incorrect line number
On Fri, Oct 07, 2022 at 01:28:59AM +1000, Martin D Kealey wrote:
> I write nested functions quite often, usually with a subsequent `unset -f`
> but sometimes (necessarily) without.
>
> Being able to write `local -F funcname { ... }` or `function -L funcname {
> ... }` would be a nice replacement for the former, but the latter is
> usually about different phases of execution, rather than abstractions for
> different data.

You do realize that there are no "nested functions" in bash, right? All functions exist in a single, global function namespace.

unicorn:~$ bash
unicorn:~$ f() { g() { echo I am g; }; }
unicorn:~$ f
unicorn:~$ type g
g is a function
g ()
{
    echo I am g
}

Functions are never "local".
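A hedged sketch of one way to approximate locality within that single global namespace: capture any existing definition with `declare -f` and restore it before returning (`f` and `g` here are placeholders, not from the thread):

```bash
f() {
    local saved
    saved=$(declare -f g)               # empty if g isn't defined yet
    g() { echo "f's private g"; }
    g
    # restore the outer definition, or remove ours entirely; note that an
    # early return would skip this, which is part of why real scoping
    # keeps coming up on the list
    if [ -n "$saved" ]; then eval "$saved"; else unset -f g; fi
}
```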
Re: extglob can be erroneously enabled in bash-5.2 through comsub nesting
On 10/2/22 4:51 AM, Kerin Millar wrote:
> $ declare -p BASH_VERSION
> declare -- BASH_VERSION="5.2.0(1)-release"
> $ BASH_COMPAT=50; shopt extglob; : $(: $(: $(:))); shopt extglob
> extglob off
> extglob on

Thanks for the report. I've attached the patch I applied to fix this.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates

Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/

```
*** ../bash-5.2-patched/parse.y	2022-08-31 11:47:03.0 -0400
--- parse.y	2022-10-05 11:55:18.0 -0400
***************
*** 4230,4234 ****
    sh_parser_state_t ps;
    sh_input_line_state_t ls;
!   int orig_ind, nc, sflags, start_lineno;
    char *ret, *ep, *ostring;
--- 4230,4234 ----
    sh_parser_state_t ps;
    sh_input_line_state_t ls;
!   int orig_ind, nc, sflags, start_lineno, local_extglob;
    char *ret, *ep, *ostring;
***************
*** 4273,4277 ****
    expand_aliases = 0;
  #if defined (EXTENDED_GLOB)
!   global_extglob = extended_glob;	/* for reset_parser() */
  #endif
--- 4273,4277 ----
    expand_aliases = 0;
  #if defined (EXTENDED_GLOB)
!   local_extglob = global_extglob = extended_glob;	/* for reset_parser() */
  #endif
***************
*** 4291,4294 ****
--- 4291,4297 ----
    restore_parser_state (&ps);
+ #if defined (EXTENDED_GLOB)
+   extended_glob = local_extglob;
+ #endif
    token_to_read = 0;
```
feature request: new builtin `defer`, scope delayed eval
Hey all,

I've started working on this idea, and before getting too far I'd like to get general feedback on the feature.

I'd specifically like to propose a new built-in called `defer` which acts like `eval` but is not parsed/expanded/run until its scope is being left. Hopefully "scope" is the correct word: I'm imagining it running at the same time a local would go out of "scope" and no longer be available (just before the locals are cleared, so locals can still be used in expansion).

The main purpose of defer is to help with resource management, and more specifically cleanup. Today, cleaning up resources in scripts, whether they be files, virtual machines/containers, or even global state, can be challenging for a variety of reasons. It is very easy to leave extra state/processes running that you didn't mean to.

Let's look first at handling cleanup while "error mode", aka `set -e`, is on (we'll cover error mode being off later; we start with it on not only because defer works better there, but also because I think many scripts want error mode on, since manually checking every command for failure can be tedious). Today there exist four main ways of handling errors with error mode on:

1. Introduce another function that "wraps" the previous one and is capable of cleaning up resources, then hope no one calls the internal one, maybe even by giving it a scary name like `__do_not_use_this_unless_you_want_to_do_cleanup_manually_which_you_better_internal_fn_name()`.

2. Push responsibility onto the caller of the function by having users manually call a cleanup function afterwards. Meaning just calling `my_function` is incorrect, and callers need to write: `my_function || { cleanup_function; return 1; }`.

3. Don't add complexity to the caller or wrap in a function, but push complexity onto the author of the function itself by manually adding `|| { cleanup; return 1; }` after every command in the function (sketched below).

4. Don't attempt to clean up the resource at all.

If #4 isn't a viable option, or it is but you'd prefer not to take it, you're left with three options that each add significant cognitive complexity, the chance for misuse, or both. This is where defer comes in, solving the issue of cleanup without introducing the chance of missing a cleanup through misuse. A very oversimplified, contrived example is below:

```
#!/usr/bin/env bash

set -eo pipefail

my_function() {
  local -r tmp_dir=$(mktemp -d)
  defer rm -r "${tmp_dir}"

  value=$(command-that-could-fail --save-state "${tmp_dir}/state")
  if [ "$value" = "success" ]; then
    could-fail-two --input "$(< ${tmp_dir}/state)"
    could-fail-three | pipe
    echo "commands succeeded"
  else
    echo "critical failure exiting entire process"
    exit 1
  fi

  return 0
}
```

In this case, no matter how this function exits, whether there's a problem with a pipe, a command failing, exiting the entire process, or simply returning successfully, the resource is guaranteed to be cleaned up, assuming rm doesn't fail -- if it did, it would clobber the return code to 1 in this case, even on a return of 0.

If your script is running with error mode off on purpose, the benefits drop to just potentially easier readability. Rather than needing to create a cleanup function and separately validating that its cleanup is correct, you can co-locate cleanup with the creation of each item. This can make it very easy to validate multi-step cleanups.
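A hedged sketch of approach 3 from the list above; `step-one` through `step-three` are placeholder commands invented for illustration, not part of the proposal:

```bash
# Manual cleanup repeated after every point of failure: the clutter
# that defer is meant to remove.
my_function() {
  local tmp_dir
  tmp_dir=$(mktemp -d)
  step-one "${tmp_dir}"   || { rm -r "${tmp_dir}"; return 1; }
  step-two "${tmp_dir}"   || { rm -r "${tmp_dir}"; return 1; }
  step-three "${tmp_dir}" || { rm -r "${tmp_dir}"; return 1; }
  rm -r "${tmp_dir}"
}
```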
No longer do you have to open the cleanup function and the regular function side by side to validate correctness. Take for example the error mode case I mentioned earlier:

```
#!/usr/bin/env bash

scoped_error_mode() {
  if ! echo -n "$SHELLOPTS" | grep 'errexit' >/dev/null 2>&1; then
    echo "error mode off, enabling for this function"
    set -e
    defer set +e
  fi
  if ! echo -n "$SHELLOPTS" | grep 'pipefail' >/dev/null 2>&1; then
    echo "pipefail off, enabling for this function"
    set -o pipefail
    defer set +o pipefail
  fi

  my_commands
  my_other_commands | piped-to
}
```

Here not only can we scope normally-global state to a single function (allowing us to use error mode just where it might be useful, and not everywhere), but the defers sit directly next to where the state is changed, which means we don't have to save into variables whether or not we need to "turn things back off" again. For most people, I think, this makes it significantly easier to read.

The help for the built-in I've been working on looks like:

```
defer: defer [-l] or defer [-d offset] or defer [arg ...]
    Execute arguments as a shell command when the current scope exits.

    Queue up a statement to be eval'd when a scope is left. Runs directly
    before locals in the same scope get cleared. Deferred statements are
    run in a last-in, first-out order.
```
Re: feature request: new builtin `defer`, scope delayed eval
On Thu, Oct 6, 2022, at 4:08 PM, Cynthia Coan wrote:
> I'd specifically like to propose a new built-in called `defer` which
> acts like `eval` but is not parsed/expanded/run until its scope is
> being left. Hopefully "scope" is the correct word: I'm imagining it
> running at the same time a local would go out of "scope" and no
> longer be available (just before the locals are cleared, so locals
> can still be used in expansion).

I think it would be more natural to implement function-local RETURN and EXIT traps than to introduce a second command that looks like 'trap' and quacks like 'trap' but is actually not 'trap'. This could be done generically, by adding the ability to "scope" traps to functions (possibly via a new option to 'trap' or a shopt à la zsh's LOCAL_TRAPS), or specifically, by creating "local" variants of RETURN and EXIT. Usage might look like this:

    f() {
        # new option
        trap -f 'cleaning up' EXIT RETURN
        cmd1
        cmd2
        cmd3
    }

or this:

    g() {
        # new traps
        trap 'cleaning up' EXIT_LOCAL RETURN_LOCAL
        cmdA
        cmdB
        cmdC
    }

--
vq
Re: bracket needs to be escaped in variable substitution?
The issue is not reproduced when using a variable as the pattern, and it's not related to the space character in the pattern:

$ ./bash --norc
bash-5.2$ var="abcd efgh ijkl mnop qrst"
bash-5.2$ pattern=" "
bash-5.2$ string="_"
bash-5.2$ echo "${var//${pattern}/${string}[${string}}"
abcd_[_efgh_[_ijkl_[_mnop_[_qrst
bash-5.2$ echo "${var// /${string}[${string}}"
bash: bad substitution: no closing `}' in ${var// /${string}[${string}}
bash-5.2$ echo "${var//a/${string}[${string}}"
bash: bad substitution: no closing `}' in ${var//a/${string}[${string}}

On 06/10/2022 16:52, Antoine wrote:
[...]

--
Antoine
Re: feature request: new builtin `defer`, scope delayed eval
I think that's certainly a fair option, and a potential solution. The reason for introducing a new builtin, as opposed to utilizing a trap, is that safely appending to a trap can be filled with holes. Since trap always overwrites what is in the trap, you have to be aware of what is already in the trap and ensure you are properly appending to it (and that a previous error in the trap processing doesn't affect you). This removes the nice benefit of multi-step setups; for example, if we rewrite the scoped error mode with traps we get:

```
scoped_error_mode() {
  if ! echo -n "$SHELLOPTS" | grep 'errexit' >/dev/null 2>&1; then
    echo "error mode off, enabling for this function"
    set -e
    trap "set +e" EXIT_LOCAL RETURN_LOCAL
  fi
  if ! echo -n "$SHELLOPTS" | grep 'pipefail' >/dev/null 2>&1; then
    echo "pipefail off, enabling for this function"
    set -o pipefail
    if [ "x$(trap -p EXIT_LOCAL)" != "x" ]; then
      # if dealing with quotes, have to sed them out
      trap "$(trap -p EXIT_LOCAL) ; set +o pipefail"
    fi
    if [ "x$(trap -p RETURN_LOCAL)" != "x" ]; then
      trap "$(trap -p RETURN_LOCAL); set +o pipefail"
    fi
  fi

  my_commands
  my_other_commands | piped-to
}
```

This isn't terrible by any means, and is "more in line" with existing practices. I still think defer might be simpler, but that is just my opinion! I think both are totally workable.

- Cynthia

On Thu, Oct 6, 2022 at 4:05 PM Lawrence Velázquez wrote:
>
> On Thu, Oct 6, 2022, at 4:08 PM, Cynthia Coan wrote:
> > I'd specifically like to propose a new built-in called `defer` which
> > acts like `eval` but is not parsed/expanded/run until its scope is
> > being left. Hopefully "scope" is the correct word: I'm imagining it
> > running at the same time a local would go out of "scope" and no
> > longer be available (just before the locals are cleared, so locals
> > can still be used in expansion).
>
> I think it would be more natural to implement function-local RETURN
> and EXIT traps than to introduce a second command that looks like
> 'trap' and quacks like 'trap' but is actually not 'trap'. This
> could be done generically, by adding the ability to "scope" traps
> to functions (possibly via a new option to 'trap' or a shopt à la
> zsh's LOCAL_TRAPS), or specifically, by creating "local" variants of
> RETURN and EXIT. Usage might look like this:
>
>     f() {
>         # new option
>         trap -f 'cleaning up' EXIT RETURN
>         cmd1
>         cmd2
>         cmd3
>     }
>
> or this:
>
>     g() {
>         # new traps
>         trap 'cleaning up' EXIT_LOCAL RETURN_LOCAL
>         cmdA
>         cmdB
>         cmdC
>     }
>
> --
> vq
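A hedged sketch of the kind of helper that safe appending seems to require; `append_trap` is a hypothetical name, and the approach leans on bash reprinting trap commands in reusable quoted form via `trap -p`:

```bash
# Append a command to a signal's trap without clobbering what's there.
# `trap -p SIG` prints something like:  trap -- 'set +e' RETURN
append_trap() {
    local new_cmd=$1 sig=$2
    local -a words
    # eval re-parses the quoted output into words: (trap -- <cmd> <sig>),
    # or an empty array if no trap is set for this signal
    eval "words=( $(trap -p "$sig") )"
    if [ "${#words[@]}" -gt 0 ]; then
        trap -- "${words[2]}; ${new_cmd}" "$sig"
    else
        trap -- "${new_cmd}" "$sig"
    fi
}

# usage:
#   append_trap 'set +e' RETURN
#   append_trap 'set +o pipefail' RETURN
```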
Re: extglob can be erroneously enabled in bash-5.2 through comsub nesting
On Thu, 6 Oct 2022 15:49:26 -0400 Chet Ramey wrote:
> On 10/2/22 4:51 AM, Kerin Millar wrote:
>
> > $ declare -p BASH_VERSION
> > declare -- BASH_VERSION="5.2.0(1)-release"
> > $ BASH_COMPAT=50; shopt extglob; : $(: $(: $(:))); shopt extglob
> > extglob off
> > extglob on
>
> Thanks for the report. I've attached the patch I applied to fix this.

Thanks for the patch. It is probably sufficient for the downstream bug report to be closed. Unfortunately, it remains the case that the >=5.2-rc3 parser is buggy. Consider the following, as conducted using 5.2.2 with said patch applied.

$ declare -p BASH_VERSION
declare -- BASH_VERSION="5.2.2(1)-release"
$ BASH_COMPAT=50
$ [[ foo = $(: $(shopt extglob >&2)) ]]
extglob off
$ shopt extglob
extglob off
$ [[ foo = $(: $(shopt extglob >&2) ]]
> ^C
$ shopt extglob
extglob on

Note that, within the second test, the comsub parentheses are deliberately imbalanced, causing bash to display the PS2 prompt and wait for further input. I then interrupt bash with ^C and check on the status of the extglob option, only to find that, yet again, it has been unexpectedly enabled. This is perfectly reproducible, provided that those exact steps are carried out as shown. In particular, one cannot skip the first (syntactically correct) test, which must nest at least one comsub within another.

--
Kerin Millar