Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Martin D Kealey
On Wed, 31 Jan 2024 at 01:04, Andreas Schwab wrote:

> On Jan 30 2024, Zachary Santer wrote:
> > There's no way this is the intended behavior, right?
>
> The command is doing exactly what it is documented to do, that is do all
> of the shell word expansions.
>

If that's how the documentation is interpreted, then clearly there's a bug
in the documentation, because this is almost never *useful* behaviour.

It actually says:

> shell-expand-line (M-C-e) ... This performs alias and history expansion
> as well as all of the shell word expansions.

It does *not* say that it does quote removal.

It could be argued that quote removal is implicitly required for some of
the steps to proceed, but in that case it can equally be argued that such
removal must therefore be reversed afterwards.


By all means, add a "strip-quotes" command to readline, so the user can use
it in the exceptional cases where they want to diverge from what the shell
would have done without M-C-e, but by default any expansion or substitution
that's triggered by a readline command should render a result that's immune
to that same expansion being done again when the user hits enter.

Which quotes to reinstate probably depends on which expansions have already
been done. This includes history expansion when histreedit is on; for
example, this is unhelpful:

$ ! echo Hi
Hi
$ echo !:0!$
[expands history and re-loads input buffer]
$ echo !Hi
bash_5.1.4p47-release: !Hi: event not found

I suggest that the history expansion should note when the result of an
expansion includes a history expansion character ("!" by default), and apply
a modification if it would be recognized as such (that is, when it is
followed by [[:alnum:]_:?%*$-] and is not in single quotes).

That modification would be:
 - outside quotes, insert \ before a recognized history character; if
necessary mark this byte to prevent it from being doubled by subsequent
expansions.
 - inside single quotes, nothing
 - inside double quotes, insert "" (two double-quote characters) after it.
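
A minimal sketch of the first rule only (the unquoted case), assuming the character class above is what decides whether "!" would be recognized. The function name protect_history_chars is made up for illustration; it is not part of bash or readline, it does not track quoting context, and it does not notice a "!" that is already escaped:

```bash
# Insert a backslash before every "!" that is followed by a character
# from the class quoted above. Hypothetical helper, unquoted text only.
protect_history_chars() {
  local input=$1 output='' i ch next
  # Kept in a variable so that "$-" is not expanded inside the pattern.
  local trigger='[[:alnum:]_:?%*$-]'
  for ((i = 0; i < ${#input}; i++)); do
    ch=${input:i:1}
    next=${input:i+1:1}
    if [[ $ch == '!' && $next == $trigger ]]; then
      output+='\'
    fi
    output+=$ch
  done
  printf '%s\n' "$output"
}
```

With this, protect_history_chars 'echo !Hi' prints echo \!Hi, while a trailing or space-separated "!" is left alone.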

Similar safeguards would be needed after any other kind of expansion,
except that it suffices to treat "!" as a separator character for the
purpose of deciding whether or not a word needs to be re-quoted. (As long
as it's always single-quoted, or backslash-escaped, the history characters
don't need any other special treatment.)

One way to decide whether any history chars need protection would simply be
to apply history expansion to the result of the requested expansion, and if
that changes the text, something in it needs protecting. Ideally the
history expansion logic would note the location of any history expansion
characters, so that readline could use that as a hint of what to fix.

-Martin


Wrong line numbers in errors

2024-02-03 Thread Matthias Urlichs via Bug reports for the GNU Bourne Again SHell

Version 5.2.15

see example script below, the unassigned $nope is reported to be in line 
4 which is obviously wrong. Removing the parentheses around the echo 
fixes the problem.


#!/bin/bash
set -u
( echo x ) | while read f ; do
    echo Test $f
    echo Test $f
    echo Test $f
    echo Test $f
    echo Test $f
done
echo $nope

--
-- with kind regards
--
-- Matthias Urlichs



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re[2]: wait skips signals but first one

2024-02-03 Thread Mykyta Dorokhin


Like you, I can't reproduce it on the desktop platforms I have available
right now.

The bash devel git branch has fairly fine granularity. If you can automate
the signal sending somewhat, maybe by having a child process send signals
to $$, you could use your script and `git bisect' to find the commit where
the behavior changed. bash-5.0 was frozen 12/31/2018, and bash-5.1 was
frozen 12/14/2020, so that should get you started with the devel branch
commits you want to inspect.

http://git.savannah.gnu.org/cgit/bash.git/log/?h=devel
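
The automation suggested above could look roughly like the following. This is only a sketch under assumptions: the function name count_usr1_during_wait and the 0.2 second spacing are arbitrary choices, not taken from the original test script. It runs in a subshell and signals $BASHPID rather than $$, so that the calling shell is never hit by SIGUSR1.

```bash
# Count how many SIGUSR1 traps run while the shell sits in `wait`.
# A child process sends the requested number of signals to this subshell.
count_usr1_during_wait() (
  signals=${1:-3}
  target=$BASHPID
  count=0
  trap 'count=$((count + 1))' USR1
  (
    for ((i = 0; i < signals; i++)); do
      sleep 0.2
      kill -USR1 "$target"
    done
  ) &
  sender=$!
  while :; do
    status=0
    wait "$sender" || status=$?
    # A status above 128 means wait was interrupted by a trapped signal.
    ((status > 128)) || break
  done
  echo "$count"
)
```

A `git bisect run` script could then compare the printed count with the number of signals sent and exit non-zero when they differ.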

I have found the commit on devel branch which breaks things for me (and 
probably other Yocto-based builds):

This one still works
==

commit 89d788fb0152724a93e0fdab8c15116e5c76572b
Author: Chet Ramey 
Date:   Mon Feb 17 11:41:35 2020 -0500

   commit bash-20200214 snapshot

This one not
==


commit 0df4ddca3f371bc258fe4185cdec36fce3e7be7b
Author: Chet Ramey 
Date:   Mon Feb 24 10:41:37 2020 -0500

   commit bash-20200221 snapshot


Please take a look. Maybe you'll notice something suspicious there. I don't 
know... uninitialized variables, endian-dependent code, etc.


Thank you,
Mykyta
 






Re[2]: wait skips signals but first one

2024-02-03 Thread Mykyta Dorokhin
Hello again,


Here is another analysis that my colleague made on the issue:



Bash Compiled for wrong OS?

Analysis with strace.

After receiving SIGUSR1, Debian only blocks SIGCHLD, then clears the block:

205295 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=205327, si_uid=1040} ---
205295 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
205295 rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f063bdb5fd0}, {sa_handler=0x5637247940b0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f063bdb5fd0}, 8) = 0
205295 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
205295 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0  # unblocks all signals


The above is the correct action.

On our device, it blocks SIGUSR1 as well as SIGCHLD and keeps doing it over and 
over again:

6707  --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=6724, si_uid=0} ---
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, NULL, [USR1 CHLD], 8) = 0
6707  write(1, ">>> TRAPPED USR1 <<<\n", 21) = 21
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  write(1, "Iteration\n", 10)       = 10
6707  rt_sigprocmask(SIG_BLOCK, NULL, [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [INT TERM CHLD], [USR1 CHLD], 8) = 0
6707  clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x76fe9028) = 6725
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigaction(SIGINT, {sa_handler=0x46e15, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x76e90711},  
6707  <... rt_sigaction resumed>{sa_handler=0x46e15, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x76e90711}, 8) = 0
6707  wait4(-1,  

On modern systems, the OS blocks the signal that is caught during signal 
handling, and unblocks so that signal handlers are not called recursively.  The 
exception to this is if SA_NODEFER is set. On some very old UNIX systems you 
had to block the signal yourself, and there was a small window where things 
could go wrong. I suspect BASH probably has a build option to allow blocking 
signals in handlers for compatibility with other systems, and is not being 
built correctly for Linux. I suspect on those very old systems the signal was 
automatically unblocked on return, but is not done here, because the POSIX 
sigprocmask is called, which requires calling it again to unblock the signal in 
Linux.  And since wait is restarted, it never is unblocked.

According to strace no additional user flags are set when the BASH signal 
handler is put in place for SIGUSR1.

We need to look at bash build options, and possibly the signal handling code, 
and sigprocmask or whatever C API they are using to call sigprocmask().


Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Koichi Murase
I was just watching the discussion because I thought the existing
behavior would never be changed as it behaved this way for a long
time. However, I now think I should leave a comment because even
Martin (who I believed was one of the careful people about backward
compatibility as seen in [1,2]) seems to suggest a breaking change.

[1] https://lists.gnu.org/archive/html/bug-bash/2022-11/msg00017.html
[2] https://lists.gnu.org/archive/html/bug-bash/2023-11/msg00150.html

On Sat, 3 Feb 2024 at 21:45, Martin D Kealey wrote:
> If that's how the documentation is interpreted, then clearly there's a bug
> in the documentation, because this is almost never *useful* behaviour.

This is not true. Before Bash 3.2, READLINE_LINE and READLINE_POINT
were not supported, so Bash configurations used readline macros
(something like « bind '"xxx": "\C-u\C-k$(...)\C-\M-e\r"' ») to
replace the command line with a dynamic content. I.e., the macro first
inserts a text « $(some-code-generator) » and expands it by C-M-e.
Please note that $(some-code-generator) may want to include in a
generated command characters that should be treated as shell special
characters. If one wants to support a wide range of versions from Bash
3.2 (shipped with macOS) to the latest one, this is still one of the
used techniques. If the behavior of C-M-e or shell-expand-line is
going to be changed, such types of Bash configurations are all broken.
For example, if $(some-code-generator) generates « echo "Hello,
world!" », we have been expecting the output « Hello, world! ». If the
quoting were going to be applied in the next version, the output would
become « "Hello, world!" », which is not an expected behavior.
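
For comparison, on Bash 4.0 and later the same replacement can be done without a macro, through `bind -x' and the READLINE_LINE and READLINE_POINT variables. A minimal sketch; some_code_generator, replace_line_with_generated and the \C-xg key sequence are placeholders chosen for illustration:

```bash
# Hypothetical generator: prints the command that should end up on the line.
some_code_generator() {
  printf '%s' 'echo "Hello, world!"'
}

# Replace the whole editing buffer and put the cursor at its end.
replace_line_with_generated() {
  READLINE_LINE=$(some_code_generator)
  READLINE_POINT=${#READLINE_LINE}
}

# Only bind the key when line editing is available (interactive shells).
if [[ $- == *i* ]]; then
  bind -x '"\C-xg": replace_line_with_generated'
fi
```

This path is not available in Bash 3.2, which is why the macro technique with C-M-e is still in use.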

If the requoting behavior is desired, I strongly suggest keeping
the existing behavior of shell-expand-line but adding a separate new
bindable function (like shell-expand-and-requote-line) to perform the
expansion and requoting.

--
Koichi



Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Koichi Murase
> Before Bash 3.2

P.S. Sorry, I meant "In Bash 3.2 and before,", where 3.2 is included
in the statement.



Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Chet Ramey

On 1/29/24 12:51 AM, A4-Tacks wrote:


Bash Version: 5.2
Patch Level: 21
Release Status: release

Description:
  input M-C-e (shell-expand-line),
  expand the result containing `'` (single quotation mark) failed.

Repeat-By:
  ```bash
  # echo "$(echo $'ab\n\'cd\nef')"
  ab
  'cd
  ef
  # echo "$(echo $'ab\n\'cd\nef')"  # input M-C-e
  # echo "$(echo $'ab\n\'cd\nef')"bash: bad substitution: no closing `)' in echo 
"$(echo $'ab\n\'cd\nef')"
  ```


Thanks for the report. This was a pretty easy fix in the end, just needed
to make sure the completion/readline flags were propagated through the
expansion code properly.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: Error with SIGCHLD trap and process substitution

2024-02-03 Thread Chet Ramey

On 2/2/24 3:32 PM, Tavian Barnes wrote:


Bash Version: 5.2
Patch Level: 26
Release Status: release

Description:

When a SIGCHLD trap is set, commands that include two process
substitutions fail with a strange "unexpected EOF" error.



Thanks for the report. This was fixed last September, the result of
https://lists.gnu.org/archive/html/bug-bash/2023-09/msg00058.html .

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: Wrong line numbers in errors

2024-02-03 Thread Chet Ramey
On 2/3/24 8:36 AM, Matthias Urlichs via Bug reports for the GNU Bourne 
Again SHell wrote:

Version 5.2.15

see example script below, the unassigned $nope is reported to be in line 4 
which is obviously wrong. Removing the parentheses around the echo fixes 
the problem.


#!/bin/bash
set -u
( echo x ) | while read f ; do
     echo Test $f
     echo Test $f
     echo Test $f
     echo Test $f
     echo Test $f
done
echo $nope


Thanks for the report. This was fixed in September, 2022, the result of
https://savannah.gnu.org/support/index.php?110714 .


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: [PATCH] printf: more error handling

2024-02-03 Thread Chet Ramey

On 2/2/24 6:33 PM, Grisha Levit wrote:

On Thu, Feb 1, 2024 at 7:41 PM Chet Ramey wrote:


On 1/22/24 9:44 PM, Grisha Levit wrote:

The size of the buffer used for printf -v is tracked in an int but this
can overflow since the buffer can be built up by multiple vsnprintf(3)
calls, each of which can append up to INT_MAX bytes to the buffer:


Thanks for the report and patch.


Thanks, a small question -- in your commit[1] you added an ferror check here:


No, I just left it in.



      nw = vflag ? vbprintf (f, func) : printf (f, func); \
      if (nw < 0 || ferror (stdout)) \
        { \

Is it necessary to check the error indicator if printf(3) just had a non-
negative return? 


I think printf is allowed to set the error flag that ferror checks even if
it returns 0, but I could be convinced otherwise.



And, if so, can this check be, as elsewhere:
`(vflag == 0 && ferror (stdout))'?  I think `printf -v' should work no
matter what the state of stdout is, no?


Sure.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: wait skips signals but first one

2024-02-03 Thread Chet Ramey

On 2/3/24 10:00 AM, Mykyta Dorokhin wrote:

I have found the commit on devel branch which breaks things for me (and 
probably other Yocto-based builds):


This one still works
==

commit 89d788fb0152724a93e0fdab8c15116e5c76572b
Author: Chet Ramey 
Date:   Mon Feb 17 11:41:35 2020 -0500

    commit bash-20200214 snapshot

This one not
==


commit 0df4ddca3f371bc258fe4185cdec36fce3e7be7b
Author: Chet Ramey 
Date:   Mon Feb 24 10:41:37 2020 -0500

    commit bash-20200221 snapshot



Please take a look. Maybe you'll notice something suspicious there. I don't 
know... uninitialized variables, endian-dependent code, etc.


There are changes there, of course, but it's hard to see how they make a
difference. The wait builtin was changed not to interrupt the wait for a
trapped SIGCHLD, but to delay running any SIGCHLD trap until the wait
exited. Since your example doesn't trap SIGCHLD, it doesn't seem
significant. Any other trapped signal still interrupts the wait. Subshells
clear the process substitution FIFO list, but you're not using process
substitution.

The one change that might make a difference is a bug fix: if the wait
builtin is waiting for a process and receives a trapped signal, it's
supposed to cause wait to return immediately and then run the trap. Bash
didn't do that consistently for SIGINT, and would run the trap when it
shouldn't, or before it should, and sometimes not return from the wait
at all. So maybe the longjmp back to the wait builtin is what changed
things, even though longjmp is one of the functions that POSIX says is
safe to call from a signal handler context, and it restores the signal
mask if you're running on a system that has sigsetjmp/siglongjmp.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: Wrong line numbers in errors

2024-02-03 Thread Matthias Urlichs via Bug reports for the GNU Bourne Again SHell

Hello Chet,

The release of 5.2 is from October 2022 and thus should include a patch 
for this issue.


To double-check, I just downloaded, patched up, and built 5.2.26. It 
also fails.

Thus I conclude that your fix from 2022 was incomplete.

Proof:

~/src/bash/bash-5.2.21$ grep define.PATCH patchlevel.h
#define PATCHLEVEL 26
~/src/bash/bash-5.2.21$ make
[…]
     * GNU bash, version 5.2.26(1)-release (x86_64-pc-linux-gnu)
~/src/bash/bash-5.2.21$ ./bash /tmp/error.sh
Test x
Test x
Test x
Test x
Test x
/tmp/error.sh: line 4: nope: unbound variable

#!/bin/bash
set -u
( echo x ) | while read f ; do
 echo Test $f
 echo Test $f
 echo Test $f
 echo Test $f
 echo Test $f
done
echo $nope


Thanks for the report. This was fixed in September, 2022, the result of
https://savannah.gnu.org/support/index.php?110714 . 



--
-- with kind regards
--
-- Matthias Urlichs





Re: wait skips signals but first one

2024-02-03 Thread Chet Ramey

On 2/3/24 10:28 AM, Mykyta Dorokhin wrote:


Analysis with strace.

After receiving SIGUSR1, Debian only blocks SIGCHLD, then clears the block:

205295 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=205327, 
si_uid=1040} ---

205295 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
205295 rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x7f063bdb5fd0}, 
{sa_handler=0x5637247940b0, sa_mask=[], sa_flags=SA_RESTORER, 
sa_restorer=0x7f063bdb5fd0}, 8) = 0

205295 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
205295 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0  # unblocks all signals


The above is the correct action.

On our device, it blocks SIGUSR1 as well as SIGCHLD and keeps doing it over 
and over again:


One explanation for this is SIGUSR1 being blocked when the shell is
invoked. Another is that sigsetjmp/siglongjmp are either not available
(or configure doesn't think they are) or don't properly save and restore
the signal mask.
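
The first explanation can be checked on Linux by reading the SigBlk field of /proc/PID/status, a hexadecimal mask in which bit (n - 1) stands for signal number n. A minimal sketch; signal_in_mask is a made-up helper name, and the signal numbers in the examples (SIGUSR1 = 10, SIGCHLD = 17) are the values used on x86 and ARM Linux:

```bash
# Succeed if signal number $2 is set in the hexadecimal mask $1
# (the format of the SigBlk field in /proc/PID/status on Linux).
signal_in_mask() {
  local mask=$1 signum=$2
  (( (16#$mask >> (signum - 1)) & 1 ))
}

# Example for the running shell (Linux only):
#   mask=$(awk '/^SigBlk:/ { print $2 }' "/proc/$$/status")
#   signal_in_mask "$mask" "$(kill -l USR1)" && echo "SIGUSR1 is blocked"
```

A mask of 0000000000010200 corresponds to the [USR1 CHLD] set in the trace from the device. Running the check at the very start of the test script would show whether SIGUSR1 is already blocked when the shell is invoked.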



6707  --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=6724, 
si_uid=0} ---

6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, NULL, [USR1 CHLD], 8) = 0
6707  write(1, ">>> TRAPPED USR1 <<<\n", 21) = 21
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  rt_sigprocmask(SIG_BLOCK, [CHLD], [USR1 CHLD], 8) = 0
6707  rt_sigprocmask(SIG_SETMASK, [USR1 CHLD], NULL, 8) = 0
6707  write(1, "Iteration\n", 10)       = 10


On modern systems, the OS blocks the signal that is caught during signal 
handling, and unblocks so that signal handlers are not called recursively. 
  The exception to this is if SA_NODEFER is set. On some very old UNIX 
systems you had to block the signal yourself, and there was a small window 
where things could go wrong. I suspect BASH probably has a build option to 
allow blocking signals in handlers for compatibility with other systems, 
and is not being built correctly for Linux. 


Bash does have an autoconf test for this, but it didn't change as part
of this push. You can check what MUST_REINSTALL_SIGHANDLERS is set to
in config.h, but I suspect it won't be different.

And `not being built correctly for Linux' would mean your Debian and my
Red Hat tests would fail.


I suspect on those very old 
systems the signal was automatically unblocked on return, but is not done 
here, because the POSIX sigprocmask is called, which requires calling it 
again to unblock the signal in Linux.  And since wait is restarted, it 
never is unblocked.


If you mean wait(2), it doesn't get restarted. waitpid(2) will return
-1/EINTR since it received a caught signal.



According to strace no additional user flags are set when the BASH signal 
handler is put in place for SIGUSR1.


Correct, the trap signal handler doesn't assume that system calls are
restarted.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: Wrong line numbers in errors

2024-02-03 Thread Chet Ramey

On 2/3/24 3:09 PM, Matthias Urlichs wrote:

Hello Chet,

The release of 5.2 is from October 2022 and thus should include a patch for 
this issue.


No. bash-5.2 was frozen earlier in September, 2022 and released September
26, 2022. That fix came in after the release (and wouldn't have been
enough to justify a fifth release candidate, anyway).


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: glob-expand-word and vi-command mode

2024-02-03 Thread Chet Ramey

On 2/2/24 9:39 AM, Mike Jonkmans wrote:

 From the manual, glob-expand-word:
   The word before point is treated as a pattern for pathname expansion,
   and the list of matching filenames is inserted, replacing the word.
   If a numeric argument is supplied, an asterisk is appended before
   pathname expansion.


glob-expand-word doesn't work that great in vi command mode, mostly for
the reasons you suspect. What made you use it over the standard vi mode
binding for `*' (vi-complete)? That behaves like POSIX says it's supposed
to in vi command mode.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Slow history load with some values of HISTSIZE

2024-02-03 Thread Casey Johnson
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -flto=auto -ffat-lto-objects -flto=auto 
-ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security 
-Wall
uname output: Linux maxwell 5.19.0-50-generic #50-Ubuntu SMP PREEMPT_DYNAMIC 
Mon Jul 10 18:24:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 16
Release Status: release

Description:

  The current implementation of the command history takes a long time to
  load a large HISTFILE when history is stifled.  This is because there is a
  memmove() for every line of HISTFILE once HISTSIZE lines have been loaded.
  If N is the number of lines in HISTFILE then the cost of the memmoves ends
  up being something like (N - HISTSIZE) * HISTSIZE, peaking when HISTSIZE
  is roughly N/2.

In my case, the history file is over 300,000 lines.  If the history is
unstifled, it takes about 0.1 seconds to load.  With HISTSIZE around 200,000
it takes over 10 seconds to load, i.e., roughly 100 times as long.
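
To put numbers on that estimate, here is a small sketch of the cost formula from the description. The function name memmove_cost is invented for illustration, and the result is the approximate count of slots moved, not a timing:

```bash
# Approximate slots moved while loading `lines` history lines with the
# history stifled at `histsize`: one memmove of about `histsize` slots
# for every line read after the first `histsize` lines.
memmove_cost() {
  local lines=$1 histsize=$2
  if ((histsize < 0 || histsize >= lines)); then
    echo 0    # unstifled, or the file fits: no sliding needed
  else
    echo $(( (lines - histsize) * histsize ))
  fi
}
```

For a 300,000-line file, memmove_cost 300000 200000 gives 20000000000 slot moves, and the N/2 peak, memmove_cost 300000 150000, gives 22500000000.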

Repeat-By:

Create a text file 'alt-history.txt' with a few hundred thousand lines, say
300k.  The content doesn't matter much.  For example, you can just cat
several copies of your actual history file to make one large file.

In a clean shell, execute:
HISTFILE=alt-history.txt
HISTSIZE=15
history -r
and then observe how long the last command runs before returning.

For comparison, in another clean shell, execute the same commands, with
unstifled history:
HISTFILE=alt-history.txt
HISTSIZE=-1
history -r
and then observe how long this takes.

If you repeat with larger and larger history files, you will observe the
runtime's quadratic growth.

Fix:

My proposed fix consists of a couple of components, which are included in
the patched history.c that is attached below:

 1. Currently, history reallocation is amortized by requesting 50
    (DEFAULT_HISTORY_GROW_SIZE) extra slots at once.  We can do something
    similar with a stifled history: allocate extra slots when HISTSIZE has
    been reached.  Each time a new line is added, simply free the first
    (oldest) entry and increment 'the_history' so the last entry is now
    available.  Slide the window by one each time and only do a memmove()
    when the extra slots are all gone.  Even allocating only 50 extra slots
    reduces the memmove() work by a factor of 50, which brings the time down
    to friendly territory.

 2. Vary the number of extra slots allocated as the history gets longer.
    The current value of 50 is adequate when the history is small, but for a
    long history it would be good to increase it so the work (both memmove()
    and realloc()) amortizes better.  For example, a value of 1024 would only
    preallocate 8KB on a 64-bit machine and 4KB on a 32-bit machine.  If
    memory is a concern for some, this value could be made a configure-time
    constant.  A better, more universal, approach would be to dynamically
    choose the number of extra slots based on the length (N) of the history.
    For example, if the extra slots are roughly equal to sqrt(N), with a
    minimum of 50, the memmove() work becomes roughly N*log(N) instead of
    N^2.  This works well for small values, but for absurdly large ones,
    too.
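
The effect of the spare slots in item 1 can be illustrated with a simplified counting model. This is not the code from the patch: count_moves and its parameters are invented here, and the model only counts how many entries a memmove() would have to copy, assuming size - 1 surviving entries per copy.

```bash
# Count entries copied while loading `lines` lines into a history stifled
# at `size` entries, with `extra` spare slots behind the sliding window.
count_moves() {
  local lines=$1 size=$2 extra=$3
  local used=0 slid=0 moves=0 i
  for ((i = 0; i < lines; i++)); do
    if ((used < size)); then
      used=$((used + 1))           # window not full yet: just append
    elif ((slid < extra)); then
      slid=$((slid + 1))           # drop oldest, advance start, use a spare slot
    else
      moves=$((moves + size - 1))  # spare slots gone: copy survivors to the front
      slid=0
    fi
  done
  echo "$moves"
}
```

Comparing count_moves 1000 100 0 with count_moves 1000 100 50 shows the reduction by roughly a factor of 50 described in item 1.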

I include below a patched history.c that implements both of these.  The
array management is all handled by new functions I called advance_history()
and hist_shift_resize().  Each time a reallocation happens, it grows by
roughly 2*sqrt(N).

The tarball also contains a second version of the file, which includes some
error checking code and a realloc() if HISTSIZE is substantially reduced.
Those pieces may be worth consideration, but I feel less strongly about
them.  I based these commits on git commit
f3b6bd19457e260b65d11f2712ec3da56cef463f from Github.

In making these patches, I have endeavored to maintain the existing style
and functionality, touching as few functions as possible.  Thank you for
your consideration and I welcome any feedback you may have.

Casey Johnson


[Attachment: bash-history.tar.gz (base64-encoded; data omitted)]

Re: glob-expand-word and vi-command mode

2024-02-03 Thread Chet Ramey

On 2/2/24 5:15 PM, Mike Jonkmans wrote:

On Fri, Feb 02, 2024 at 09:50:46AM -0500, Greg Wooledge wrote:

On Fri, Feb 02, 2024 at 03:39:54PM +0100, Mike Jonkmans wrote:

[ mkdir test; cd test; touch file1 file2 ]

Going into `vi-command' mode on the line `ls *' puts the cursor on the `*'.
Then `glob-expand-word' does nothing with the `*', it just inserts a space.
Resulting in `ls  *' (cursor still on `*').
Expected: nothing happens.


I'm not sure what keystrokes you're actually using, or what bind calls
you've done leading up to this, but in a vanilla instance of bash with
nothing done except 'set -o vi', typing

l s space * esc *

will replace the * with file1 file2 and another space, and also puts
you in insert mode for some reason.  Probably historical.


esc * is bound to insert-completions


It's bound to vi-complete, which bash replaces with something that does
the pathname expansion that POSIX requires.


(which may be better than glob-expand-word, as it doesn't need a glob)


Well, it appends a `*' if the word doesn't have any globbing chars.

If your goal is to be in vi command mode when the command completes, why
not use a macro?

bind -m vi-command '"\C-f":"*\e"'

does that.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: glob-expand-word and vi-command mode

2024-02-03 Thread Mike Jonkmans
On Sat, Feb 03, 2024 at 03:43:45PM -0500, Chet Ramey wrote:
> On 2/2/24 9:39 AM, Mike Jonkmans wrote:
> >  From the manual, glob-expand-word:
> 
> glob-expand-word doesn't work that great in vi command mode, mostly for
> the reasons you suspect. What made you use it over the standard vi mode
> binding for `*' (vi-complete)? That behaves like POSIX says it's supposed
> to in vi command mode.

I made some mappings on all *-expand-* variations for the discussion 
on shell-expand-case.
The '*' mapping seems more appropriate.

But the vi-command mode puts the cursor on the last character,
instead of one past it (as it is in vi-insert-mode).

That has the effect that '*' does not work easily.
If you type, in vi-insert-mode:
   l s space  *
files starting with 'ls' will be completed.
Expected all files to be inserted.
Not sure if that is fixable.

-- 
Regards, Mike Jonkmans



Re: glob-expand-word and vi-command mode

2024-02-03 Thread Mike Jonkmans
On Sat, Feb 03, 2024 at 04:59:08PM -0500, Chet Ramey wrote:
> On 2/2/24 5:15 PM, Mike Jonkmans wrote:
> > On Fri, Feb 02, 2024 at 09:50:46AM -0500, Greg Wooledge wrote:
> > > On Fri, Feb 02, 2024 at 03:39:54PM +0100, Mike Jonkmans wrote:
> > > > [ mkdir test; cd test; touch file1 file2 ]
> > > > Going into `vi-command' mode on the line `ls *' puts the cursor on the 
> > > > `*'.
> > > > Then `glob-expand-word' does nothing with the `*', it just inserts a 
> > > > space.
> > > > Resulting in `ls  *' (cursor still on `*').
> > > > Expected: nothing happens.
> > > I'm not sure what keystrokes you're actually using, or what bind calls
> > > you've done leading up to this, but in a vanilla instance of bash with
> > > nothing done except 'set -o vi', typing
> > > l s space * esc *
> > > will replace the * with file1 file2 and another space, and also puts
> > > you in insert mode for some reason.  Probably historical.
> > esc * is bound to insert-completions
> 
> It's bound to vi-complete, which bash replaces with something that does
> the pathname expansion that POSIX requires.

How can I find this out?
   INPUTRC=/dev/null bash --norc --noprofile
   set -o vi
   bind -m vi-insert -p
   bind -m vi-command -p
Both show no bindings for "*".
(`man 3 readline' mentions "*" on vi-complete in command mode)

> > (which may be better than glob-expand-word, as it doesn't need a glob)
> Well, it appends a `*' if the word doesn't have any globbing chars.

Seems logical.

> If your goal is to be in vi command mode when the command completes, why
> not use a macro?
> bind -m vi-command '"\C-f":"*\e"'
> does that.

Clever.

-- 
Regards, Mike Jonkmans



Re[2]: wait skips signals but first one

2024-02-03 Thread Mykyta Dorokhin
Hello,

Again, I'm on the "commit bash-20200221 snapshot" commit, the one I think breaks 
things:

https://git.savannah.gnu.org/cgit/bash.git/commit/?h=devel&id=0df4ddca3f371bc258fe4185cdec36fce3e7be7b

There is a line in trap.c with your change. If I revert it then everything 
works again:

- if (interrupt_immediately && wait_intr_flag)
+ if (/* interrupt_immediately && */wait_intr_flag)

So if I put interrupt_immediately back and rebuild the code with this as the only 
change, it starts working properly: signals are received as expected.

Can you comment? Maybe you want me to provide some additional debug info?

Thank you,
Mykyta




On 3 February 2024 at 22:09:33, "Chet Ramey" wrote:

On 2/3/24 10:00 AM, Mykyta Dorokhin wrote:

> I have found the commit on devel branch which breaks things for me (and 
> probably other Yocto-based builds):
> 
> This one still works
> ==
> 
> commit 89d788fb0152724a93e0fdab8c15116e5c76572b
> Author: Chet Ramey 
> Date:   Mon Feb 17 11:41:35 2020 -0500
> 
>     commit bash-20200214 snapshot
> 
> This one not
> ==
> 
> 
> commit 0df4ddca3f371bc258fe4185cdec36fce3e7be7b
> Author: Chet Ramey 
> Date:   Mon Feb 24 10:41:37 2020 -0500
> 
>     commit bash-20200221 snapshot
> 
> 
> 
> Please take a look. Maybe you'll notice something suspicious there. I don't 
> know... uninitialized variables, endian-dependent code, etc.

There are changes there, of course, but it's hard to see how they make a
difference. The wait builtin was changed not to interrupt the wait for a
trapped SIGCHLD, but to delay running any SIGCHLD trap until the wait
exited. Since your example doesn't trap SIGCHLD, it doesn't seem
significant. Any other trapped signal still interrupts the wait. Subshells
clear the process substitution FIFO list, but you're not using process
substitution.

The one change that might make a difference is a bug fix: if the wait
builtin is waiting for a process and receives a trapped signal, it's
supposed to cause wait to return immediately and then run the trap. Bash
didn't do that consistently for SIGINT, and would run the trap when it
shouldn't, or before it should, and sometimes not return from the wait
at all. So maybe the longjmp back to the wait builtin is what changed
things, even though longjmp is one of the functions that POSIX says is
safe to call from a signal handler context, and it restores the signal
mask if you're running on a system that has sigsetjmp/siglongjmp.
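
A minimal sketch for checking this behaviour on a given build: a trapped
signal that arrives while `wait` is blocked should make `wait` return early
with a status greater than 128, and the trap should run. The variable names
and the choice of SIGUSR1 are illustrative assumptions, not taken from the
original report.

```bash
# Sketch: a trapped signal should interrupt `wait` (status > 128).
trap_ran=0
trap 'trap_ran=1' USR1        # any trapped signal other than SIGCHLD

sleep 5 &                     # the job we wait for
pid=$!

( sleep 1; kill -USR1 $$ ) &  # signal the waiting shell after one second

status=0
wait "$pid" || status=$?      # expected to return early, before sleep ends

kill "$pid" 2>/dev/null || true   # clean up the still-running sleep
echo "wait status: $status, trap ran: $trap_ran"
```

On a build affected by the problem reported in this thread, the status or
the trap flag may differ, which is what makes this a useful quick check.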

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/


Feature request: prompt strings in output from edit-and-execute-command readline function ( was About `M-C-e` expand result `'` failed )

2024-02-03 Thread Zachary Santer
On Fri, Feb 2, 2024 at 4:21 PM Chet Ramey  wrote:

> OK, I'll take that as a feature request for a future version.
>

While I'm making feature requests.

I hit C-x C-e and enter the following into my editor:

var='duck'
declare -p var
(
  var='squirrel'
  declare -p var
)
declare -p var

I save that and exit my editor, and this is what I get in the terminal:

zsant@Zack2021HPPavilion MINGW64 ~
$
var='duck'
declare -p var
declare -- var="duck"
(
  var='squirrel'
  declare -p var
)
declare -- var="squirrel"
declare -p var
declare -- var="duck"

Kinda confusing, right?

I could see adding PS1 and PS2 prompts where they would've been, had I
typed all my commands into the terminal directly. PS2 prompts in front of
everything that was entered in the editor might be preferable, though,
given the complex PS1s you see sometimes.

PS1s and PS2s:

zsant@Zack2021HPPavilion MINGW64 ~
$ var='duck'

zsant@Zack2021HPPavilion MINGW64 ~
$ declare -p var
declare -- var="duck"

zsant@Zack2021HPPavilion MINGW64 ~
$ (
>   var='squirrel'
>   declare -p var
> )
declare -- var="squirrel"

zsant@Zack2021HPPavilion MINGW64 ~
$ declare -p var
declare -- var="duck"

All PS2s:

> var='duck'
> declare -p var
declare -- var="duck"
> (
>   var='squirrel'
>   declare -p var
> )
declare -- var="squirrel"
> declare -p var
declare -- var="duck"

I do think I prefer all PS2s, myself. Makes a distinction between things
entered in a text editor with edit-and-execute-command and things entered
directly on the command line, and also makes a distinction between commands
and their output.

Would be nice.

Zack


Re: Feature request: prompt strings in output from edit-and-execute-command readline function ( was About `M-C-e` expand result `'` failed )

2024-02-03 Thread Oğuz
On Sunday, February 4, 2024, Zachary Santer  wrote:
>
> While I'm making feature requests.
>

I want to make one too. Multiline commands retrieved from history can be
short-circuited like this:

  $ ls
  x  y
  $ echo '
  > rm *
  > ' >z
  $
  $ ls
  x  y  z
  $ echo ''
  rm *
  ' >z

  > ^C
  $ ls
  $

After the second invocation of `ls' I add a single quote and hit enter
accidentally, and lose all my files. I wish it didn't happen.


-- 
Oğuz


Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Martin D Kealey
On Sun, 4 Feb 2024, 02:01 Koichi Murase,  wrote:

> I now think I should leave a comment because even Martin (who I believed
> was one of the careful people about backward compatibility as seen in
> [1,2]) seems to suggest a breaking change.
>

That's a fair point.

I am generally concerned about breaking changes that affect existing
scripts, but I see changes to readline as less problematic, since an
interactive user gets the chance to check the replacement before hitting
enter.

Yes, users have to learn new behaviours, which isn't ideal, but if my
suggested "unquote" ("dequote"? "strip-quoting"?) bindable function was
also added, the impact on users would be minimal: press M-C-e and then a
second keypress to remove quotes, bringing the input buffer to the same
state as would occur under the current arrangement. (By choosing a default
binding for "unquote" that currently does nothing, people could then use
the same keyboard arpeggio on both old and new bash.)

> If the requoting behavior would be desired, I strongly suggest keeping the
> existing behavior of shell-expand-line but adding a separate new
> bindable function (like shell-expand-and-requote-line) to perform the
> expansion and requoting.
>

On reflection, this would be a fair compromise, at least in the short term.

Might we offer guidance that distros include a new binding for C-M-e in
their supplied /etc/skel which would only affect new users, not existing
users?

-Martin


Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Koichi Murase
Thank you for the discussion.

On Sun, Feb 4, 2024 at 12:59, Martin D Kealey wrote:
> I am generally concerned about breaking changes that affect existing
> scripts, but I see changes to readline as less problematic,

I also assume shell scripts, but shell scripts for interactive
settings.  Interactive settings can also be a large-scale shell
script.  There are many frameworks---such as fzf, atuin, autojump,
zoxide, sbp, liquidprompt, etc.

> since an interactive user gets the chance to check the replacement
> before hitting enter.

I don't think we can ensure that the users *always* have a chance to
check it.  It is possible to set up the readline macro so that it
immediately runs the expanded command (as I implied in my previous
reply by including \r at the end of an example macro string).  If it
is possible, then some users somewhere will be relying on that
behavior, no matter whether one person considers that "this is almost
never useful" or that "an interactive user gets the chance".

In fact, I think there is a motivation to do so: To force
re-evaluation of PS1 to update the prompt (e.g. after changing the
current directory), a command needs to be executed through the
readline bindable function `accept-line'.  For example, one can
generate « cd '...' » by "\M-\C-e" and execute it by C-m in a single
macro string.
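
A rough sketch of the kind of macro meant here, in the style of keybindings
that shell frameworks install. The function name `__demo_cd_command__`, the
fixed directory, and the key sequence `C-x d` are all hypothetical, chosen
only for illustration:

```bash
# Hypothetical helper: prints a ready-to-run, shell-quoted cd command.
__demo_cd_command__() {
  local dir=/tmp              # a real framework would let the user pick this
  printf 'cd -- %q' "$dir"
}

# Bind only in interactive shells, where line editing is available.
if [[ $- == *i* ]]; then
  # C-e C-u clears the line, the backquoted call is typed in,
  # \e\C-e (M-C-e) expands it in place, and \C-m accepts the line.
  bind '"\C-xd": "\C-e\C-u`__demo_cd_command__`\e\C-e\C-m"'
fi
```

Because the expanded line is accepted inside the same macro, the user never
sees it before it runs, so any change to what "\M-\C-e" produces changes
what such a macro executes.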

> Yes users have to learn new behaviours, which isn't ideal, but if my
> suggested "unquote" ("dequote"? "strip-quoting"?) bindable function
> was also added, the impact on users would be minimal: press M-C-e
> and then a second keypress to remove quotes, bringing the input
> buffer to the same state as would occur under the current
> arrangement. (By choosing a default binding for "unquote" that
> currently does nothing, people could then use the same keyboard
> arpeggio on both old and new bash.)

That's right, but I'm not talking about the cases where the user
directly inputs "\M-\C-e" on the keyboard.  I'm talking about macros
set up using « bind '"...": "..."' » by shell frameworks.  To me, the
above paragraph seems to be irrelevant to the present discussion.

> > If the requoting behavior would be desired, I strongly suggest
> > keeping the existing behavior of shell-expand-line but adding a
> > separate new bindable function (like
> > shell-expand-and-requote-line) to perform the expansion and
> > requoting.
>
> On reflection, this would be a fair compromise, at least in the
> short term.

Does it need to be short-term?  Do we need to remove the feature?

> Might we offer guidance that distros include a new binding for C-M-e
> in their supplied /etc/skel which would only affect new users, not
> existing users?

I have a naive question.  Why do people on this thread discuss
changing the behavior of "\M-\C-e" even though there is still a large
space of key combinations?  "\M-\C-e" is already used in existing
shell frameworks for interactive uses, e.g. an old version of fzf
keybindings used "\M-\C-e" in macros for every Bash version (yet
currently only for bash <= 3.2).  There would still be customized
versions of fzf that are not updated, or other personal configurations
inspired by the fzf keybindings.  I don't see a reason to introduce
unnecessary conflicts when we can just pick another key combination
(e.g. "\M-\C-x") for `shell-expand-and-requote-line'.

--
Koichi



Re: About `M-C-e` expand result `'` failed

2024-02-03 Thread Martin D Kealey
On Sun, 4 Feb 2024 at 15:17, Koichi Murase  wrote:

> On Sun, Feb 4, 2024 at 12:59, Martin D Kealey wrote:
> > I am generally concerned about breaking changes that affect existing
> > scripts, but I see changes to readline as less problematic,
>
> I also assume shell scripts, but shell scripts for interactive settings.
> Interactive settings can also be a large-scale shell script.


That is a possibility that I hadn't considered. Thank you for bringing it to
my attention.

> > On reflection, this would be a fair compromise, at least in the short
> > term.
>
> Does it need to be short-term?  Do we need to remove the feature?
>

Not necessarily; I just hadn't thought it through enough to convince myself
that it should necessarily remain, so I didn't want to commit to the long
term.

> > Might we offer guidance that distros include a new binding for M-C-e in
> > their supplied /etc/skel which would only affect new users, not existing
> > users?
>
> I have a naive question.  Why do people on this thread discuss changing
> the behavior of "\M-\C-e" even though there is still a large space of key
> combinations?  "\M-\C-e" is already used [lots of places...]
> I don't see a reason to introduce unnecessary conflicts when we can just
> pick another key combination (e.g. "\M-\C-x") for
> `shell-expand-and-requote-line'.
>

When you put it like that, it seems entirely reasonable that M-C-e should
remain as-is.
Consider all my previous suggestions to the contrary withdrawn.

-Martin

PS: Sadly M-C-r seems to be already taken, so I can't just hop one key over.