Re: recent typo in sig.c breaks Minix compilation

2024-10-18 Thread Chet Ramey

On 10/17/24 9:15 PM, Martin D Kealey wrote:

It looks like a recent (last year) typo in sig.c breaks Minix compilation:


Thanks.


Talking of which, I note several places where there's a construct like:

#ifdef FOO
if (foo && zot)
#else
if (zot)
#endif


Unfortunately this confuses both the indent program and some editors.


I'm good with that style. I prefer it, I don't use `indent', and my
editor doesn't mind it.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: recent typo in sig.c breaks Minix compilation

2024-10-18 Thread Oğuz
On Friday, October 18, 2024, Martin D Kealey 
wrote:
>
> The problematic code for editors looks like this:
>
> #ifdef FOO
> if (foo && zot) {
> #else
> if (bar && zot) {
> #endif
>
> Vim certainly does not like this,
>

How about this:

#ifdef FOO
if (foo && zot)
#else
if (bar && zot)
#endif
{


-- 
Oğuz


[PATCH] isearch: fix with HANDLE_MULTIBYTE && LC_ALL=C

2024-10-18 Thread Grisha Levit
Sorry, previous patch should have been:
---
 lib/readline/isearch.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/readline/isearch.c b/lib/readline/isearch.c
index 9ddf9ce3..c6d8c8a8 100644
--- a/lib/readline/isearch.c
+++ b/lib/readline/isearch.c
@@ -742,10 +742,11 @@ opcode_dispatch:
 /* Add character to search string and continue search. */
 default:
 #if defined (HANDLE_MULTIBYTE)
-  wlen = (cxt->mb[0] == 0 || cxt->mb[1] == 0) ? 1 : RL_STRLEN (cxt->mb);
-#else
-  wlen = 1;
+  if (MB_CUR_MAX > 1 && rl_byte_oriented == 0)
+   wlen = (cxt->mb[0] == 0 || cxt->mb[1] == 0) ? 1 : RL_STRLEN (cxt->mb);
+  else
 #endif
+   wlen = 1;
   if (cxt->search_string_index + wlen + 1 >= cxt->search_string_size)
{
  cxt->search_string_size += 128;   /* 128 much greater than 
MB_CUR_MAX */
-- 
2.47.0




[PATCH] isearch: fix with HANDLE_MULTIBYTE && LC_ALL=C

2024-10-18 Thread Grisha Levit
HISTFILE= INPUTRC=/ LC_ALL=C bash --norc -in <<< $'\cR.'

WARNING: MemorySanitizer: use-of-uninitialized-value
 #0 _rl_isearch_dispatch isearch.c:745:31
 #1 rl_search_historyisearch.c:926:11
 #2 rl_reverse_search_historyisearch.c:135:11
 #3 _rl_dispatch_subseq  readline.c:941:8
---
 lib/readline/isearch.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/readline/isearch.c b/lib/readline/isearch.c
index 9ddf9ce3..09e981e6 100644
--- a/lib/readline/isearch.c
+++ b/lib/readline/isearch.c
@@ -742,9 +742,11 @@ opcode_dispatch:
 /* Add character to search string and continue search. */
 default:
 #if defined (HANDLE_MULTIBYTE)
-  wlen = (cxt->mb[0] == 0 || cxt->mb[1] == 0) ? 1 : RL_STRLEN (cxt->mb);
+  if (MB_CUR_MAX > 1 && rl_byte_oriented == 0)
+   wlen = (cxt->mb[0] == 0 || cxt->mb[1] == 0) ? 1 : RL_STRLEN (cxt->mb);
+  else
 #else
-  wlen = 1;
+   wlen = 1;
 #endif
   if (cxt->search_string_index + wlen + 1 >= cxt->search_string_size)
{
-- 
2.47.0




Re: [PATCH] histfile: fix mmap page alignment

2024-10-18 Thread Chet Ramey

On 10/17/24 8:27 PM, Grisha Levit wrote:

The mmap in history_do_write would usually fail when appending because
the offset must be a multiple of the page size.


Thanks for the report. Not everyone requires this, but since some systems
do, it's better to do this for portability.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: recent typo in sig.c breaks Minix compilation

2024-10-18 Thread Martin D Kealey
On Fri, 18 Oct 2024, 13:09 Oğuz,  wrote:

> On Friday, October 18, 2024, Martin D Kealey 
> wrote:
>>
>> Talking of which, I note several places where there's a construct like:
>>
>> #ifdef FOO
>> > if (foo && zot)
>> > #else
>> > if (zot)
>> > #endif
>>
>>
>> Unfortunately this confuses both the indent program and some editors.
>>
>
> It looks a lot cleaner than the alternatives you provided.
>

I would a agree that my first suggestion is untidy, even ugly; I only
provide it because it is the minimal adjustment that allows "indent" to
function properly.

However my second suggestion is arguably *cleaner* than the current style,
since it doesn't mash up an asymmetric part of a compound statement, and is
an even better improvement where the entire #if/#endif group is (currently)
repeated.

But it requires thought and can't be automated. Is there some other
objection?

What are the names of these editors?
>

Ok that wasn't the best example of code that confuses editors.

The problematic code for editors looks like this:

#ifdef FOO
if (foo && zot) {
#else
if (bar && zot) {
#endif

Vim certainly does not like this, and if you have an editor that does NOT
get confused by this, I'd like to know. (Not counting editors that entirely
lack the ability to match braces.)

To be fair, cuddled braces on compound statements are the minority.

-Martin


Re: [PATCH] histfile: fix mmap page alignment

2024-10-18 Thread Chet Ramey

On 10/18/24 12:19 PM, Grisha Levit wrote:

On Fri, Oct 18, 2024 at 12:15 PM Chet Ramey  wrote:


On 10/18/24 11:51 AM, Chet Ramey wrote:

On 10/17/24 8:27 PM, Grisha Levit wrote:

The mmap in history_do_write would usually fail when appending because
the offset must be a multiple of the page size.


Thanks for the report. Not everyone requires this, but since some systems
do, it's better to do this for portability.


Of course, the fact that nothing defines HISTORY_USE_MMAP limits the
impact somewhat.


I was just composing a question about this -- is this intentional? It looks
like it used to be behind HAVE_MMAP (which does get set) but changed soon
after.


Probably. I can't remember -- it was over 20 years ago. My guess is I
did it as an experiment and never went back to it because performance
was acceptable.

I suppose we can try it again, since everyone has mmap with more-or-less
the same behavior now.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH] histfile: fix mmap page alignment

2024-10-18 Thread Grisha Levit
On Fri, Oct 18, 2024 at 12:15 PM Chet Ramey  wrote:
>
> On 10/18/24 11:51 AM, Chet Ramey wrote:
> > On 10/17/24 8:27 PM, Grisha Levit wrote:
> >> The mmap in history_do_write would usually fail when appending because
> >> the offset must be a multiple of the page size.
> >
> > Thanks for the report. Not everyone requires this, but since some systems
> > do, it's better to do this for portability.
>
> Of course, the fact that nothing defines HISTORY_USE_MMAP limits the
> impact somewhat.

I was just composing a question about this -- is this intentional? It looks
like it used to be behind HAVE_MMAP (which does get set) but changed soon
after.

I guess I wouldn't expect much of a performance difference between mmap-
based and regular I/O for sequential access patterns like the ones here,
though some rough testing I just did showed about a 30% improvement.



Re: [PATCH] histfile: fix mmap page alignment

2024-10-18 Thread Chet Ramey

On 10/18/24 11:51 AM, Chet Ramey wrote:

On 10/17/24 8:27 PM, Grisha Levit wrote:

The mmap in history_do_write would usually fail when appending because
the offset must be a multiple of the page size.


Thanks for the report. Not everyone requires this, but since some systems
do, it's better to do this for portability.


Of course, the fact that nothing defines HISTORY_USE_MMAP limits the
impact somewhat.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH] rl_tilde_expand: avoid uninitialized memory use

2024-10-18 Thread Chet Ramey

On 10/18/24 1:14 AM, Grisha Levit wrote:

 $ bash --norc -in <<< $'\e&'

 WARNING: MemorySanitizer: use-of-uninitialized-value
 #0 rl_tilde_expand lib/readline/util.c:208:10


Thanks for the report.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: IFS whitespace definition

2024-10-18 Thread Chet Ramey

On 10/17/24 6:44 PM, Greg Wooledge wrote:

This issue came up on the Libera #bash IRC channel today:

Between bash 4.4 and 5.0, the definition of "IFS whitespace" has apparently
been expanded:


POSIX defines whitespace as a character in the current locale's `space'
character class, or a byte for which isspace() returns true. The word
splitting section references this definition, but leaves it up to the
application whether or not characters besides space/tab/newline are
considered IFS whitespace when they appear in $IFS.

At the time (previous edition of the standard), POSIX defined whitespace
as "In the POSIX locale, white space consists of one or more  (
 and  characters), , , ,
and  characters." The word splitting section wasn't quite
as rigorous as the current version's, but it referenced this definition.

However, the conformance suite tests for this.

Before bash-5.0, Oracle contacted me about the results of their running
bash-4.4 through the conformance suite (they were considering shipping
the next version of Solaris with bash as the POSIX shell and wanted it
to pass the tests). Now, I had not run bash through this test suite
myself -- that came later -- so I took them at their word.

There were a couple of `read' tests for exactly this, including making
sure that leading and trailing whitespace got stripped if the (non-
space/tab/newline) characters were in $IFS.

So I changed it -- the test suite, something that companies have to pay
to take and want to pass, was supposed to reflect the normative text --
and shipped bash-5.0. Oracle was happy, this was a minor change that
affected few people, and then Oracle canceled Solaris 12 and decided to
stick with Solaris 11 forever.



In bash 4.4 and earlier, IFS whitespace is always space + tab + newline.
But in 5.0 and later, it's "whatever the locale's isspace() allows",


Yep, as POSIX specifies.


along with some kind of 0x00 to 0x7f range check (thanks emanuele6).


The comment in locale_setblanks explains this: some systems, like macOS,
return true from isspace() for characters between 0x80 and 0xff even
though they introduce multibyte characters (every locale besides "C"
in macOS uses UTF-8 encoding). Grisha reported it:

https://lists.gnu.org/archive/html/bug-bash/2023-05/msg00132.html



Now, that's not necessarily bad, but the man page still says:


Yes, the man page and info file need to evolve in the same manner as
the standard: define whitespace and then reference it as needed.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature


history-search-* and undo lists

2024-10-18 Thread Grisha Levit
There's some issue with undo list handling in history-search-* commands:

Doing a successful search with a line that has an undo list causes the
undo entries from that list to leaked:

HISTFILE= INPUTRC=/ bash --norc -in <<< $'X\nX\e[5~'
=
ERROR: LeakSanitizer: detected memory leaks
Direct leak of 32 byte(s) in 1 object(s) allocated from:
 #2 alloc_undo_entry undo.c:75:23
 #3 rl_add_undo  undo.c:92:10
 #4 rl_insert_text   text.c:114:2
 #5 _rl_insert_char  text.c:935:7
 #6 rl_inserttext.c:989:42
 #7 _rl_dispatch_subseq  readline.c:941:8
 #8 _rl_dispatch readline.c:876:10
 #9 readline_internal_char   readline.c:690:11
#10 readline_internal_charloop   readline.c:737:11
#11 readline_internalreadline.c:749:18
#12 readline readline.c:387:11


...and running the search again causes a use-after-free when accessing
prev_line_found in rl_history_search_internal:

HISTFILE= INPUTRC=/ bash --norc -in <<< $'X\nX\nX\e[5~\e[5~'
=
ERROR: AddressSanitizer: heap-use-after-free on address 0xe53d8842ae50
READ of size 1 at 0xe53d8842ae50 thread T0
 #0 rl_history_search_internal   search.c:634:30
 #1 rl_history_search_backward   search.c:746:11
 #2 _rl_dispatch_subseq  readline.c:941:8

0xe53d8842ae50 is 0 bytes in 2-byte region [0xe53d8842ae50,0xe53d8842ae52)
freed by thread T0 here:
 #2 _rl_maybe_replace_line   misc.c:344:7
 #3 rl_history_search_internal   search.c:609:3
 #4 rl_history_search_backward   search.c:746:11
 #5 _rl_dispatch_subseq  readline.c:941:8

previously allocated by thread T0 here:
 #2 alloc_history_entry  history.c:296:25
 #3 add_history  istory.c:428:10
 #4 really_add_history   bashhist.c:963:3
 #5 bash_add_history bashhist.c:948:5


...or when using revert-all-newline:

echo 'set revert-all-at-newline on' > i
HISTFILE= INPUTRC=i bash --norc -in <<< $'X\nX\e[5~\exnext-history\nX'
=
ERROR: AddressSanitizer: heap-use-after-free on address 0xef01e9428830
READ of size 8 at 0xef01e9428830 thread T0
 #0 _rl_free_undo_list   undo.c:106:16
 #1 rl_free_undo_listundo.c:122:3
 #2 readline_common_teardown readline.c:507:5
 #3 readline_internal_teardown   readline.c:518:3

0xef01e9428830 is 0 bytes in 32-byte region [0xef01e9428830,0xef01e9428850)
freed by thread T0 here:
 #2 rl_do_undo   undo.c:267:7
 #3 _rl_revert_previous_linesmisc.c:500:6
 #4 _rl_revert_all_lines misc.c:530:3
 #5 readline_common_teardown readline.c:502:5
 #6 readline_internal_teardown   readline.c:518:3

previously allocated by thread T0 here:
 #2 alloc_undo_entry undo.c:75:23
 #3 rl_add_undo  undo.c:92:10
 #4 rl_end_undo_groupundo.c:305:3
 #5 _rl_replace_text text.c:205:3
 #6 make_history_line_currentsearch.c:134:3
 #7 rl_history_search_internal   search.c:667:3
 #8 rl_history_search_backward   search.c:746:11



Re: IFS whitespace definition

2024-10-18 Thread Chet Ramey

On 10/17/24 10:33 PM, Robert Elz wrote:

 Date:Thu, 17 Oct 2024 18:44:59 -0400
 From:Greg Wooledge 
 Message-ID:  

   | Between bash 4.4 and 5.0, the definition of "IFS whitespace" has apparently
   | been expanded:

Not going to comment on bash, or its manual, specifically, but the
standard allows anything that the locale defines as in the space
category of the current locale to be IFS whitespace, if present in IFS ...
but requires that space, tab, and newline (again, if present in IFS) be
IFS whitespace, whatever the locale specifies.


The current version of the standard is the result of it relaxing the
previous edition's requirements. I don't know if the test suite still
tests for behavior with \r\b\f\v in $IFS.


Which (of any) character other than space, tab, and newline (which are in
the locale's space class) count as IFS whitespace is up to the implementation.


Now. Not during the period we're talking about. And the reason it got
relaxed is because there were implementations that didn't conform to the
previous edition's text.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/


OpenPGP_signature.asc
Description: OpenPGP digital signature