printf %ls conversion

2023-07-17 Thread Grisha Levit
The mbsrtowcs call here doesn't convert the final \0 of mbs, leaving the
final byte of ws as whatever was just malloc-ed.

Noticed in an ASAN build which makes sure that this is never L'\0'.  Oddly,
this didn't actually trigger an ASAN report, just saw that nothing was
getting printed by printf.
---
diff --git a/builtins/printf.def b/builtins/printf.def
index ad4f4d12..62820514 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -1493,7 +1493,7 @@ getwidestr (size_t *lenp)
   mbs = garglist->word->word;
   slen = strlen (mbs);
   ws = (wchar_t *)xmalloc ((slen + 1) * sizeof (wchar_t));
-  mblength = mbsrtowcs (ws, &mbs, slen, &state);
+  mblength = mbsrtowcs (ws, &mbs, slen + 1, &state);
   if (lenp)
     *lenp = mblength;
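
To illustrate the effect of the length argument outside of Bash, here is a minimal standalone sketch. The helper names convert_prefix and terminator_written are hypothetical and do not appear in printf.def; the buffer is pre-filled with a sentinel to stand in for freshly malloc-ed memory, and the input is assumed to be ASCII so that it converts in the default "C" locale.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of printf.def: convert at most `limit`
   wide characters from `mbs` into `ws`, starting from the initial shift
   state, and return whatever mbsrtowcs returns. */
size_t
convert_prefix (wchar_t *ws, const char *mbs, size_t limit)
{
  mbstate_t state;
  const char *src = mbs;

  assert (ws != NULL && mbs != NULL);
  memset (&state, 0, sizeof (state));
  return mbsrtowcs (ws, &src, limit, &state);
}

/* Return 1 if a conversion limited to `limit` wide characters stored a
   terminating L'\0' right after the converted text, 0 if the sentinel
   that was there beforehand survived. */
int
terminator_written (const char *mbs, size_t limit)
{
  wchar_t ws[64];
  size_t i, n;

  assert (limit <= 63);
  for (i = 0; i < 64; i++)
    ws[i] = L'#';                /* stand-in for uninitialized memory */

  n = convert_prefix (ws, mbs, limit);
  assert (n != (size_t)-1);      /* input must be valid in this locale */
  return ws[n] == L'\0';
}
```

For the three-character string "abc", terminator_written ("abc", 3) returns 0 because the sentinel survives, while passing 4 (that is, slen + 1) returns 1.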


Docs: "shell-transpose-words" in "8.4.4 Killing And Yanking"

2023-07-17 Thread Sebastian Carlos
Hello,

The "shell-transpose-words" command is documented in the section "8.4.4
Killing And Yanking", but I think it should be in "8.4.3 Commands For
Changing Text".

That way, just like the other "shell-*" commands, it would be documented
next to its counterpart "transpose-words".

Thanks.


Re: printf %ls conversion

2023-07-17 Thread Chet Ramey

On 7/17/23 3:19 AM, Grisha Levit wrote:

> The mbsrtowcs call here doesn't convert the final \0 of mbs, leaving the
> final byte of ws as whatever was just malloc-ed.


Thanks for the report.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: [PATCH] print regerror string on regcomp error

2023-07-17 Thread Chet Ramey

On 7/12/23 12:01 PM, Grisha Levit wrote:

> Since bash-5.3 now shows an error message when a regular expression
> can't be compiled, I thought it might be useful to add the
> regerror()-supplied string that provides more specifics on the
> failure, so we can get messages like:
>
> $ [[ x =~ [z-a] ]]
> bash: [[: invalid regular expression `[z-a]': invalid character range


Thanks for the patch. I guess we'll see how much the text of regerror(3)
error messages varies across regexp implementations.

Chet




Re: [PATCH] print regerror string on regcomp error

2023-07-17 Thread Grisha Levit
On Mon, Jul 17, 2023, 12:09 Chet Ramey wrote:

> I guess we'll see how much the text of regerror(3)
> error messages varies across regexp implementations.
>

Is that a concern? The GNU strings [1] are translated and even their
untranslated text does differ from that of the BSD ones [2].

The set of errors is also different, e.g. REG_EMPTY is only in BSD because
its regcomp() does not accept `()' while GNU's does.  But it's precisely
this difference that I think would help clue users into why their scripts
work on one platform and not another.
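
As a rough sketch of what surfacing the regerror(3) text involves (this is not the patch itself, and describe_regcomp_failure is a hypothetical name), the following compiles a pattern with regcomp(3) and, on failure, copies the implementation's message into a caller-supplied buffer. The wording of that message depends on the C library in use, so no expected text is shown.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: try to compile `pattern` as a POSIX extended
   regular expression.  Returns 0 on success and leaves `buf` empty;
   otherwise returns regcomp's error code and stores the description
   supplied by regerror in `buf`. */
int
describe_regcomp_failure (const char *pattern, char *buf, size_t buflen)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && buf != NULL && buflen > 0);
  buf[0] = '\0';

  rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  if (rc != 0)
    regerror (rc, &re, buf, buflen);   /* text varies by implementation */
  else
    regfree (&re);

  return rc;
}
```

Calling it with "[z-a]" should yield a non-zero code and a non-empty message on both GNU and BSD libraries, though the code and the text may differ between them.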


[1]:
https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/regcomp.c;h=12650714c06890d99a6a7aaa96437a2f83cc973b;hb=HEAD#l132

[2]:
https://cgit.freebsd.org/src/tree/lib/libc/regex/regerror.c?id=8a16b7a18f5d0b031f09832fd7752fba717e2a97#n90


Re: [PATCH] print regerror string on regcomp error

2023-07-17 Thread Chet Ramey

On 7/17/23 1:10 PM, Grisha Levit wrote:


> On Mon, Jul 17, 2023, 12:09 Chet Ramey wrote:
>
> > I guess we'll see how much the text of regerror(3)
> > error messages varies across regexp implementations.
>
> Is that a concern? The GNU strings [1] are translated and even their
> untranslated text does differ from that of the BSD ones [2].


Not particularly a concern -- it's mostly for the test suite. We have the
same problem with the error messages produced by strerror(3) and there are
existing warnings for those.





Re: [PATCH] normalization tweaks for macOS

2023-07-17 Thread Chet Ramey

On 7/7/23 5:05 PM, Grisha Levit wrote:

> A few small tweaks for the macOS-specific normalization handling to
> handle the issues below:


The issue is that the behavior has to be different between cases where
the shell is reading input from the terminal and gets NFC characters
that need to be converted to NFD (which is how HFS+ and APFS store them)
and when the shell is reading input from a file and doesn't need to (and
should not) do anything with NFD characters.

https://lists.gnu.org/archive/html/bug-bash/2009-07/msg00047.html from 2009
is the detailed report that inspired the current interactive-only solution
for completion, followed up with

https://lists.gnu.org/archive/html/bug-bash/2010-01/msg00017.html

Does iconv work when taking NFD input that came from the file system and
trying to convert it to NFD (UTF-8-MAC)? I've honestly never checked.





Re: Docs: "shell-transpose-words" in "8.4.4 Killing And Yanking"

2023-07-17 Thread Chet Ramey

On 7/17/23 4:06 AM, Sebastian Carlos wrote:

> Hello,
>
> The "shell-transpose-words" command is documented in the section "8.4.4
> Killing And Yanking", but I think it should be in "8.4.3 Commands For
> Changing Text".
>
> That way, just like the other "shell-*" commands, it would be documented
> next to its counterpart "transpose-words".


Thanks for the suggestion, it's reasonable.





Re: Some Readline variables are not available in "bind -v"

2023-07-17 Thread Chet Ramey

On 7/16/23 9:25 AM, Sebastian Carlos wrote:

> Hello,
>
> I noticed that the following variables are not visible in "bind -v":
> - active-region-start-color
> - active-region-end-color


Thanks for the report. These two variables aren't reported by `bind -v'
if they have a value, so that needs to be fixed. However:

These variables don't have a default value and won't be reported until you
set them to something. The display code uses whatever the terminfo entry
reports for the terminal's so/se capabilities if they aren't set.


> - isearch-terminators


Similarly, this variable doesn't have a default value and isn't reported
until you set it to something. The isearch code uses a default value if
it's not set.

Chet





Re: history-size reported as "0" even is value is "-1"

2023-07-17 Thread Chet Ramey

On 7/16/23 7:28 AM, Sebastian Carlos wrote:

> Hello,
>
> history-size is documented as follows:
>
> > If set to a value less than zero, the number of history entries is not
> > limited.
>
> But after setting it to "-1", for example with this line in .inputrc:
>
>     set history-size -1
>
> It is still reported as "0" in "bind -v":
>
>     set history-size 0


Thanks for the report. The right thing happens either way, but readline
should report the value as < 0 so the output of `bind -v' can be reused
as input.

Chet





Re: [PATCH] normalization tweaks for macOS

2023-07-17 Thread Grisha Levit
On Mon, Jul 17, 2023 at 3:29 PM Chet Ramey wrote:
>
> On 7/7/23 5:05 PM, Grisha Levit wrote:
> > A few small tweaks for the macOS-specific normalization handling to
> > handle the issues below:
>
> The issue is that the behavior has to be different between cases where
> the shell is reading input from the terminal and gets NFC characters
> that need to be converted to NFD (which is how HFS+ and APFS store them)
> and when the shell is reading input from a file and doesn't need to (and
> should not) do anything with NFD characters.

NB: while HFS+ stores NFD names, APFS preserves normalization, so we
can get either NFC or NFD text back from readdir.  Both are
normalization-insensitive: "Being normalization-insensitive ensures
that normalization variants of a filename cannot be created in the
same directory, and that a filename can be found with any of its
normalization variants." [1]

Currently, Bash never actually converts to NFD.  The fnx_tofs()
function is there but it is never used.  Instead, Bash converts
filenames to NFC with fnx_fromfs() before comparing with either the
glob pattern or the completion hint text (which is never converted).

Since access is normalization-insensitive, we just need to normalize
to _some_ form, so going to NFC is fine, but if we're going to do that
we should normalize both the filesystem name and the text being
compared.

If there's a match, globs expand to the filenames (NFC or NFD) as
returned by readdir(), and Readline completes with NFC-normalized
versions of the names.  I think this makes sense.

What doesn't work quite right currently though is that glob patterns
with NFD text never match anything, and completion prefixes with NFD
text never expand to anything.
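
The point about normalizing both sides can be shown with a toy sketch. This is not how Bash does it (Bash goes through iconv with UTF-8-MAC); toy_to_nfc and names_match_normalized are hypothetical names, and the "normalizer" only knows one sequence: "e" followed by U+0301 (octal bytes 145 314 201), which it rewrites as the precomposed U+00E9 (octal bytes 303 251).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a real NFC conversion, for illustration only: copy
   `src` to `dst`, replacing the decomposed sequence "e" + U+0301 with
   the precomposed U+00E9.  Output is truncated if `dst` is too small. */
void
toy_to_nfc (char *dst, size_t dstlen, const char *src)
{
  size_t out = 0;

  assert (dst != NULL && src != NULL && dstlen > 0);
  while (*src != '\0' && out + 2 < dstlen)
    {
      if (strncmp (src, "\145\314\201", 3) == 0)
        {
          dst[out++] = '\303';
          dst[out++] = '\251';
          src += 3;
        }
      else
        dst[out++] = *src++;
    }
  dst[out] = '\0';
}

/* Return 1 if the file name and the user-supplied text are equal once
   BOTH have been passed through the same normalization, else 0. */
int
names_match_normalized (const char *name, const char *text)
{
  char a[256], b[256];

  toy_to_nfc (a, sizeof (a), name);
  toy_to_nfc (b, sizeof (b), text);
  return strcmp (a, b) == 0;
}
```

A byte-wise strcmp of "caf\303\251" against "cafe\314\201" reports a difference, which corresponds to the NFD-pattern failure described above; names_match_normalized treats the two as equal because it normalizes both operands before comparing.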

[1]: 
https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html

> Does iconv work when taking NFD input that came from the file system and
> trying to convert it to NFD (UTF-8-MAC)? I've honestly never checked.

Converting to UTF-8-MAC always normalizes to NFD:

$ printf '\303\251\0\145\314\201' | iconv -f UTF-8-MAC -t UTF-8-MAC | od -b -An
  145 314 201 000 145 314 201

$ printf '\303\251\0\145\314\201' | iconv -f UTF-8 -t UTF-8-MAC | od -b -An
  145 314 201 000 145 314 201

But Bash only converts from UTF-8-MAC to UTF-8, which always normalizes to NFC:

$ printf '\303\251\0\145\314\201' | iconv -f UTF-8-MAC -t UTF-8 | od -b -An
  303 251 000 303 251



[PATCH] fix bind -X quoting

2023-07-17 Thread Grisha Levit
The output of `bind -X' is not reusable if the bound command has quotes,
backslashes, etc.

$ bind -x '"\eX": echo "x"'
$ bind -X
"\eX": "echo \"x\""
$ bind -x "$(bind -X)"
$ bind -X
"\eX": "echo \\\"x\\\""

This patch changes rl_macro_dumper to not untranslate the macro body
when passed a negative print_readably argument.  This is technically an
API change, so maybe not the best fix, though I doubt it will impact any
real usage.
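
The compounding seen in the transcript above can be reproduced with a small standalone sketch. escape_quotes is a hypothetical stand-in that only backslash-escapes double quotes and backslashes; it is not readline's _rl_untranslate_macro_value, which handles more than these two characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the escaping step: copy `src` to `dst`,
   inserting a backslash before every double quote and backslash.
   Output is truncated if `dst` is too small. */
void
escape_quotes (char *dst, size_t dstlen, const char *src)
{
  size_t out = 0;

  assert (dst != NULL && src != NULL && dstlen > 0);
  while (*src != '\0' && out + 2 < dstlen)
    {
      /* *src is never '\0' here, so strchr only matches '"' or '\\'. */
      if (strchr ("\"\\", *src) != NULL)
        dst[out++] = '\\';
      dst[out++] = *src++;
    }
  dst[out] = '\0';
}
```

Applying escape_quotes once to the text echo "x" gives echo \"x\", and applying it again to that result gives echo \\\"x\\\", matching the two `bind -X' outputs shown above. Each dump-and-rebind round trip adds another layer unless the dump step skips the escaping.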
---
 bashline.c                   | 2 +-
 lib/readline/bind.c          | 6 +++++-
 lib/readline/doc/rltech.texi | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/bashline.c b/bashline.c
index 2fb00e82..e2b329f9 100644
--- a/bashline.c
+++ b/bashline.c
@@ -4583,7 +4583,7 @@ print_unix_command_map (void)
   save = rl_get_keymap ();
   cmd_xmap = get_cmd_xmap_from_keymap (save);
   rl_set_keymap (cmd_xmap);
-  rl_macro_dumper (1);
+  rl_macro_dumper (-1);
   rl_set_keymap (save);
   return 0;
 }
diff --git a/lib/readline/bind.c b/lib/readline/bind.c
index ee6d6e17..bae0e6cc 100644
--- a/lib/readline/bind.c
+++ b/lib/readline/bind.c
@@ -2861,7 +2861,11 @@ _rl_macro_dumper_internal (int print_readably, Keymap map, char *prefix)
         {
         case ISMACR:
           keyname = _rl_get_keyname (key);
-          out = _rl_untranslate_macro_value ((char *)map[key].function, 0);
+
+          if (print_readably < 0)
+            out = savestring ((char *)map[key].function);
+          else
+            out = _rl_untranslate_macro_value ((char *)map[key].function, 0);
 
           if (print_readably)
             fprintf (rl_outstream, "\"%s%s\": \"%s\"\n", prefix ? prefix : "",
diff --git a/lib/readline/doc/rltech.texi b/lib/readline/doc/rltech.texi
index 4d3f3cca..83a34a5f 100644
--- a/lib/readline/doc/rltech.texi
+++ b/lib/readline/doc/rltech.texi
@@ -1354,8 +1354,10 @@ use @code{rl_generic_bind()} instead.
 @deftypefun void rl_macro_dumper (int readable)
 Print the key sequences bound to macros and their values, using
 the current keymap, to @code{rl_outstream}.
-If @var{readable} is non-zero, the list is formatted in such a way
+If @var{readable} is greater than zero, the list is formatted in such a way
 that it can be made part of an @code{inputrc} file and re-read.
+If @var{readable} is less than zero, the macros are printed in "translated"
+form.
 @end deftypefun

 @deftypefun int rl_variable_bind (const char *variable, const char *value)
-- 
2.41.0