suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Zachary Santer
bash's echo command is broken - YouTube


To restate what's in the video, you can't safely use echo to print the
contents of a variable that could be arbitrary, because the variable could
consist entirely of '-n', '-e', or '-E', and '--' is not interpreted as the
end of options, but rather, something to print.

I recognized this and replaced all of my calls to echo with printf some
time ago.
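
As a minimal sketch of that replacement (the variable name `var` is only
illustrative), printf with a fixed '%s\n' format treats the value purely as
data, so a value such as '-n' is printed rather than consumed as an option:

```bash
var='-n'

# With bash's default settings, echo consumes "-n" as an option here
# and prints nothing at all, not even a newline.
echo "$var"

# printf only interprets its format string; the value is passed as
# an argument to %s and printed as-is, followed by a newline.
printf '%s\n' "$var"
```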

If POSIX mandates that '--' not be taken as the end of options, then the
safe thing would be to simply not have echo take any options. Obviously,
that would break backwards compatibility, so you'd want this to be optional
behavior that the shell programmer can enable if desired.

I guess, alternatively, there could be a shell option for echo to interpret
'--' as the end of options. However, this would require more work on the
part of whoever may be trying to resolve this issue in their scripts.

Just a thought.


Tilde (~) in bash(1) is typeset incorrectly as Unicode character

2023-07-26 Thread Thomas ten Cate
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt
-fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat
-Werror=format-security -fstack-clash-protection
-fcf-protection -g
-ffile-prefix-map=/build/bash/src=/usr/src/debug/bash -flto=auto
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin'
-DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc'
-DSYS_BASH_LOGOUT='/etc/bash.bash_logout'
-DNON_INTERACTIVE_LOGIN_SHELLS
uname output: Linux craig 6.4.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 19
Jul 2023 19:19:38 + x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 16
Release Status: release

Description:

In the bash manual page (`man bash`), the ASCII tilde character '~'
(0x7e) is replaced by the Unicode character '˜' (U+02DC SMALL TILDE):

$ man bash | grep 'additional binary operator'
  An additional binary operator, =˜, is available, with the same prece‐

The same happens for the use of ~ as a shorthand for the home
directory. This makes the manual page incorrect, and difficult to
search.
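
To illustrate the search problem, here is a small sketch using the
rendered line quoted above. The `normalize_tilde` helper is a hypothetical
reader-side workaround, not something proposed in this report:

```bash
rendered='An additional binary operator, =˜, is available,'

# Searching for the ASCII operator finds nothing, because the rendered
# text contains U+02DC rather than U+007E.
printf '%s\n' "$rendered" | grep -F '=~' || echo 'no match for =~'

# Hypothetical workaround: map U+02DC back to '~' before searching.
normalize_tilde() { sed 's/˜/~/g'; }
printf '%s\n' "$rendered" | normalize_tilde | grep -F '=~'
```

The same filter could sit between `man bash` and `grep` on an affected
system.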

It looks like there is an ASCII tilde character in the man page's source code:

$ gunzip -c /usr/share/man/man1/bash.1.gz | grep 'additional binary operator'
An additional binary operator, \fB=~\fP, is available, with the same

I don't know the first thing about groff, but `man groff_char`
suggests that ~ is indeed rendered as "modifier tilde", and that one
should write \(ti to obtain an actual tilde character.

I'm guessing the manpage is generated from texinfo, so if this is
actually a bug in texinfo, feel free to forward this email to
bug-texinfo at gnu.org.



Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Chet Ramey

On 7/26/23 8:42 AM, Zachary Santer wrote:


> If POSIX mandates that '--' not be taken as the end of options, then the
> safe thing would be to simply not have echo take any options. Obviously,
> that would break backwards compatibility, so you'd want this to be optional
> behavior that the shell programmer can enable if desired.


set -o posix
shopt -s xpg_echo

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Greg Wooledge
On Wed, Jul 26, 2023 at 08:42:15AM -0400, Zachary Santer wrote:
> To restate what's in the video, you can't safely use echo to print the
> contents of a variable that could be arbitrary, because the variable could
> consist entirely of '-n', '-e', or '-E', and '--' is not interpreted as the
> end of options, but rather, something to print.
> 
> I recognized this and replaced all of my calls to echo with printf some
> time ago.

Yes.  Sounds like you're already aware of the issue, and already know
how to fix it in your scripts.

> If POSIX mandates that '--' not be taken as the end of options, then the
> safe thing would be to simply not have echo take any options. Obviously,
> that would break backwards compatibility, so you'd want this to be optional
> behavior that the shell programmer can enable if desired.
> 
> I guess, alternatively, there could be a shell option for echo to interpret
> '--' as the end of options. However, this would require more work on the
> part of whoever may be trying to resolve this issue in their scripts.

Everyone who does shell scripts on a regular basis is already aware of
all these issues, and the consensus is "stop using echo with variable
content".  There's really nothing more to be said at this point.
Adding more shell options wouldn't simplify anything.  It would just
add yet another permutation to the problem.



Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Chet Ramey

On 7/26/23 10:15 AM, Zachary Santer wrote:
> Oh, that's weird. I just assumed that sh would be running bash with 'set -o
> posix'. Evidently, not in MSYS2. 'man sh' takes me to the Bash man page.


Weird. I don't use MSYS2, but that's how it works on Unix/Linux systems.

> When I run sh, 'set -o posix' has no effect, but it definitely makes echo
> not interpret any arguments as options when I'm in bash.


Not quite, at least on Unix/Linux/macOS. `set -o posix' by itself doesn't
affect echo's behavior with respect to accepting options or expanding
backslash-escapes in the remaining arguments. There's too much existing
code to try and bother with that. The `xpg_echo' shell option disables
option processing and enables backslash-escape translation, which is the
POSIX/XSI required behavior.
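
A minimal sketch of that combination, assuming a bash where posix mode and
`xpg_echo' are both available. The function name `strict_echo' is made up
for illustration, and its body runs in a subshell so the two settings do
not leak into the calling shell:

```bash
# Parenthesized body: the function always runs in a subshell.
strict_echo() (
  set -o posix
  shopt -s xpg_echo
  echo "$@"
)

strict_echo -n        # prints "-n" and a newline
strict_echo -e -- x   # prints "-e -- x"
```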

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Kerin Millar
On Wed, 26 Jul 2023, at 1:42 PM, Zachary Santer wrote:
> bash's echo command is broken - YouTube
> 
>
> To restate what's in the video, you can't safely use echo to print the
> contents of a variable that could be arbitrary, because the variable could
> consist entirely of '-n', '-e', or '-E', and '--' is not interpreted as the
> end of options, but rather, something to print.
>
> I recognized this and replaced all of my calls to echo with printf some
> time ago.
>
> If POSIX mandates that '--' not be taken as the end of options, then the
> safe thing would be to simply not have echo take any options. Obviously,
> that would break backwards compatibility, so you'd want this to be optional
> behavior that the shell programmer can enable if desired.

echo() { local IFS=' '; printf '%s\n' "$*"; }
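
A short usage sketch of that function (the sample arguments are only
illustrative). Because the function shadows the builtin, every argument,
including option-like ones, is joined with single spaces and handed to
printf as data:

```bash
echo() { local IFS=' '; printf '%s\n' "$*"; }

echo -n            # prints: -n
echo -e 'a\tb'     # prints: -e a\tb  (the backslash sequence is not expanded)
echo -- "$HOME"    # prints: -- followed by the value of HOME
```

The builtin remains reachable as `builtin echo` if its option handling is
ever wanted.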

-- 
Kerin Millar



Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Zachary Santer
Oh, that's weird. I just assumed that sh would be running bash with 'set -o
posix'. Evidently, not in MSYS2. 'man sh' takes me to the Bash man page.
When I run sh, 'set -o posix' has no effect, but it definitely makes echo
not interpret any arguments as options when I'm in bash.

And I'm again not receiving Greg Wooledge's emails, so I'll just respond to
that here. None of us were born knowing how to write bash scripts, sir. For
me, that's been a combination of just doing it, reading through sections of
the manual, and of course the occasional Google search leading to Stack
Overflow. Maybe not the best way to go, but it would be nice if there
weren't as many gotchas lurking about. Maybe that's just a lost cause. Noob
me would've had to have already known about this hypothetical shell option,
which of course would not have been the case.

Eh, well. Thanks, guys.

On Wed, Jul 26, 2023 at 9:24 AM Chet Ramey  wrote:

> On 7/26/23 8:42 AM, Zachary Santer wrote:
>
> > If POSIX mandates that '--' not be taken as the end of options, then the
> > safe thing would be to simply not have echo take any options. Obviously,
> > that would break backwards compatibility, so you'd want this to be optional
> > behavior that the shell programmer can enable if desired.
>
> set -o posix
> shopt -s xpg_echo
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
>
>


Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character

2023-07-26 Thread Chet Ramey

On 7/26/23 4:47 AM, Thomas ten Cate wrote:


> Bash Version: 5.1
> Patch Level: 16
> Release Status: release
>
> Description:
>
> In the bash manual page (`man bash`), the ASCII tilde character '~'
> (0x7e) is replaced by the Unicode character '˜' (U+02DC SMALL TILDE):
>
>  $ man bash | grep 'additional binary operator'
>    An additional binary operator, =˜, is available, with the same prece‐
>
> The same happens for the use of ~ as a shorthand for the home
> directory. This makes the manual page incorrect, and difficult to
> search.


Thanks for the report. Progress, I suppose.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character

2023-07-26 Thread G. Branden Robinson
Hi Thomas,

At 2023-07-26T10:47:05+0200, Thomas ten Cate wrote:
> In the bash manual page (`man bash`), the ASCII tilde character '~'
> (0x7e) is replaced by the Unicode character '˜' (U+02DC SMALL TILDE):
> 
> $ man bash | grep 'additional binary operator'
>   An additional binary operator, =˜, is available,
> 
> The same happens for the use of ~ as a shorthand for the home
> directory. This makes the manual page incorrect, and difficult to
> search.
> 
> It looks like there is an ASCII tilde character in the man page's
> source code:
> 
> $ gunzip -c /usr/share/man/man1/bash.1.gz | grep 'additional binary operator'
> An additional binary operator, \fB=~\fP, is available, with the same
> 
> I don't know the first thing about groff, but `man groff_char`
> suggests that ~ is indeed rendered as "modifier tilde", and that one
> should write \(ti to obtain an actual tilde character.

I know a little about groff.  Your advice is fine for man pages that
target only groff[1] and/or mandoc[2], but not Heirloom Doctools
troff,[3] neatroff[4] or Plan 9 troff (in its original form or as
maintained in Plan 9 from User Space[5]), and not legacy implementations
descended from AT&T troff that are, as far as I can tell, unmaintained
by the few Unix System V vendors that still exist.[6][7]

Many projects don't need to worry about such extreme portability in
their man pages, but GNU Bash arguably does.  (I'm open to correction.)

Furthermore, in the *roff language itself, as originally implemented by
Joe Ossanna (and re-implemented by Brian Kernighan) there is no good
way to test for the existence of a special character.[8]

As a first stab at it, I'd divide the world into two camps: (a) groff
and mandoc(1), and (b) everything else, and not worry about (b).

The bash(1) man page has an extensive preamble already that still
includes a workaround for 4.3BSD(!), so adding a little bit to it to
accommodate systems developed since 1990 might not be too disruptive.

I'm attaching a straw man diff to the bash(1) page.  If Chet likes it,
I'm happy to prepare one against the bash devel branch.

bash(1) also attempts to select a font named "CW" in places, which is
another portability problem (it's a Unix System III [and later] troff
font name that was available on _some_ output devices).  But I'd like to
see how we get over this bridge before I try to cross that one.  :)

> I'm guessing the manpage is generated from texinfo, so if this is
> actually a bug in texinfo, feel free to forward this email to
> bug-texinfo at gnu.org.

I don't think that's actually true.  As far as I know, Chet maintains
Bash's Texinfo docs and man pages in parallel by hand.

Regards,
Branden

[1] https://www.gnu.org/software/groff/
[2] https://mandoc.bsd.lv/
[3] https://github.com/n-t-roff/heirloom-doctools
[4] https://github.com/aligrudi/neatroff
[5] https://github.com/9fans/plan9port

[6] HP-UX 11 appears to still ship an AT&T/DWB or System V troff.
Solaris 10 does, but it is nearing end-of-life and Solaris 11
replaced its troff (of similar lineage as HP-UX's) with groff.

[7] It is also not hard to make AT&T-descended troffs support the
`ha` and `ti` special characters.  For instance, here's a patch to
Documenter's Workbench (DWB) 3.3 troff's "Latin1" output device.

--- R.orig  2023-07-26 09:55:30.527340674 -0500
+++ R   2023-07-26 09:58:49.658662373 -0500
@@ -68,6 +68,7 @@
 bs "
 ]  33  3   93
 ^  33  2   147
+ha "
 ---47  2   94
 ---50  1   95
 `  33  2   96
@@ -101,6 +102,7 @@
 ---20  2   124
 }  48  3   125
 ~  33  2   148
+ti "
 ---54  0   126
 \` 33  2   145
 ga "

But even after 30+ years since groff emerged on the scene, I'm not
aware of a single such troff having done this.

[8] A clever *roff hacker could try using the output comparison operator
and width computation escape sequence to measure the width of a candidate
special character, but this would not be reliable.  The output
drivers of AT&T device-independent troff appear to format
unrecognized characters as blanks (putting horizontal motions on the
output).  (groff does not, throwing an error diagnostic instead.)[9]
But if a special character did exist and happened to be the same
width as such a blank character, this test would produce a false
negative.  Worse, on nroff-mode devices, including the terminal
emulators where 99% of all man page reading is done, _all_ glyphs are
the same width, so you'd get false negatives all the time.

[9] This is a groff/AT&T troff difference that I don't think is
documented by groff.  Maybe I should fix that.
--- bash.1.orig	2023-07-26 10:19:18.770924818 -0500
+++ bash.1	2023-07-26 10:22:48.554457262 -0500
@@ -26,6 +26,22 @@
 .if !rzY .nr zY 0 \" avoid a warning about an undefined register
 .if \n(zZ=1 .ig zZ
 .if \n(zY=1 .ig zY
+.

[PATCH] read: non-raw-mode fixes

2023-07-26 Thread Grisha Levit
This patch addresses a few issues with `read' when not in raw mode.
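
For context, a minimal sketch of what non-raw mode means (the variable
names are illustrative): without -r, read treats a backslash as an escape
character and a backslash-newline pair as a line continuation, whereas -r
keeps the backslash literally.

```bash
# The input is two lines: 'a\' and 'b'.
cooked=$(printf 'a\\\nb\n' | { read line; printf '%s' "$line"; })
raw=$(printf 'a\\\nb\n' | { read -r line; printf '%s' "$line"; })

printf 'without -r: %s\n' "$cooked"   # without -r: ab
printf 'with -r:    %s\n' "$raw"      # with -r:    a\
```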

If the last character read was an (unescaped) backslash, store it as
such instead of as a CTLESC.  Avoids:

$ printf '\\' | { read; echo "${REPLY@Q}"; }
bash: DEBUG warning: dequote_string: string with bare CTLESC
$'\001'

If an escaped null byte is read, skip it as we do with an unescaped one,
instead of adding it to input_string.  Avoids:

$ printf 'A\\\0B\nC\n' | while read; do echo "${REPLY@Q}"; done
'A'
'C'

$ printf '\\\0' | { read; echo "${REPLY@Q}"; }
bash: DEBUG warning: dequote_string: string with bare CTLESC
$'\001' # even after fix for first issue

If IFS contains \177 and the input consists of only backslash-newline
pairs and a sole \177, prevent the bare CTLNUL from being turned into an
empty string.  Avoids:

$ printf '\\\n\177' | { IFS=$'\177' read; echo "${REPLY@Q}"; }
''
---
 builtins/read.def | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/builtins/read.def b/builtins/read.def
index 5b2621fe..84878699 100644
--- a/builtins/read.def
+++ b/builtins/read.def
@@ -404,7 +404,7 @@ read_builtin (WORD_LIST *list)
   input_string[0] = '\0';

   pass_next = 0;   /* Non-zero signifies last char was backslash. */
-  saw_escape = 0;  /* Non-zero signifies that we saw an escape char */
+  saw_escape = 0;  /* Index+1 of when we last saw an escape char. */

   /* More input and options validation */
   if (nflag == 1 && nchars == 0)
@@ -751,11 +751,11 @@ read_builtin (WORD_LIST *list)
   if (pass_next)
{
  pass_next = 0;
- if (c == '\n')
+ if (c == '\n' || c == '\0')
{
  if (skip_ctlesc == 0 && i > 0)
i--;/* back up over the CTLESC */
- if (interactive && input_is_tty && raw == 0)
+ if (interactive && input_is_tty && raw == 0 && c == '\n')
print_ps2 = 1;
}
  else
@@ -769,8 +769,8 @@ read_builtin (WORD_LIST *list)
  pass_next++;
  if (skip_ctlesc == 0)
{
- saw_escape++;
  input_string[i++] = CTLESC;
+ saw_escape=i;
}
  continue;
}
@@ -783,8 +783,8 @@ read_builtin (WORD_LIST *list)

   if ((skip_ctlesc == 0 && c == CTLESC) || (skip_ctlnul == 0 && c == CTLNUL))
{
- saw_escape++;
  input_string[i++] = CTLESC;
+ saw_escape=i;
}

 add_char:
@@ -825,6 +825,12 @@ add_char:
   if (nchars > 0 && nr >= nchars)
break;
 }
+
+  if (i && saw_escape == i && input_string[i-1] == CTLESC)
+input_string[i-1] = '\\';  /* Preserve trailing backslash */
+  else if (skip_ctlnul && i == 1 & saw_escape == 1 && input_string[0] == CTLNUL)
+saw_escape = 0;/* Avoid dequoting bare CTLNUL */
+
   input_string[i] = '\0';
   check_read_timeout ();

-- 
2.41.0




Re: [PATCH] fix bind -X quoting

2023-07-26 Thread Chet Ramey

On 7/24/23 1:13 PM, Chet Ramey wrote:


> You could do it if you allowed, say
>
> bind -x '"\eX": \"command with spaces\" \"x\"'
>
> and then stripped the backslashes before calling rl_generic_bind, but
> that's not exactly backwards compatible either.


Thinking about it some more, you can do it like this:

bind -x $'"\\eX": \'"command with spaces" "x"\''

since bind -x allows single-quoted strings as the command to execute,
and $'...' allows backslash-escaped single quotes.
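
To make the layering visible, this sketch prints the string that bind -x
would receive after the shell has processed the $'...' quoting (the
variable name `binding' is illustrative, and no key is actually bound
here):

```bash
binding=$'"\\eX": \'"command with spaces" "x"\''

# Shows the key sequence followed by a single-quoted command string.
printf '%s\n' "$binding"
# "\eX": '"command with spaces" "x"'
```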

If we ran the command string through rl_translate_keyseq, it would allow
backslash-escaped double quotes and strip the backslashes, but you get
the rest of the backslash processing that you probably don't want.

It's just not transitive.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




built-in printf returns success when integer is out of range

2023-07-26 Thread thomas
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -fstack-protector-strong -Wformat
-Werror=format-security -Wall
uname output: Linux fnord42 6.1.25-1rodete1-amd64 #1 SMP
PREEMPT_DYNAMIC Debian 6.1.25-1rodete1 (2023-05-11) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.2
Patch Level: 15
Release Status: release

Description:
printf '%d\n' 111 && echo success
prints "success"
/usr/bin/printf does not, but instead returns EXIT_FAILURE (1).

Repeat-By:
Run:
printf '%d\n' 111 && echo success



Re: built-in printf returns success when integer is out of range

2023-07-26 Thread Dennis Williamson
On Wed, Jul 26, 2023, 3:40 PM  wrote:

> Configuration Information [Automatically generated, do not change]:
> Machine: x86_64
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS: -g -O2 -fstack-protector-strong -Wformat
> -Werror=format-security -Wall
> uname output: Linux fnord42 6.1.25-1rodete1-amd64 #1 SMP
> PREEMPT_DYNAMIC Debian 6.1.25-1rodete1 (2023-05-11) x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
>
> Bash Version: 5.2
> Patch Level: 15
> Release Status: release
>
> Description:
> printf '%d\n' 111 && echo success
> prints "success"
> /usr/bin/printf does not, but instead returns EXIT_FAILURE (1).
>
> Repeat-By:
> Run:
> printf '%d\n' 111 && echo success
>
>


You didn't include all the output. It's treated as a warning rather than an
error. That's why an error status isn't set.


Re: built-in printf returns success when integer is out of range

2023-07-26 Thread thomas
On Wed, 26 Jul 2023 21:52:32 +0100, Dennis Williamson
 said:
>> Configuration Information [Automatically generated, do not change]:
>> Machine: x86_64
>> OS: linux-gnu
>> Compiler: gcc
>> Compilation CFLAGS: -g -O2 -fstack-protector-strong -Wformat
>> -Werror=format-security -Wall
>> uname output: Linux fnord42 6.1.25-1rodete1-amd64 #1 SMP
>> PREEMPT_DYNAMIC Debian 6.1.25-1rodete1 (2023-05-11) x86_64 GNU/Linux
>> Machine Type: x86_64-pc-linux-gnu
>>
>> Bash Version: 5.2
>> Patch Level: 15
>> Release Status: release
>>
>> Description:
>> printf '%d\n' 111 && echo success
>> prints "success"
>> /usr/bin/printf does not, but instead returns EXIT_FAILURE (1).
>>
>> Repeat-By:
>> Run:
>> printf '%d\n' 111 && echo success
>>
>>
> You didn't include all the output. It's treated as a warning rather than an
> error. That's why an error status isn't set.

Ah indeed, the stderr message with bash builtin has the word
"warning", whereas the Linux coreutils one does not[1].

Regardless, I do not believe this should be a warning, and instead
that coreutils is doing the right thing, and bash is not.

The builtin was given a request to print an integer. It failed to do
that, because of invalid input. It's wrong to silently do the wrong
thing. I say silently, but what I mean is without a way to handle
the error, not that stderr didn't get any warning.

Scripts can be unattended, and they should fail if they fail, not
continue with best effort.

Not that 2^63-1 is any kind of best effort.

Bash built-in printf should fail (EXIT_FAILURE) on the integer out of
range just like it does fail given something that's not a number at
all.

[1]

$ printf '%d\n' 111 && echo yes || echo no
bash: printf: warning: 111: Numerical result out of range
9223372036854775807
yes

$ /usr/bin/printf '%d\n' 111 && echo yes || echo no
/usr/bin/printf: ‘111’: Numerical result out of range
9223372036854775807
no
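
Whatever the builtin ends up doing, a script can treat any diagnostic on
stderr as a failure in the meantime. The following is a rough sketch, not
an official recipe: the wrapper name `strict_printf` is hypothetical, and
the oversized literal is just an example value.

```bash
strict_printf() {
  local err status
  # Capture stderr in $err while stdout still goes to the caller:
  # fd 3 is a copy of the original stdout.
  { err=$(printf "$@" 2>&1 >&3); } 3>&1
  status=$?
  if [ -n "$err" ]; then
    # Re-emit the diagnostic and force a failing status.
    printf '%s\n' "$err" >&2
    return 1
  fi
  return "$status"
}

strict_printf '%d\n' 42 && echo ok
strict_printf '%d\n' 99999999999999999999999999 || echo failed
```

The trade-off is that every warning becomes fatal, which is the behavior
argued for above but may be too strict for other uses of printf.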

--
typedef struct me_s {
  char name[]  = { "Thomas Habets" };
  char email[] = { "tho...@habets.se" };
  char kernel[]= { "Linux" };
  char *pgpKey[]   = { "http://www.habets.pp.se/pubkey.txt" };
  char pgp[] = { "9907 8698 8A24 F52F 1C2E  87F6 39A4 9EEA 460A 0169" };
  char coolcmd[]   = { "echo '. ./_&. ./_'>_;. ./_" };
} me_t;



Re: [PATCH] fix bind -X quoting

2023-07-26 Thread Grisha Levit
On Wed, Jul 26, 2023, 16:06 Chet Ramey  wrote:
>
> On 7/24/23 1:13 PM, Chet Ramey wrote:
>
> > You could do it if you allowed, say
> >
> > bind -x '"\eX": \"command with spaces\" \"x\"'
> >
> > and then stripped the backslashes before calling rl_generic_bind, but
> > that's not exactly backwards compatible either.
>
> Thinking about it some more, you can do it like this:
>
> bind -x $'"\\eX": \'"command with spaces" "x"\''
>
> since bind -x allows single-quoted strings as the command to execute,
> and $'...' allows backslash-escaped single quotes.
>
> If we ran the command string through rl_translate_keyseq, it would allow
> backslash-escaped double quotes and strip the backslashes, but you get
> the rest of the backslash processing that you probably don't want.
>
> It's just not transitive.


Another issue I didn't think of with printing the unquoted translated
command is that it can include newlines, which is a problem since you
have to read the `bind -X' output one line at a time to reuse it with
`bind -x'.

If there isn't a backwards compatible way to produce output that is
reusable given the current input format, I wonder if we can leverage a
format that's not currently valid as input.  `bind -x' currently
requires a colon following the key sequence but we could change it to
also allow input without it and use rl_macro_bind instead of
rl_generic_bind if we get such input.  If we have `bind -X' produce
untranslated output as it did before, but without the `:', everything
should match up and existing valid `bind -X' commands will be
unaffected.
---
diff --git a/bashline.c b/bashline.c
index 5dac2e9e..9d99c536 100644
--- a/bashline.c
+++ b/bashline.c
@@ -4702,7 +4702,7 @@ bind_keyseq_to_unix_command (char *line)
 {
   Keymap kmap, cmd_xmap;
   char *kseq, *value;
-  int i, kstart;
+  int i, kstart, translate;

   kmap = rl_get_keymap ();

@@ -4716,16 +4716,13 @@ bind_keyseq_to_unix_command (char *line)
   /* Create the key sequence string to pass to rl_generic_bind */
   kseq = substring (line, kstart, i);

-  for ( ; line[i] && line[i] != ':'; i++)
+  /* Advance to the colon (:) or whitespace which separates the two objects. */
+  for ( ; line[i] && line[i] != ':' && line[i] != ' ' && line[i] != '\t'; i++)
 ;
-  if (line[i] != ':')
-{
-  builtin_error (_("%s: missing colon separator"), line);
-  FREE (kseq);
-  return -1;
-}

-  i = isolate_sequence (line, i + 1, 0, &kstart);
+  translate = (line[i] != ':');
+
+  i = isolate_sequence (line, i + 1, translate, &kstart);
   if (i < 0)
 {
   FREE (kseq);
@@ -4737,7 +4734,10 @@ bind_keyseq_to_unix_command (char *line)

   /* Save the command to execute and the key sequence in the CMD_XMAP */
   cmd_xmap = get_cmd_xmap_from_keymap (kmap);
-  rl_generic_bind (ISMACR, kseq, value, cmd_xmap);
+  if (translate)
+rl_macro_bind (kseq, value, cmd_xmap);
+  else
+rl_generic_bind (ISMACR, kseq, value, cmd_xmap);

   /* and bind the key sequence in the current keymap to a function that
  understands how to execute from CMD_XMAP */
diff --git a/lib/readline/bind.c b/lib/readline/bind.c
index dc30dd84..9d4817a3 100644
--- a/lib/readline/bind.c
+++ b/lib/readline/bind.c
@@ -2861,18 +2861,12 @@ _rl_macro_dumper_internal (int print_readably, Keymap map, char *prefix)
{
case ISMACR:
  keyname = _rl_get_keyname (key);
- if (print_readably < 0)
-   out = savestring ((char *)map[key].function);
- else
-   out = _rl_untranslate_macro_value ((char *)map[key].function, 0);
+ out = _rl_untranslate_macro_value ((char *)map[key].function, 0);

- if (print_readably < 0)
-   fprintf (rl_outstream, "\"%s%s\": %s\n", prefix ? prefix : "",
-keyname,
-out ? out : "");
- else if (print_readably > 0)
-   fprintf (rl_outstream, "\"%s%s\": \"%s\"\n", prefix ? prefix : "",
+ if (print_readably)
+   fprintf (rl_outstream, "\"%s%s\"%s \"%s\"\n", prefix ? prefix : "",
 keyname,
+print_readably > 0 ? ":" : "",
 out ? out : "");
  else
fprintf (rl_outstream, "%s%s outputs %s\n", prefix ? prefix : "",



Re: built-in printf returns success when integer is out of range

2023-07-26 Thread Lawrence Velázquez
> On Jul 26, 2023, at 5:14 PM, tho...@habets.se wrote:
> 
> On Wed, 26 Jul 2023 21:52:32 +0100, Dennis Williamson
>  said:
>> You didn't include all the output. It's treated as a warning rather than an
>> error. That's why an error status isn't set.
> 
> Ah indeed, the stderr message with bash builtin has the word
> "warning", whereas the Linux coreutils one does not[1].
> 
> Regardless, I do not believe this should be a warning, and instead
> that coreutils is doing the right thing, and bash is not.

POSIX.1-2017 seems to agree [1].  The printf specification states:

If an _argument_ operand cannot be completely converted
into an internal value appropriate to the corresponding
conversion specification, a diagnostic message shall be
written to standard error and the utility shall not exit
with a zero exit status, but shall continue processing any
remaining operands and shall write the value accumulated
at the time the error was detected to standard output.

The nonnormative "Application Usage" section elaborates:

If an argument cannot be parsed correctly for the corresponding
conversion specification, the _printf_ utility is required
to report an error.  Thus, overflow and extraneous characters
at the end of an argument being used for a numeric conversion
shall be reported as errors.
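
For comparison, a small sketch of the other case that passage names,
extraneous characters after the digits (the sample argument is
illustrative). Here the builtin is expected to write a diagnostic to
stderr, still print the value accumulated so far, and return a non-zero
status:

```bash
# The converted prefix (12) is still written to stdout.
printf '%d\n' 12abc || echo "printf reported failure"
```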

A recent austin-group-l discussion seems to concur as well [2].
(Full message forwarded below, if you don't want to follow the URL.)

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
[2] https://www.mail-archive.com/austin-group-l@opengroup.org/msg11409.html

-- 
vq



Begin forwarded message:

> From: "Geoff Clare via austin-group-l at The Open Group" 
> 
> Subject: Re: out-of-bounds numbers in shell utility arguments
> Date: June 26, 2023 at 4:49:45 AM EDT
> To: austin-grou...@opengroup.org
> Resent-From: austin-grou...@opengroup.org
> Resent-To: austin-grou...@opengroup.org
> Reply-To: Geoff Clare 
> 
> Thorsten Glaser wrote, on 24 Jun 2023:
>> 
>> what’s the POSIX mode behaviour expected when scripts attempt to
>> use overlong numbers in arguments e.g. to utilities (but possibly
>> anywhere in XSH)?
>> 
>> Say a script has on a 64-bit system:
>> 
>> test 2 -lt 
>> 
>> I found “1.1.2 Concepts Derived from the ISO C Standard” in XSH
>> Introduction, but that just says it should be signed long.
> 
> XCU 1.1.2 relates to utilities that "perform complex data manipulation
> using their own procedure and arithmetic languages".  So it applies to
> shell arithmetic expansion, but isn't really relevant to simple
> argument parsing by a utility.  For that, the relevant text is in XBD
> 12.1 Utility Argument Syntax, item 6.  This specifies ranges that must
> be "syntactically recognized as numeric values" and then says "Ranges
> greater than those listed here are allowed."
> 
> So the allowed behaviours are that either the utility syntactically
> recognises the argument as a numeric value or it doesn't.
> 
> If it doesn't, then it must report this as a syntax error.
> If it does, then its behaviour must be as described by the standard
> for the value that was recognised.
> 
>> So, is it:
>> 
>> • application error (the script writer is at fault, and the shell
>>  can do what it wants but should be consistent)
>> 
>> • unspecified behaviour (the shell can do as it wants but should
>>  be consistent); I really hope not C-level UB
>> 
>> • the utility or shell must detect this, while parsing the argument
>>  as number, erroring out
>> 
>> I’d hope for one of the first two because having wraparound semantics
>> is one of the guarantees for script writers I have in mksh for shell
>> arithmetics (not yet explicitly in the test(1) builtin).
> 
> It's somewhere between the second and third. It's unspecified whether
> the utility will report an error, but if it doesn't then it has to
> handle the value correctly, i.e. test 2 -lt  must
> exit with status 0 or status >1; it must not exit with status 1.
> 
> -- 
> Geoff Clare 
> The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
> 



[PATCH v2] read: non-raw-mode fixes

2023-07-26 Thread Grisha Levit
On Wed, Jul 26, 2023 at 2:23 PM Grisha Levit  wrote:
> If the last character read was an (unescaped) backslash, store it as
> such instead of as a CTLESC.  Avoids:
>
> $ printf '\\' | { read; echo "${REPLY@Q}"; }
> bash: DEBUG warning: dequote_string: string with bare CTLESC
> $'\001'

Sorry that was wrong, the CTLESC should just be removed.  Much simpler
patch below.
---
diff --git a/builtins/read.def b/builtins/read.def
index 5b2621fe..ce5bcc38 100644
--- a/builtins/read.def
+++ b/builtins/read.def
@@ -751,11 +751,11 @@ read_builtin (WORD_LIST *list)
   if (pass_next)
{
  pass_next = 0;
- if (c == '\n')
+ if (c == '\n' || c == '\0')
{
  if (skip_ctlesc == 0 && i > 0)
i--;/* back up over the CTLESC */
- if (interactive && input_is_tty && raw == 0)
+ if (interactive && input_is_tty && raw == 0 && c == '\n')
print_ps2 = 1;
}
  else
@@ -825,6 +825,11 @@ add_char:
   if (nchars > 0 && nr >= nchars)
break;
 }
+
+  if (pass_next && skip_ctlesc == 0)
+i--;   /* back up over the CTLESC */
+  if (skip_ctlnul && saw_escape && i == 1 && input_string[0] == CTLNUL)
+saw_escape = 0;/* Avoid dequoting bare CTLNUL */
   input_string[i] = '\0';
   check_read_timeout ();



Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Zachary Santer
I managed to set xpg_echo on in bash and then forget that I did that. I was
using echo's behavior with -n to determine if set -o posix had taken
effect, when it's completely unrelated. And sh in MSYS2 is definitely just
bash with set -o posix on. What I get for rushing.

However, the man page for bash 5.2 only says the following about xpg_echo,
in the section about shopt:
"If set, the echo builtin expands backslash-escape sequences by default."

It makes no mention of disabling option processing.

Similarly, in the description of the echo command, we see:
"The xpg_echo shell option may be used to dynamically determine whether or
not echo expands these escape characters by default."

So is that a bug in the documentation?
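
For reference, a sketch of the documented half of the behavior (run in a
subshell so the setting does not persist; the sample string is
illustrative). With xpg_echo set, echo expands backslash escapes without
needing -e:

```bash
(
  shopt -s xpg_echo
  echo 'a\tb'        # the \t is expanded to a tab character
)
```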

On Wed, Jul 26, 2023 at 10:51 AM Chet Ramey  wrote:

> On 7/26/23 10:15 AM, Zachary Santer wrote:
> > Oh, that's weird. I just assumed that sh would be running bash with 'set -o
> > posix'. Evidently, not in MSYS2. 'man sh' takes me to the Bash man page.
>
> Weird. I don't use MSYS2, but that's how it works on Unix/Linux systems.
>
> > When I run sh, 'set -o posix' has no effect, but it definitely makes echo
> > not interpret any arguments as options when I'm in bash.
>
> Not quite, at least on Unix/Linux/macOS. `set -o posix' by itself doesn't
> affect echo's behavior with respect to accepting options or expanding
> backslash-escapes in the remaining arguments. There's too much existing
> code to try and bother with that. The `xpg_echo' shell option disables
> option processing and enables backslash-escape translation, which is the
> POSIX/XSI required behavior.
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
>
>


Re: suggestion: shell option for echo to not interpret any argument as an option

2023-07-26 Thread Zachary Santer
On Wed, Jul 26, 2023 at 10:25 AM Kerin Millar wrote:
> echo() { local IFS=' '; printf '%s\n' "$*"; }

There's a simple solution. Ha. Thank you.