Re: Surprising behavior with inline environment variable expansion

2025-04-04 Thread Chet Ramey

On 3/30/25 2:34 AM, Robert Elz wrote:

 Date: Thu, 27 Mar 2025 17:22:03 -0400
 From: Chet Ramey
 Message-ID:  <6da17a73-2aac-4fa5-9fa7-5bfff087d...@case.edu>

   | The shell should assume that setting a shell variable means the
   | user wants to modify the shell's locale settings.

Yes, of course, the question is exactly what the user wants &/or what
the shell should impose on the user.

   | > One answer would be "nothing"
   | A bad choice.

I agree, though that would make the shell operate the way just
about all other programs behave, for example, if one does

ENVIRON["LC_NUMERIC"] = "whatever"

in awk, while that should appear in the environment of something
awk runs for a system() operation, it doesn't affect the way awk
works, or its numeric input or output, any way at all.
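
The distinction drawn here, between a variable that merely reaches the
environment of a child and one that changes the running program itself,
can be illustrated from the shell side with a prefix assignment. This is
only a sketch of variable visibility: it says nothing about whether the
shell also adjusts its own locale while the command runs, which is the
point under discussion. The names `child` and `parent` are made up for
the illustration.

```sh
# Make sure the variable is not already set in this shell.
unset LC_NUMERIC

# Prefix assignment: LC_NUMERIC is placed in the environment of the child
# "sh" only. The child reports the value it inherited.
child=$(LC_NUMERIC=C sh -c 'printf "%s" "${LC_NUMERIC-unset}"')

# Back in the calling shell the variable is still unset.
parent=${LC_NUMERIC-unset}

printf 'child saw: %s\nparent has: %s\n' "$child" "$parent"
```

The child sees `C` while the calling shell still reports the variable as
unset.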

   | The shell does quite a lot of things that are different than any other
   | application, including allowing the user to change locale environment
   | variables.

Yes, but it is not entirely alone in that.  awk/make/env/... allow the
environment to be altered, but none of them notice that it is a locale
environment variable that is being altered, and adapt their own behaviour
to match, or none that I'm aware of, so having shells do the same would
not necessarily be all that remarkable, even if it is perhaps sub-optimal.


That's obviously an implementation choice, and can happen to various
degrees. For instance, GNU awk understands the TEXTDOMAIN variable and
provides builtin functions for translating messages, but doesn't look at
LC_* or LANG except at startup. There's not really much in make that would
be modified based on changing locale variables, and nothing at all in env.
However, languages like, say, python that allow users to set variables do
offer methods to change the interpreter behavior based on programmers
setting variables or calling builtin locale functions.



   | That doesn't have to be all the shell does. The precedence hierarchy is
   | well-understood; there's nothing stopping the shell from implementing it:

No there's not, but I'm not sure that's the ideal result.   For example,
if a script does LC_CTYPE=C (or something similar) should it break if the
user happened to have set LC_ALL=something-different in the environment?


What does `break' mean?


If any other program does setlocale(LC_CTYPE, "C"); that has effect
for any future operations, regardless of what might have been in the
environment, why should a shell script be different?


Because the shell exposes the locale settings in a different way, offering
the user more flexibility?



   | noting that LC_ALL is set as a shell variable and making the right call
   | to setlocale() to make sure it overrides LC_CTYPE. I'd argue that having
   | just set LC_ALL, this is what the user expects here.

That's something like what I tried in my first attempt to implement
this, but it turns out not to really work very well. 


I think it works just fine, and bash users expect this behavior.


Certainly if
the user actually sets LC_ALL that is the effect it should have, but
if LC_ALL was set sometime much earlier (perhaps weeks earlier in an
interactive shell) what do you believe is the intent if that user later
sets LC_CTYPE (or any of them) - or if some script that is run does that?


I believe that users have agency, and that bash should trust that they know
what they're doing.
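
For reference, the precedence hierarchy mentioned above (LC_ALL over the
individual LC_* category variable over LANG) can be sketched as a small
shell function. This is an illustration only, not what bash does
internally: `effective_locale` is a hypothetical helper, and the final
fallback to "C" is an assumption, since the default is
implementation-defined.

```sh
# Hypothetical helper: print the locale name that would govern one
# category, following the precedence LC_ALL > LC_<category> > LANG.
# Empty values are treated the same as unset ones.
effective_locale() {
    category=$1                       # e.g. LC_CTYPE
    # Indirect lookup of the category variable. eval is acceptable here
    # only because the caller passes a fixed variable name.
    eval "specific=\${$category-}"
    if [ -n "${LC_ALL-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    elif [ -n "${LANG-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' C               # assumed default
    fi
}
```

With LC_ALL=POSIX and LC_CTYPE=C both set, `effective_locale LC_CTYPE`
prints POSIX, which is the "LC_ALL set long ago still wins" case being
debated.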


I know setting LC_ALL is something of a sledge hammer operation, which
isn't often appropriate, but it is also a fairly common example that
people copy (it avoids needing to work out which specific category
is affecting some operation or other).


OK. There's plenty of copypasta that has tripped up users before. Can the
shell protect them from that?



   | You're assuming a certain behavior and going on from there. The shell
   | doesn't have to do it that way.

Again, no it doesn't, but it seemed to me when I tried it, that this
way gives the most desirable outcome.

   | I'd argue that the shell should modify the locale categories that affect
   | its behavior.

How do we know which they are?   That is, locale settings can affect
libc operations in some cases, and if we're writing portable code
(which bash at least attempts to do, the NetBSD shell a little less
so) can you really be sure that some libc function that is being
used won't be affected by a locale setting that you've never heard of?


You really can't. All you can do is document the locale categories you
use, and which ones you modify based on shell variables. If the user sets
an environment variable, say, LC_NAME (one of the GNU libc extensions),
that's not going to affect the shell's behavior because the shell won't
call setlocale(LC_NAME, whatever). If LC_NAME is in the environment when
the shell starts, and setlocale(LC_ALL, "") pays attention to it, I'd
argue that that's the user's intent.



That's no issue at all if the shell just does setlocale(LC_ALL, "");
as part of 

Bash skips empty lines when reading history file in multiline mode

2025-04-04 Thread Jens Schmidt
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -Werror=implicit-function-declaration -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wall
uname output: Linux sappc1 6.12.17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.17-1 (2025-03-01) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

[bashbug output manually adapted to real version ...]
Bash Version: 5.3
Commit: a6767763de5e7859107711b166a64a9e4a77a8ae
Release Status: compiled from recent devel branch


Description:
  Bash skips empty lines when reading history file in multiline mode.


Repeat-By:

Start Bash as "bash --norc --noprofile".  Then execute the
following commands:

- repro 1/2 -
shopt -s cmdhist lithist
HISTTIMEFORMAT='%F %T '
HISTFILE="/tmp/testhist"
cat << 'EOH' > $HISTFILE
#1

cat << 'EOF' | wc -l
abc

def

ghi
EOF

EOH
history -c
history -r
- repro 1/2 -

Finally execute a plain "history", which for me results in
the following output:

- repro 2/2 -
bash-5.3$ history
1  2025-04-04 22:09:31 history -r
2  1970-01-01 01:00:01 cat << 'EOF' | wc -l
abc
def
ghi
EOF
3  2025-04-04 22:09:34 history
- repro 2/2 -

That is, all the empty lines, most notably those in the here doc
have been skipped by Bash while reading the history.
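
The effect can be modelled with a few lines of shell. This is a
simplified sketch, not readline's code: `read_history_body` is a
hypothetical function, and treating any line that starts with "#"
followed by a digit as a timestamp is an assumption about the file
format. With a first argument of 0 it drops empty lines, as observed
above; with 1 it keeps them.

```sh
# Simplified model of reading one multiline history entry from stdin.
# $1 = 0: skip empty lines (the behaviour reported here)
# $1 = 1: keep empty lines (the behaviour the report asks for)
read_history_body() {
    keep_empty=$1
    while IFS= read -r line; do
        case $line in
            '#'[0-9]*) continue ;;    # timestamp line, not part of the body
        esac
        if [ -z "$line" ] && [ "$keep_empty" -eq 0 ]; then
            continue                  # empty line is silently dropped
        fi
        printf '%s\n' "$line"
    done
}
```

Fed the history file from the repro, `read_history_body 0` emits five
lines, matching the history output shown above, while
`read_history_body 1` emits nine, including the empty lines inside the
here document.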


Fix:

The following patch fixes this bug (if it is one for you):

diff --git a/lib/readline/histfile.c b/lib/readline/histfile.c
index 9a259146..eeae20b1 100644
--- a/lib/readline/histfile.c
+++ b/lib/readline/histfile.c
@@ -415,9 +415,10 @@ read_history_range (const char *filename, int from, int to)
else
  *line_end = '\0';
 
-   if (*line_start)
+   /* Process empty lines when reading multiline entries. */
+   if (*line_start || history_multiline_entries)
  {
-   if (HIST_TIMESTAMP_START(line_start) == 0)
+   if (!*line_start || HIST_TIMESTAMP_START(line_start) == 0)
  {
if (last_ts == NULL && history_length > 0 && history_multiline_entries)
  _hs_append_history_line (history_length - 1, line_start);


Thanks for maintaining Bash!