Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-10 Thread Chet Ramey
On 8/9/22 4:50 PM, Koichi Murase wrote:
> On Wed, Aug 10, 2022 at 2:07, Alan Coopersmith wrote:
>>>> Thanks for the report. The eight-bit settings are auto-set once, when
>>>> readline is first called, but I'll see if it makes sense to change them
>>>> on every call.
>>>
>>> It's fairly easy. I'll make the change for the next devel branch push and
>>> bash-5.2-rc3.
>>
>> Thanks for the quick investigation!
> 
> Does it mean custom values of these readline variables will be lost
> every time LANG or LC_{CTYPE,ALL} is changed even if a user or program
> intentionally sets them up?

It means those settings will now mirror the locale.

> We often temporarily change LANG or LC_* to perform some binary
> operations [such as counting the number of bytes of data and safely
> removing trailing x from the result of $(command;printf x)]. 

Do you often do this in interactive shells? Often enough to make a
difference? Across multiple calls to readline? And, if the change is
intended to be temporary, why would you not want the relevant
readline variables to reflect the locale when you were finished?


> Also, if these readline variables would be cleared every time, it
> seems to me that these readline variables would be effectively
> unconfigurable and would lose the point of their existence, or we
> could not touch LANG or LC_* at all after the initial setup.

It seems to me that the scenario Alan describes is much more common.


> Is it possible to make three states of the readline variables,
> `on/off/auto', and make `auto' the default, which determines the
> behavior depending on the current locale? In this case, the actual
> behavior on/off can be cached in another variable and can be updated
> on the change of LANG/LC_* when the readline variable has the value
> `auto'.

I'm not going to make this much of a change at this point in the release
process. I was willing to make the change I did because the changed
behavior is a superset of the previous behavior.

So, assuming we say that the scenario Alan outlined is reasonable (it is),
it looks like there are four alternatives:

1. Do nothing; maintain the bash-5.1 behavior and force the change to the
   user.

2. Leave the new function in place; automatically adjust to locale
   changes.

3. Push it off to the application: introduce a new readline API that
   applications can call when locale variables change. This is very cheap.

4. Push it onto readline: instead of checking the locale and making the
   eight-bit variables mirror it on each call, make readline check for
   locale changes (well, LC_CTYPE) and reset the eight-bit variables only
   if the current value doesn't match the value from the last call.

The last option is about as much of a change as I'm willing to make at
this point.
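
A toy model of option 4 may make the difference from option 2 easier to
see. This is only a sketch in bash, not readline's code: `eight_bit',
`last_ctype', `effective_ctype' and `sync_eight_bit' are hypothetical names,
and treating every locale other than C/POSIX as eight-bit is an assumption
made for illustration.

```bash
#!/usr/bin/env bash
# Toy model of option 4 (hypothetical names; not readline's implementation).
# eight_bit stands in for the group "convert-meta off, input-meta on,
# output-meta on".
last_ctype=     # LC_CTYPE locale seen on the previous simulated readline call
eight_bit=off   # current setting; the user may override it between calls

# Effective LC_CTYPE locale name: LC_ALL, then LC_CTYPE, then LANG, else C.
effective_ctype() {
  printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

# Run at the start of each simulated readline call. The locale name may be
# passed explicitly; otherwise it is taken from the environment.
sync_eight_bit() {
  local cur=${1-$(effective_ctype)}
  if [[ $cur == "$last_ctype" ]]; then
    return 0              # locale unchanged: leave the user's setting alone
  fi
  last_ctype=$cur
  case $cur in
    C|POSIX) eight_bit=off ;;   # assumption: only C/POSIX count as 7-bit
    *)       eight_bit=on ;;
  esac
}
```

In this model a user override survives as long as the locale seen at the
next call equals the one seen at the previous call, so a locale change that
is made and undone between two calls goes unnoticed. Under option 2 the
override would be replaced on every call.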

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-10 Thread Chet Ramey
On 8/10/22 10:21 AM, Chet Ramey wrote:

> I'm not going to make this much of a change at this point in the release
> process. I was willing to make the change I did because the changed
> behavior is a superset of the previous behavior.
> 
> So, assuming we say that the scenario Alan outlined is reasonable (it is),
> it looks like there are four alternatives:
> 
> 1. Do nothing; maintain the bash-5.1 behavior and force the change to the
>    user.
> 
> 2. Leave the new function in place; automatically adjust to locale
>    changes.
> 
> 3. Push it off to the application: introduce a new readline API that
>    applications can call when locale variables change. This is very cheap.
> 
> 4. Push it onto readline: instead of checking the locale and making the
>    eight-bit variables mirror it on each call, make readline check for
>    locale changes (well, LC_CTYPE) and reset the eight-bit variables only
>    if the current value doesn't match the value from the last call.
> 
> The last option is about as much of a change as I'm willing to make at
> this point.

There is a fifth option:

5. Make the locale-aware behavior dependent on a new readline option, which
   would be enabled by default.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



cut loadable outputs extra newlines

2022-08-10 Thread Geir Hauge
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: darwin20.6.0
Compiler: gcc
Compilation CFLAGS: -g -O2
uname output: Darwin macspear 20.6.0 Darwin Kernel Version 20.6.0: Tue Apr 19 2$
Machine Type: x86_64-apple-darwin20.6.0

Bash Version: 5.2
Patch Level: 0
Release Status: rc3

Description:
examples/loadables/cut.c uses zgetline, which includes the newline
delimiter in the line it reads, but the rest of the code expects the
line to not contain the newline.

Repeat-By:
$ ./bash -c 'enable -f examples/loadables/cut cut ; cut -c4-' <<< $'abcdef\nghijkl'
def

jkl

$

Fix:
Strip off the newline right after reading the line.  Attached patch
fixes the problem.
diff --git a/examples/loadables/cut.c b/examples/loadables/cut.c
index ed4972d2..280c5231 100644
--- a/examples/loadables/cut.c
+++ b/examples/loadables/cut.c
@@ -417,6 +417,8 @@ cutfile (v, list, ops)
   while ((n = zgetline (fd, &line, &llen, '\n', unbuffered_read)) != -1)
     {
       QUIT;
+      if (line[n] == '\n')
+        line[n] = '\0';
       cutline (v, line, ops);   /* can modify line */
     }
   if (fd > 0)


Arithmetic expression: recursive VAR evaluation suppresses desired VAR assignment

2022-08-10 Thread Steffen Nurpmeso
Hello.

Given this file

  # make this work with (ba)sh \
  command -v shopt && shopt -s expand_aliases;\
  alias p=printf;alias e=echo;alias s=export
  s I1=I1=10 I2=5 I3=I2+=1;p "<$((I1=0?I1:I3))>";e "<$I1><$I2><$I3>"
  s I1=I1=10 I2=5 I3=I2+=1;p "<$((I1=1?I1:I3))>";e "<$I1><$I2><$I3>"

I now see (after having fixed yet another bug in my Dijkstra
implementation, which was heavily inspired by the one from
busybox, so I wanted to donate it back with a correct ?: etc.):

  #?0|kent:tmp$ /x/src/busybox.git/busybox sh xxx.sh
  <6><0><6>
  <1><1><5>
  #?0|kent:tmp$ bash xxx.sh
  shopt
  <6><6><6>
  <10><10><5>

I think the busybox variant is correct.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: Arithmetic expression: recursive VAR evaluation suppresses desired VAR assignment

2022-08-10 Thread Koichi Murase
On Thu, Aug 11, 2022 at 9:01, Steffen Nurpmeso wrote:
>   #?0|kent:tmp$ /x/src/busybox.git/busybox sh xxx.sh
>   <6><0><6>
>   <1><1><5>

It seems your busybox interprets « I1=0?I1:I3 » as « (I1=0)?I1:I3 »,
but this violates POSIX XCU 2.6.4 and XCU 1.1.2. Also, the above
behavior doesn't seem to be reproduced by recent versions of busybox.
I have tried 1.23.2, 1.35.0, and 1.36.0 (master 7c2a3bd), but all
produce the following result:

<6><6><6>
<10><10><6>

Which version of busybox do you use?

$ busybox | head -1

The behavior of recent versions of busybox is still broken. The third
number in the second line is 6, which implies that busybox evaluates
the discarded branch of the conditional operator (i.e., expr3 of
true?expr2:expr3). This doesn't happen in any other shell that
supports recursive arithmetic evaluation. I tried bash, ksh93,
mksh, zsh, posh, and osh, and all produced the same result as bash,
while yash, ash, and dash do not support recursive arithmetic
evaluation.
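
The two readings can be compared directly in bash by adding the parentheses
explicitly. The following is a sketch, not part of the original report:
`try_expr' is a hypothetical helper that starts from fresh copies of the
variables and prints the result followed by I1 and I2, which appears to be
what the three fields in the outputs above correspond to.

```bash
#!/usr/bin/env bash
# Evaluate one arithmetic expression against fresh copies of the variables
# from the report, then print "<result><I1><I2>".
try_expr() {
  local I1='I1=10' I2=5 I3='I2+=1' result
  result=$(( $1 ))
  printf '<%s><%s><%s>\n' "$result" "$I1" "$I2"
}

# Assignment binds more loosely than ?:, so these parse as I1=(cond?I1:I3).
try_expr 'I1=0?I1:I3'     # <6><6><6>
try_expr 'I1=1?I1:I3'     # <10><10><5>

# Forcing the other grouping gives the values reported for the old busybox.
try_expr '(I1=0)?I1:I3'   # <6><0><6>
try_expr '(I1=1)?I1:I3'   # <1><1><5>
```

In the second and fourth calls I2 stays at 5 because bash does not evaluate
the discarded branch I3.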

> I think the busybox variant is correct.

I think both your version and recent versions of busybox sh are broken.

--
Koichi



Re: Arithmetic expression: recursive VAR evaluation suppresses desired VAR assignment

2022-08-10 Thread Robert Elz
I would agree that the values bash is producing don't make a lot
sense, but I don't think you can say that either is correct - one
may be more desirable than the other, but that's it.

Var expansion (not $I but just I) is defined for $(( )) only when
I contains an integer constant (optionally signed) - anything else
supported is an extension, and can be defined to work however the
implementor wants it to work.

kre




Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-10 Thread Koichi Murase
On Wed, Aug 10, 2022 at 23:21, Chet Ramey wrote:
> > Does it mean custom values of these readline variables will be lost
> > every time LANG or LC_{CTYPE,ALL} is changed even if a user or program
> > intentionally sets them up?
>
> It means those settings will now mirror the locale.
>
> > We often temporarily change LANG or LC_* to perform some binary
> > operations [such as counting the number of bytes of data and safely
> > removing trailing x from the result of $(command;printf x)].
>
> Do you often do this in interactive shells?

Yes, though I don't mean that I type the above kinds of commands
directly on the command line and run them; I use them in functions
called through `bind -x'.  Also, the above cases (counting bytes and
removing trailing x) are just examples; I set locale variables for
various purposes in my actual code.  For example, I often type and run
commands of the form

  LANG=C some-commands-or-functions

to get the default error messages that are not locale-specific (though
I could use LC_MESSAGES=C instead, yet LANG=C is easier to type for
me).  I normally use the locale LANG=ja_JP.UTF-8 by default, so the
commands output error messages in Japanese by default.  This is not
useful when I would like to search for a solution on the internet
because there is almost no information about the Japanese error messages.

> Often enough to make a difference?

My `bind -x' functions use `LC_ALL=' and `LC_CTYPE=C' on every
keystroke, for example, in combination with `builtin read'.  They also
use `LC_ALL=' for other purposes on almost every keystroke.  Some vi
bindings also use `LC_CTYPE=C'.  My completion functions also change
`LC_ALL' and `LC_CTYPE'.  For example, `LC_CTYPE=C' is used in
calculating a PJW hash code of a given string.  I haven't carefully
checked, but there are probably other cases of changing `LC_CTYPE'.
Also, `LC_ALL=' is used everywhere.
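
For readers unfamiliar with these idioms, here is a minimal sketch of the
two examples mentioned earlier (counting bytes and removing a trailing x).
It is not taken from my actual code; `byte_length' and `capture_exact' are
hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Print the length of $1 in bytes rather than in characters.
byte_length() {
  local LC_ALL=C            # scoped to this function; undone on return
  printf '%s\n' "${#1}"
}

# Run a command and store its exact output, trailing newlines included,
# in the variable named by $1 (do not pass _name or _out as the name).
# The command's exit status is not preserved in this sketch.
capture_exact() {
  local _name=$1 _out
  shift
  _out=$("$@"; printf x)    # the sentinel x protects trailing newlines
  local LC_ALL=C            # strip the sentinel as a single byte
  printf -v "$_name" '%s' "${_out%x}"
}
```

For example, `byte_length $'\xe3\x81\x82'' prints 3 even in a UTF-8 locale,
and `capture_exact v printf 'a\n\n'' leaves both newlines in v.  Both
functions change the locale on every call, which is why the handling of
such temporary changes matters here.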

> Across multiple calls to readline?

I think I am missing the point.  What does ``multiple calls to
readline'' mean?  Is the situation different from a single call to
readline?

Hmm, I think I first need to make it clear that the behavior of my
code, which is supposed to be sourced in an interactive session by
users, is unaffected by these readline settings.  I just do not want
to break or change the existing user settings inside the functions
that I provide.  The behavior of my functions is unaffected (except
for « bind -x '"\M-x":'  » which is affected by `convert-meta',
for which I already implemented a workaround) because it doesn't try
to communicate with readline inside a single call of `bind -x'.  The
problem is that, with the new automatic adjustment of these readline
variables, the settings by users can be lost after using `LC_ALL=' or
`LC_CTYPE=C' inside my functions.

I believe this is a general problem for writers of Bash
configurations. `bash_completion' also uses `LC_CTYPE=C' and
`LC_ALL=C'.  The behavior of such configurations itself will be
unaffected by the change of readline settings, but they need to
implement special treatment to preserve the user settings if the user
settings will be lost by changing locales.

> And, if the change is intended to be temporary, why would you not
> want the relevant readline variables to reflect the locale when you
> were finished?

Because I would not like to break the users' settings.  In general, a
third-party Bash configuration should not overwrite the users'
settings as long as the configuration does not need the setting.

> > Also, if these readline variables would be cleared every time, it
> > seems to me that these readline variables would be effectively
> > unconfigurable and would lose the point of their existence, or we
> > could not touch LANG or LC_* at all after the initial setup.
>
> It seems to me that the scenario Alan describes is much more common.

I agree with this point because I have also faced this problem
for « bind -x '"\M-x":...' » vs « convert-meta » before.  For this
problem, I have added a partial workaround on my side [1], where I
decided to save and restore `convert-meta' before and after running
`bind -x'.  Actually, the patch [2] I posted to this list before was
part of the workaround for this problem.

[1] https://github.com/akinomyoga/ble.sh/commit/f32808070796d3978787f4491f812d06a629ab3f
[2] https://lists.gnu.org/archive/html/bug-bash/2019-02/msg00036.html
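
The general shape of such a save-and-restore workaround might look like the
sketch below.  It is not the code from [1]; the function names and the
`saved_meta_settings' variable are hypothetical, and it assumes that
`bind -v' prints one `set NAME VALUE' line per readline variable.

```bash
#!/usr/bin/env bash
# Keep only the eight-bit related "set ..." lines from `bind -v'-style input.
filter_meta_settings() {
  local line
  while IFS= read -r line; do
    case $line in
      'set convert-meta '*|'set input-meta '*|'set output-meta '*)
        printf '%s\n' "$line" ;;
    esac
  done
}

# Snapshot the current values (needs a bash built with readline).
save_meta_settings() {
  saved_meta_settings=$(bind -v 2>/dev/null | filter_meta_settings)
}

# Re-apply the snapshot, one "set NAME VALUE" line at a time.
restore_meta_settings() {
  local line
  while IFS= read -r line; do
    [[ -n $line ]] && bind "$line" 2>/dev/null
  done <<< "$saved_meta_settings"
  return 0
}
```

A configuration would call `save_meta_settings' before touching `LC_ALL' or
`LC_CTYPE' and `restore_meta_settings' afterwards.  Every third-party
configuration that changes the locale would need to carry something like
this if the user settings were reset on each locale change.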

I agree that we should somehow change the current behavior that the
default values of *-meta settings are determined by the locale on the
startup of Bash, but the proposed change will break the opposite
scenario while it solves Alan's scenario.

The combination (UTF-8 & 7bit-mode) doesn't make much sense, so we
might force (UTF-8 & 8bit-mode) for UTF-8 or similar for multibyte
character encodings with 8-bit bytes.  [ Note: Here, 7bit/8bit-mode
means « convert-meta on/off » and « {input,output}-meta off/on »,
respectively. ] However, on the opposite side of the single-byte
character encoding (e.g. for C), I t