Bash extended glob - Negation

2021-04-30 Thread Inian Vasanth
Configuration Information:
Machine: x86_64
OS: darwin20.2.0
Compiler: clang
Compilation CFLAGS: -DSSH_SOURCE_BASHRC
uname output: Darwin C02YW0DVLVCH 20.3.0 Darwin Kernel Version 20.3.0: Thu
Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64
Machine Type: x86_64-apple-darwin20.2.0

Bash Version: 5.1
Patch Level: 4
Release Status: release

Description:

A wildcard expression after a negate extglob construct is not returning
right results.

Steps

Consider the following files and assume, that I want to exclude files
ending with 8

shopt -s extglob
touch atop_20210330 atop_20210427 atop_20210428
ls -1 atop_20210@(3|4)+([0-4])!(8)
atop_20210330
atop_20210427

But when I add an extra * at the end of the expression like

ls -1 atop_20210@(3|4)+([0-4])!(8)*
atop_20210330
atop_20210427
atop_20210428

Why is it the case? Shouldn't the glob expression just serve the !(8)
pattern to exclude and simply do a no-op for the extra * character?


-- 
Regards,
INIAN VASANTH P


Re: Bash extended glob - Negation

2021-04-30 Thread Robert Elz
Date: Fri, 30 Apr 2021 12:28:32 +0530
From: Inian Vasanth
Message-ID:  


  | Why is it the case? Shouldn't the glob expression just serve the !(8)
  | pattern to exclude and simply do a no-op for the extra * character?

The '*' means "zero or more of the preceding", the preceding was
"not an 8" so anything which ends with zero (or more) not 8's matches.
Since everything ends with 0 (or more) not 8's (or not anything elses)
everything matches.

kre




Re: Add {{ }}, ${{ }}, return n [x], and $:

2021-04-30 Thread Robert Elz
Date: Fri, 30 Apr 2021 14:53:58 +0800
From: konsolebox
Message-ID:  


  | {{ }} - Similar to { }, but runs like a function so variables can be
  | declared local

From below, I suspect you mean more, though you haven't fully defined what you
mean, but extending local to be able to apply in a block is not a terrible
idea, I have considered it from time to time (not for bash which I have
nothing to do with except as a user) - but needs no special syntax, just the
implementation, and removal of the restriction on the local builtin (or
whatever underlies that in bash) that only permits it when a function is
active.   I don't know about bash, but in the shell I maintain the
implementation would be trivial.

But I suspect that you want this to be more like (what I think is called)
a lambda expression - that is, essentially a function definition that
is immediately called, and then forgotten, but it isn't clear if you want
it to be able to have parameters or not.
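
As a rough illustration of the "define, call once, forget" idea using only what bash has today, here is a minimal sketch. The function name `_init_block` and the variable `i` are made up for the example; nothing here is the proposed {{ }} syntax.

```shell
#!/usr/bin/env bash
# Sketch only: emulate a block with local variables by defining a
# throwaway function, calling it once, and removing it again.

i=global              # a variable the surrounding script must not clobber

_init_block() {       # hypothetical name for the throwaway function
    local i           # bash only accepts local inside a function
    for i in 1 2 3; do
        : "iteration $i"
    done
}
_init_block           # call it once...
unset -f _init_block  # ...and forget it

printf 'i=%s\n' "$i"  # the outer i is untouched
```

The loop variable lives only inside the function, so the outer `i` keeps its value without any explicit unset.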

  | return n [x] - Assigns n to $? and x to $: if specified

I think there are two aspects to that.  First, you want return to work
inside the proposed {{ }} syntax - that would need the new syntax, it cannot
be done with a regular { } block without breaking backwards compat.

And second to have two (just 2, or can the 'x' be anything?) return
values, one being the exit status, and the other becoming the value of
a magic new special parameter.   Does this variant of return work in
regular functions as well?

The last part of that is very unlikely - I have toyed with (and actually 
implemented a few times) new special parameters, but never released a
version with a new one in it - aside from (perhaps, and only perhaps) zsh
I don't think there's ever been a new special parameter added to any
shell, ever, since the original set in the first Bourne sh from 1979 (or
before).   For some reason that list is considered set in stone.

A more normal approach (more common approach for modern shell extensions)
would be to use a traditional, but special, variable name (kind of like
bash/ksh uses REPLY for read, and shells use OPTIND OPTARG etc with getopts).

But of course, if you do it that way, then instead of

return n x

you can already do

MAGIC_VAR=x
return n

and not have to alter the syntax of return.
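
A minimal sketch of that pattern, assuming an ordinary variable named `RESULT` (an illustrative name, in the spirit of REPLY) and a hypothetical function `get_answer`:

```shell
#!/usr/bin/env bash
# Sketch only: return an exit status and a string value without new syntax.

get_answer() {                 # hypothetical function
    RESULT="returned value"    # the string value goes in a plain variable
    return 3                   # the exit status goes in $?
}

status=0
get_answer || status=$?        # capture the status without tripping set -e
printf 'status=%s result=%s\n' "$status" "$RESULT"
```

No subshell is involved, so the assignment made inside the function is visible to the caller.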

  | ${{ }} - Executes like {{ }} but also expands to the value of $:

That would need a lot more specification.   Is there a subshell involved?
what's the syntax for the enclosed (missing here) piece, and how is it
supposed to be parsed?   One assumes that everything there is intended to
be a part of a word, are the results subject to later expansions?

  | $: will be reset to an empty string at the beginning of any
  | non-assignment execution context.

I'm not sure exactly what that means, but regardless, why?   Whether it
is a special parameter, or just a variable (with a value used by magic in
some cases) I'd expect it to work more like $! - something causes it
to be set, then it simply retains its value until something else happens
that makes it change value - in the interim it can be referenced as
many times as you want.
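
To illustrate the existing behaviour of $! referred to here, a small sketch (the variable names `first` and `second` are just for the example):

```shell
#!/usr/bin/env bash
# Sketch only: $! keeps its value until another background job replaces it.

sleep 0 &                        # any background job sets $!
first=$!
: unrelated foreground command   # foreground commands do not touch $!
second=$!                        # same value as before
wait

printf 'first=%s second=%s\n' "$first" "$second"
```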

  | The string can be a constant static for the sake of efficiency.

I don't know about in the internals of bash, but it almost certainly
can't.  I also wouldn't worry about efficiency for something as trivial
as this (saving a string value) - shells do a LOT of that.

  | The greatest goal of these features is to help eradicate the
  | inefficient var=$(echo "returned value") practice people do.

So, an alternative would simply be to make that efficient...   In cases
like that there's no need for the shell to fork (though most do).
Doing this is just a matter of careful coding, it requires no new syntax
or anything else that affects any existing scripts (where {{, though
not ${{ might) and has the benefit that everyone benefits, without needing
to rewrite code.   Then it just becomes a question of whether those benefits
are worth the cost and complexity of the implementation.
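
For comparison, and separate from both the proposal and the optimisation discussed here, bash's existing `printf -v` already lets a function hand back a string without a command substitution. A minimal sketch, where `join_words` is a hypothetical helper that stores its result in the variable named by its first argument:

```shell
#!/usr/bin/env bash
# Sketch only: pass back a string with printf -v instead of var=$(echo ...).

join_words() {                    # hypothetical helper
    local dest=$1                 # name of the caller's variable
    shift
    printf -v "$dest" '%s' "$*"   # assign without a command substitution
}

join_words greeting hello world   # sets greeting in the current shell
slow=$(echo "hello world")        # the command-substitution form under discussion

printf '%s\n' "$greeting" "$slow"
```

The caller must not use the name `dest` for its own variable, since the helper's local would shadow it.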

  | Second
  | would be so I'm sure I don't unset a global variable when unsetting an
  | iterator in an initialization script to keep the variables clean, even
  | just by theory.

The local variables in blocks part of your proposal is the one part that
is both realistic and potentially useful (and probably fairly easy).

kre




Re: Bash extended glob - Negation

2021-04-30 Thread Oğuz
On Friday, 30 April 2021, Robert Elz wrote:

> Date: Fri, 30 Apr 2021 12:28:32 +0530
> From: Inian Vasanth
> Message-ID:   3...@mail.gmail.com>
>
>   | Why is it the case? Shouldn't the glob expression just serve the !(8)
>   | pattern to exclude and simply do a no-op for the extra * character?
>
> The '*' means "zero or more of the preceding", the preceding was
> "not an 8" so anything which ends with zero (or more) not 8's matches.


No, this isn't regex. `*' alone means zero or more of anything. What OP
doesn't understand here is that `!(8)' can match the empty string. A
simpler example would be:

$ touch a b c
$ echo !([ab])
c
$ echo !([ab])*
a b c


> Since everything ends with 0 (or more) not 8's (or not anything elses)
> everything matches.
>
> kre
>
>
>

-- 
Oğuz
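
The same point can be checked without touching the filesystem, using pattern matching in [[ ]]. This is only a sketch; the variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch only: why a trailing * defeats !(8).
shopt -s extglob

# !(8) matches any string except exactly "8", including the empty string.
[[ "" == !(8) ]] && empty_ok=yes || empty_ok=no
[[ "8" == !(8) ]] && eight_ok=yes || eight_ok=no

# With a trailing *, !(8) can match nothing and * can take the "8".
[[ "8" == !(8)* ]] && star_ok=yes || star_ok=no

printf 'empty:%s eight:%s eight-with-star:%s\n' "$empty_ok" "$eight_ok" "$star_ok"
```

This prints `empty:yes eight:no eight-with-star:yes`.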


Re: Add {{ }}, ${{ }}, return n [x], and $:

2021-04-30 Thread konsolebox
On Fri, Apr 30, 2021 at 4:09 PM Robert Elz  wrote:
>
> Date: Fri, 30 Apr 2021 14:53:58 +0800
> From: konsolebox
> Message-ID:  
> 
>
>   | {{ }} - Similar to { }, but runs like a function so variables can be
>   | declared local
>
> From below, I suspect you mean more, though you haven't fully defined what you
> mean, but extending local to be able to apply in a block is not a terrible
> idea,

That was my initial idea too, but I think it's better to just leave {
} alone as modifying its behavior still has a chance of breaking
scripts.

But maybe a shell option can be introduced.

In that case {} will almost be similar to {{ }} except that calling
return inside would exit the function.

> but it isn't clear if you want
> it to be able to have parameters or not.

It will inherit or access the parent context's arguments just like in
{ } and ( ), virtually.

>   | return n [x] - Assigns n to $? and x to $: if specified

> I think there are two aspects to that.  First, you want return to work
> inside the proposed {{ }} syntax - that would need the new syntax, it cannot
> be done with a regular { } block without breaking backwards compat.

I don't think it would break compatibility.  `return n` will still
work.  Is there something I missed?

> And second to have two (just 2, or can the 'x' be anything?) return
> values, one being the exit status, and the other becoming the value of
> a magic new special parameter.   Does this variant of return work in
> regular functions as well?

Yes that's the plan.

>   | ${{ }} - Executes like {{ }} but also expands to the value of $:
>
> That would need a lot more specification.   Is there a subshell involved?

There won't be, just like in {{ }}.

> what's the syntax for the enclosed (missing here) piece, and how is it
> supposed to be parsed?   One assumes that everything there is intended to
> be a part of a word, are the results subject to later expansions?

${{ }} would be parsed and expanded just like $() but commands within
it will run in the context of a temporary anonymous function.

>   | $: will be reset to an empty string at the beginning of any
>   | non-assignment execution context.
> I'm not sure exactly what that means, but regardless, why?   Whether it
> is a special parameter, or just a variable (with a value used by magic in
> some cases) I'd expect it to work more like $! - something causes it
> to be set, then it simply retains its value until something else happens
> that makes it change value - in the interim it can be referenced as
> many times as you want.

It's to emulate how $? behaves, and also so people would write more
readable code.  The value of $: should be perceived as temporary.
Keeping it unchanged for a long time makes it difficult to find the
source that assigned it unless you wrote the code yourself.  I believe
each function should also have its formal default return value so an
old value won't be returned if a command fails to execute.

>
>   | The string can be a constant static for the sake of efficiency.
>
> I don't know about in the internals of bash, but it almost certainly
> can't.  I also wouldn't worry about efficiency for something as trivial
> as this (saving a string value) - shells do a LOT of that.

It's an optional enhancement to implementation.

>   | The greatest goal of these features is to help eradicate the
>   | inefficient var=$(echo "returned value") practice people do.
>
> So, an alternative would simply be to make that efficient...   In cases
> like that there's no need for the shell to fork (though most do).
> Doing this is just a matter of careful coding, it requires no new syntax
> or anything else that affects any existing scripts (where {{, though
> not ${{ might) and has the benefit that everyone benefits, without needing
> to rewrite code.

This optimization can be done as well as a separate target but $() can
never be optimized enough as it almost would always rely on an opened
pipe.  It would also be difficult to tell if one of the commands
within it would send a message to &1 or not, or if one of them would
execute `exec`.  You also would almost always want to enable the
subshell to avoid the parent from getting its parameters altered.

> Then it just becomes a question of whether those benefits
> are worth the cost and complexity of the implementation.

Not sure about the complexity but I think the benefits would be worth it.


--
konsolebox



Re: RFE - support option for curses idea of term [tab]size.

2021-04-30 Thread Greg Wooledge
On Thu, Apr 29, 2021 at 07:39:37PM -0700, L A Walsh wrote:
>It doesn't always do it correctly because it doesn't always know
> where it is in a line.  As such it has a bug that would be fixed by
> having it really know where it was at (from talking with libcurses)
> as well as what the tabstops were really set to.

First of all, "just link with libcurses" does not magically grant the
power to know the cursor position in all terminals.  Many of them have
no such capability at all, and of the ones that do, the means of
learning that information differs greatly.

Even curses applications don't try to ask the terminal where the cursor
is.  They *put* the cursor where they need it to be, and keep track of
where they left it.

Second of all, converting bash/readline to use curses instead of the
primitive terminal calls that it currently uses would involve a whole
new layer of complexity that I do not feel is justified, or even wanted.

Curses applications take control of the entire terminal.  They are
initialized, and they must be cleaned up.  At any time in the middle,
the state of the terminal is not really well-defined.  The curses
application could be in the middle of some operation.  Any outside entity
that touches the terminal in any way could mess everything up.

But with shell scripts (or interactive shells), the normal operating
procedure is for the shell to spend a lot of its time waiting for an
external command to finish.  That external command may want to write
to the terminal, or even read from it.

This works out fine for bash as it currently exists, because bash turns
over control of the terminal in a relatively well-defined state, and
resumes control of the terminal when the external program is finished.
The only assumption bash makes upon resumption is that the cursor will be
in the leftmost column.



Re: RFE - support option for curses idea of term [tab]size.

2021-04-30 Thread Koichi Murase
On Fri, 30 Apr 2021 at 11:41, L A Walsh wrote:
> On 2021/04/26 17:16, Chet Ramey wrote:
> > Readline is tab-agnostic, or tab-stop-agnostic, in a sense. It
> > performs tab
> > expansion itself during redisplay, and currently uses a tab stop of 8.
> > That's not user-settable.
> ---
> It doesn't always do it correctly because it doesn't always know
> where it is in a line.

I think Readline always knows the current cursor position in terminals
of any tabstops because it just never sends TABs to terminals. "It
performs tab expansion itself" means that Readline internally converts
TABs to an appropriate number of spaces using virtual tabstops of 8.
In fact, I rearranged tabstops to 4n by the following command (in
Mintty which supports CTC [CSI W] and HTS [ESC H]), but Readline
continues to align characters with tabstops of 8n and there are no
glitches.

$ printf '\r\e[5W'; for a in {4..80..4}; do printf  '\e[4C\eH'; done

Maybe it is more friendly to reflect the current terminal settings of
tabstops, but I wouldn't call it a bug that Readline doesn't emulate
it.
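
The conversion described above, TABs replaced by spaces up to the next multiple of 8 columns, can be sketched in bash as follows. This is not Readline's code: `expand_tabs` is a hypothetical helper, and it assumes every character is one column wide.

```shell
#!/usr/bin/env bash
# Sketch only: expand TABs to spaces using virtual tab stops every 8 columns.

expand_tabs() {                      # hypothetical helper
    local text=$1 width=${2:-8} out='' ch i pad
    for (( i = 0; i < ${#text}; i++ )); do
        ch=${text:i:1}
        if [[ $ch == $'\t' ]]; then
            # spaces needed to reach the next multiple of width
            pad=$(( width - ${#out} % width ))
            printf -v out '%s%*s' "$out" "$pad" ''
        else
            out+=$ch
        fi
    done
    printf '%s\n' "$out"
}

expanded=$(expand_tabs $'ab\tc')
printf '[%s]\n' "$expanded"          # "ab", six spaces, then "c"
```

Because the spaces are produced before anything reaches the terminal, the terminal's own tab stops never come into play.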

> That readline can't be used with a variable font or tabsize seems like
> a limitation associated with only relating to computers through a fixed-size
> window w/fixed-sized characters & fixed tabstops.

I don't think curses/ncurses or any other terminal applications
support the cursor movements taking account of variable character
sizes of proportional fonts. Currently, there is no terminal protocol
to retrieve/control the variable width of each character in the
proportional fonts. It's not the specific problem of Readline.



Re: RFE - support option for curses idea of term [tab]size.

2021-04-30 Thread Chet Ramey

On 4/29/21 10:39 PM, L A Walsh wrote:
> On 2021/04/26 17:16, Chet Ramey wrote:
>> On 4/26/21 7:19 PM, L A Walsh wrote:
>>> I'm not clear if termcap lib has this or not, when the curses
>>> library is in use, it supports the idea of reading and setting
>>> the term [tab]size.
>>
>> Can't you do this with `stty size' already?
>
> Setting size: sometimes, depends on the Terminal, but having readline
> know about where tabs expand to is only handled through libcurses and
> not device driver, which I believe is where stty asserts its effects.

OK, so we are talking about tab stops here, not the terminal size as
you originally wrote.

Readline doesn't know or care where tabs expand.

>> Readline is tab-agnostic, or tab-stop-agnostic, in a sense. It performs tab
>> expansion itself during redisplay, and currently uses a tab stop of 8.
>> That's not user-settable.
>
> ---
>    It doesn't always do it correctly because it doesn't always know
> where it is in a line.  As such it has a bug that would be fixed by
> having it really know where it was at (from talking with libcurses)
> as well as what the tabstops were really set to.

If you want to report a bug, you're going to have to be more specific.

>    I.e. it would be more user-friendly if readline considered the knowledge
> of the terminal[-emulator] that it is running under when possible.  Taking
> pride in doing the wrong thing that doesn't match the terminals settings
> shouldn't really be considered a plus or a design goal.

Ah, so what you would like is a user-settable value for tab stops. What
value do you think this would provide to the user community?

> That readline can't be used with a variable font or tabsize seems like
> a limitation associated with only relating to computers through a fixed-size
> window w/fixed-sized characters & fixed tabstops.

Readline doesn't know the character widths of variable-width fonts, and
there is no portable way to discover it. How would you solve this problem?
(And if you think curses is the way to solve it, look for another
solution.)


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Bash development roadmap

2021-04-30 Thread Léa Gris
I see periodic feature requests for Bash on this list. They most
often lack a background plan or justification beyond quality-of-life
improvements for script writers, so that they can have the same
features as in other languages.


I always thought that Bash DNA was tied to orchestrating actions from 
commands forming a shell around a Unix kernel, with characters streams 
forming the data backbone for Bash to interact with system commands.


The strength of Bash is that it is exactly fit as a shell, and it has
enough POSIX and even Bash versions available in a wide variety of
system environments; so if you write Bash scripts avoiding
cutting-edge features, or limit yourself to features with a decade of
maturity, you can expect mostly flawless compatibility.


Then I have concerns about all the requests for implementing new
features, especially those features that would turn Bash into an
all-purpose programming language, losing ground in its designed role
in Unix systems.


I'd like to see more mid-term or long-term plans to keep Bash relevant
ten years from now, with systems evolving more towards event-driven
operations and processes exchanging more structured data streams such as
JSON or XML. Bash can barely deal with these formats using external
parsers, and then struggles to work with the data because it has no
built-in hierarchical structures for it.


I remember Chet mentioning future modules to deal with various formats, and
that feels like a sound approach to these structured data
formats, but Bash will still struggle to use them with only arrays and
associative arrays.


I also wonder if it is even realistic to get Bash evolving to keep up
with more modern data structures and formats, when other scripting
languages like Python are increasingly occupying the place of shell
scripts. I can see how Perl lost ground: it gained features while
losing relevance.


What is on Bash's roadmap for the next ten or twenty years for it to
remain a relevant tool, or is it going to maintain the status quo as a
fall-back scripting tool you expect to be on every system, even older
and no longer supported ones?


--
Léa Gris




Re: Bash extended glob - Negation

2021-04-30 Thread Chet Ramey

On 4/30/21 2:58 AM, Inian Vasanth wrote:

> Bash Version: 5.1
> Patch Level: 4
> Release Status: release
>
> Description:
>
> A wildcard expression after a negate extglob construct is not returning
> right results.
>
> Steps
>
> Consider the following files and assume, that I want to exclude files
> ending with 8
>
> shopt -s extglob
> touch atop_20210330 atop_20210427 atop_20210428
> ls -1 atop_20210@(3|4)+([0-4])!(8)
> atop_20210330
> atop_20210427
>
> But when I add an extra * at the end of the expression like
>
> ls -1 atop_20210@(3|4)+([0-4])!(8)*
> atop_20210330
> atop_20210427
> atop_20210428
>
> Why is it the case? Shouldn't the glob expression just serve the !(8)
> pattern to exclude and simply do a no-op for the extra * character?


Don't assume the !(8) has to match one character; it can match zero.
If you want to make sure to match filenames ending in something other
than 8, use an operator like [!8] that will match one character.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
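
A minimal sketch of the [!8] suggestion, applied to the three file names from the original report. The temporary directory and the variable names are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: use [!8] so the last character must exist and must not be 8.
shopt -s extglob

tmpdir=$(mktemp -d) || exit 1
touch "$tmpdir"/atop_20210330 "$tmpdir"/atop_20210427 "$tmpdir"/atop_20210428

# [!8] consumes exactly one character, so it cannot match the empty string.
matches=( "$tmpdir"/atop_20210@(3|4)+([0-4])[!8] )
names="${matches[*]##*/}"        # strip the directory part for display

printf '%s\n' "$names"
rm -rf "$tmpdir"
```

Only atop_20210330 and atop_20210427 are listed, because the final character of atop_20210428 has to be matched by [!8] and cannot be.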



Re: Add {{ }}, ${{ }}, return n [x], and $:

2021-04-30 Thread Robert Elz
Date: Fri, 30 Apr 2021 19:28:47 +0800
From: konsolebox
Message-ID:  


  | That was my initial idea too, but I think it's better to just leave {
  | } alone as modifying its behavior still has a chance of breaking
  | scripts.

It does have, though most usage of local is right at the start of
a function, so there would not be many scripts affected.  But there
could be some.

  | It will inherit or access the parent context's arguments just like in
  | { } and ( ), virtually.

So, no specific args (an argumentless function) - that's what I
expected but these things need to be clear.

  | I don't think it would break compatibility.  `return n` will still
  | work.  Is there something I missed?

I meant that return cannot be modified to mean "leave the block" in a { }
block, it always needs to exit the enclosing function, so if that
functionality is needed, a new syntax would be required (I'm not sure it
really is needed though).
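
The current behaviour being described can be seen in a small sketch (`leave_early` is a hypothetical function name):

```shell
#!/usr/bin/env bash
# Sketch only: return inside a { } block leaves the enclosing function.

leave_early() {               # hypothetical function
    {
        echo "inside block"
        return 5              # exits leave_early, not just the { } block
    }
    echo "after block"        # never printed
}

status=0
output=$(leave_early) || status=$?
printf 'output=%s status=%s\n' "$output" "$status"
```

Only "inside block" is captured, and the status seen by the caller is 5.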

  | It's to emulate how $? behaves,

Aside from being init'd to 0 when the shell starts, rather than unset,
$? works just the same as $!, its value remains until something occurs to
alter it.   It is true that far more things set exit status than background 
pid, but that's just a matter of degree, nothing operational.

  | and also so people would write more readable code.

Good luck with that objective.

  | >   | The string can be a constant static for the sake of efficiency.

  | It's an optional enhancement to implementation.

The point was that it isn't an enhancement - the only way to use a static
array of chars is to have a size limit on the value, and that's both ugly
to implement and to use.   And there's no point.

  | This optimization can be done as well as a separate target but $() can
  | never be optimized enough as it almost would always rely on an opened
  | pipe.

No, if there's no fork, there's no pipe either, the pipe simply allows
inter-process communications (simply - it could also be done using shared
memory, but that turns out to be far more complicated, and needs to turn
into a pipe if an external process ends up being run anyway).

  | It would also be difficult to tell if one of the commands
  | within it would send a message to &1 or not,

No, not difficult.   Only code built into the shell (obviously) can
execute in the shell without forking, and the shell knows what is
built in (and in shells like bash where something might be dynamically
loaded running such a command can be treated the same as running an
external command).   Whenever an external command is to be run, a fork
is always required, so it may as well be done early, rather than late.

  | or if one of them would execute `exec`.

Same thing, if anything in what should be a subshell env is to exec
or change the shell's state (in a way that cannot easily be undone)
the shell simply forks.

  | You also would almost always want to enable the
  | subshell to avoid the parent from getting its parameters altered.

Many cases, yes, the point is that not forking works in the simple
cases where it is most desired to not fork for speed $(echo ...) or
more probably $(printf ... ) but in more than just those cases,
extracting info from many shell built-in commands ( nfiles=$(ulimit -Sn) )
can be handled without forking.

Don't misunderstand though, getting this right is not trivial, detecting
when it is safe requires a bunch of code, and handling issues like very large
output streams (which would normally simply fill the pipe and hang a forked
process until read) take care.

It is however possible, and when implemented, simply works in the cases
where it is possible, with all scripts, new and old.

The problem with new invented features is that they tend to only work in
one shell (at least initially) which means people prefer not to use them,
in order to make their scripts more portable, which means other implementors
are under no pressure to copy the feature...   Implemented optimisations
for the standard shell syntax simply work, and improve performance, while
still allowing the script to work anywhere.

kre