[PATCH] confusing/obsolete handling of test -t operator (and doc warnings against using -o/-a)
Hello,

    test -t X

always returns false and doesn't report an error about that invalid
number (beside the point here, but in ksh/zsh, that X is treated as an
arithmetic expression and evaluates to 0 if $X is not set). While:

    test -t X -a Y

returns a "too many arguments" error.

    test X -a -t -a Y

returns false (without error and regardless of whether any fd is a
tty) while

    test X -a Y -a -t

returns true. For other unary operators that gives:

    $ bash -c 'test X -a -x -a Y'
    bash: line 1: test: too many arguments

No big deal, as in all those cases the behaviour is unspecified by
POSIX (non-numeric argument to -t, or more than 4 arguments, -a
deprecated).

It seems to be explained by what looks like a remnant from the time
when [ -t ] was short for [ -t 1 ], in the code of unary_operator() in
test.c:

>  /* the only tricky case is `-t', which may or may not take an argument. */
>  if (op[1] == 't')
>    {
>      advance (0);
>      if (pos < argc)
>        {
>          if (legal_number (argv[pos], &r))
>            {
>              advance (0);
>              return (unary_test (op, argv[pos - 1], 0));
>            }
>          else
>            return (FALSE);

Maybe the intention was to do an isatty(1) here instead of always
returning false, but that and the fact that advance() is not called
only confuses things.

>        }
>      else
>        return (unary_test (op, "1", 0));

That part is never reached AFAICT as unary_operator() is never called
with pos == argc.

I believe that whole code can go, as -t is now always a unary operator,
and it would be more useful to report an error when the operand is not
a number.

I also noticed that the fact that -a/-o were deprecated (by POSIX at
least) and made for unreliable test expressions was not noted in the
manual. So I suggest the patch below:

diff --git a/doc/bashref.texi b/doc/bashref.texi
index 85e729d5..00fbab69 100644
--- a/doc/bashref.texi
+++ b/doc/bashref.texi
@@ -4215,14 +4215,14 @@ Operator precedence is used when there are five or more arguments.
 @item ! @var{expr}
 True if @var{expr} is false.
 
-@item ( @var{expr} )
+@item ( @var{expr} ) (DEPRECATED)
 Returns the value of @var{expr}.
 This may be used to override the normal precedence of operators.
 
-@item @var{expr1} -a @var{expr2}
+@item @var{expr1} -a @var{expr2} (DEPRECATED)
 True if both @var{expr1} and @var{expr2} are true.
 
-@item @var{expr1} -o @var{expr2}
+@item @var{expr1} -o @var{expr2} (DEPRECATED)
 True if either @var{expr1} or @var{expr2} is true.
 @end table
 
@@ -4283,11 +4283,26 @@ Otherwise, the expression is parsed and evaluated according to
 precedence using the rules listed above.
 @end enumerate
 
-@item 5 or more arguments
+@item 5 or more arguments (DEPRECATED)
 The expression is parsed and evaluated according to precedence
 using the rules listed above.
 @end table
 
+In the 4 or 5 arguments case, the use of @samp{(}, @samp{)}, binary
+@samp{-a}, binary @samp{-o} make for unreliable test expressions. For
+instance @code{test "$x" -a ! "$y"} becomes a test for whether a
+@samp{!} file exists if @code{$x} is @samp{(} and @code{$y} is
+@samp{)} and @code{[ -f "$file" -a ! -L "$file" ]} fails with a
+syntax error for a file called @samp{==}. Which explains why those
+are deprecated as they have been in the POSIX specification of the
+@code{test} utility since 2008.
+
+Each invocation of @code{[} / @code{test} should perform a single test
+and several invocations may be chained with the @code{&&} or @code{||}
+shell operators to achieve the same result as the @code{-a} and
+@code{-o} operators reliably as in @code{test "$x" && test ! "$y"} or
+@code{[ -f "$file" ] && [ ! -L "$file" ]} in the examples above.
+
 When used with @code{test} or @samp{[}, the @samp{<} and @samp{>}
 operators sort lexicographically using ASCII ordering.
 
diff --git a/test.c b/test.c
index 2b12197a..e16337a5 100644
--- a/test.c
+++ b/test.c
@@ -476,24 +476,6 @@ unary_operator (void)
   if (test_unop (op) == 0)
     return (FALSE);
 
-  /* the only tricky case is `-t', which may or may not take an argument. */
-  if (op[1] == 't')
-    {
-      advance (0);
-      if (pos < argc)
-        {
-          if (legal_number (argv[pos], &r))
-            {
-              advance (0);
-              return (unary_test (op, argv[pos - 1], 0));
-            }
-          else
-            return (FALSE);
-        }
-      else
-        return (unary_test (op, "1", 0));
-    }
-
   /* All of the unary operators take an argument, so we first call
      unary_advance (), which checks to make sure that there is an
      argument, and then advances pos right past it.  This means that
@@ -603,7 +585,7 @@ unary_test (char *op, char *arg, int flags)
 
     case 't':          /* File fd is a terminal? */
       if (legal_number (arg, &r) == 0)
-        return (FALSE);
+        integer_expected_error (arg);
       return ((r == (int)r) && isatty ((int)r));
 
     case 'n':          /* True if arg has some length. */
Re: [PATCH] confusing/obsolete handling of test -t operator (and doc warnings against using -o/-a)
2023-07-07 15:52:28 -0400, Chet Ramey:
[...]
> Historical versions of test made the argument to -t optional here. I can
> continue to support that in default mode for backwards compatibility, but
> it will be an error in posix mode.
[...]

I think you may have overlooked the bottom part of my email (possibly
because it was hidden by your MUA as it included quoted text) that
included comments on the code and a patch.

bash hasn't supported [ -t ] as an alias for [ -t 1 ] since 2.02 and
possibly earlier AFAICT, since it started supporting the POSIX rules
where [ any-non-empty-single-argument ] returns true, and having [ -t ]
check whether stdout is a terminal is not allowed.

The problem here is that some code to support that hadn't been removed
at the time the POSIX rules were implemented. The patch I suggested
just removes that code.

ksh93 does support [ -t ] when the -t is literal:

$ ksh93 -c '[ -t ]' > /dev/null || echo stdout is not a terminal
stdout is not a terminal
$ ksh93 -c '[ "-t" ]' > /dev/null || echo stdout is not a terminal
stdout is not a terminal
$ var=-t ksh93 -c '[ "$var" ]' > /dev/null && echo '$var is non-empty'
$var is non-empty

But there's no point going there, since that breaks POSIX compliance
for no good reason: [ -t ] as an alias for [ -t 1 ] hasn't been
supported for decades, so scripts that were doing [ -t ] would long
since have been fixed to use [ -t 1 ].

[...]
> > I also noticed that the fact that -a/-o were deprecated (by POSIX at
> > least) and made for unreliable test expressions was not noted in the
> > manual. So I suggest the patch below:
>
> I added some language about this, noting that POSIX has deprecated them
> and recommending scripts not use them. Thanks for the suggestion.
[...]

Note that "(" and ")" are also obsoleted by POSIX, and as a result so
is any usage of test with 5 or more arguments (hence why I flagged them
as (DEPRECATED) in the doc patch I was suggesting).

-- Stephane
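[Editorial note: to make the one-argument rule discussed above concrete, a small demo of the POSIX behaviour; nothing bash-specific, and under these rules [ -t ] is just a non-empty-string test.]

```shell
# With a single argument, test/[ returns true iff that argument is a
# non-empty string; "-t" is not treated as an operator here.
[ -t ] > /dev/null && echo '[ -t ] is true even with stdout redirected'
var=-t
[ "$var" ] && echo 'any non-empty single argument tests true'
[ "" ] || echo 'the empty string tests false'
```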
Re: RFC: changing printf(1) behavior on %b
2023-08-31 10:35:59 -0500, Eric Blake via austin-group-l at The Open Group:
> In today's Austin Group call, we discussed the fact that printf(1) has
> mandated behavior for %b (escape sequence processing similar to XSI
> echo) that will eventually conflict with C2x's desire to introduce %b
> to printf(3) (to produce 0b000... binary literals).
[...]

Is C2x's %b already set in stone?

ksh93's printf (and I'd expect ast's standalone printf) has
%[flags][width][.precision[.base]]d to output a number in an arbitrary
base, which IMO seems like a better approach than introducing a new
specifier for every base.

$ printf '%..2d\n' 63
111111
$ printf '0b%.8.2d\n' 63
0b00111111
$ printf '%#.8.2d\n' 63
2#00111111

The one thing it can't do though is left-space-padding of 0b.

printf %b is used in countless scripts, especially the more
correct/portable ones that use it to work around the portability fiasco
that is echo's escape sequence expansion. I can't imagine it going
away.

Hard to imagine the C folks overlooked it; I'd expect printf %b to be
known by any shell scripter.

-- Stephane
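[Editorial note: bash has nothing equivalent to that ksh93 base-output feature. As an aside, a hypothetical pure-POSIX-sh fallback sketch; the tobase name and implementation are mine, not an existing utility, for bases 2 to 36.]

```shell
tobase() { # usage: tobase BASE NUMBER  (prints NUMBER in base BASE)
  b=$1 n=$2 out=
  digits=0123456789abcdefghijklmnopqrstuvwxyz
  if [ "$n" -eq 0 ]; then echo 0; return; fi
  while [ "$n" -gt 0 ]; do
    d=$((n % b)) rest=$digits i=0
    # pick digit number $d: strip $d characters, keep the next one
    while [ "$i" -lt "$d" ]; do rest=${rest#?}; i=$((i + 1)); done
    out=${rest%"${rest#?}"}$out
    n=$((n / b))
  done
  printf '%s\n' "$out"
}
tobase 2 63    # prints 111111
tobase 16 255  # prints ff
```

(Variables are not localised, as plain sh has no local; in bash one could use ${digits:n%b:1} and local instead.)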
Re: RFC: changing printf(1) behavior on %b
2023-09-01 07:13:36 +0100, Stephane Chazelas via austin-group-l at The Open Group:
> 2023-08-31 10:35:59 -0500, Eric Blake via austin-group-l at The Open Group:
> > In today's Austin Group call, we discussed the fact that printf(1) has
> > mandated behavior for %b (escape sequence processing similar to XSI
> > echo) that will eventually conflict with C2x's desire to introduce %b
> > to printf(3) (to produce 0b000... binary literals).
> [...]
>
> Is C2x's %b already set in stone?
>
> ksh93's printf (and I'd expect ast's standalone printf) has
> %[flags][width][.precision[.base]]d to output a number in an
> arbitrary base, which IMO seems like a better approach than
> introducing a new specifier for every base.
[...]

For completeness, several shells also support expanding integers in
arbitrary bases, like ksh's

typeset -i2 binary=123

already there in ksh85, possibly earlier, and also available in pdksh
and derivatives and zsh.

Originally, with the base number not specified, the output base was
derived from the first assignment, like:

typeset -i var; var='2#111'

would get you a $var that expands in binary. Looks like that was
discontinued in ksh93, but it's still there in mksh or zsh.

And there's also:

$ echo $(( [#2] 16 )) $(( [##2] 16 ))
2#10000 10000

in zsh (note that you don't get 0b10000 upon $(( [#2] 16 )) after
set -o cbases).

If bash added:

printf -v var %..2d 16

à la ksh93, that would bridge that gap.

How to output/expand numbers in bases other than 8, 10, 16 is a
recurring question for bash, with people generally surprised that it
can *input* numbers in any base, but not *output* in any base.

See
https://unix.stackexchange.com/questions/415077/how-to-add-two-hexadecimal-numbers-in-a-bash-script/415107#415107
https://unix.stackexchange.com/questions/616215/bash-arithmetic-outputs-result-in-decimal
https://unix.stackexchange.com/questions/749988/arbitrary-base-conversion-from-base-10-using-only-builtins-in-bash
to list only a few.

-- Stephane
Re: RFC: changing printf(1) behavior on %b
2023-09-01 09:44:08 +0300, Oğuz via austin-group-l at The Open Group:
> On Fri, Sep 1, 2023 at 7:41 AM Phi Debian wrote:
> > My vote is for posix_printf %B mapping to libc_printf %b
>
> In the shell we already have bc for base conversion. Does POSIX really
> have to support C2x %b in the first place?

Yes, though note:

- that implies forking a process and loading an external executable and
  its libraries
- bc is not always available. It's not installed by default on Debian
  for instance.
- for bases over 16, it uses some unusual representation that can't be
  used anywhere.

A summary of some options for some common POSIX-like shells at
https://unix.stackexchange.com/questions/191205/bash-base-conversion-from-decimal-to-hex/191209#191209

-- Stephane
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-08-31 15:02:22 -0500, Eric Blake via austin-group-l at The Open Group:
[...]
> The current POSIX says that %b was added so that on a non-XSI
> system, you could do:
>
> my_echo() {
>   printf %b\\n "$*"
> }

That is dependent on the current value of $IFS. You'd need:

xsi_echo() (
  IFS=' '
  printf '%b\n' "$*"
)

Or the other alternatives listed at
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo/65819#65819

[...]
> Bash already has shopt -s xpg_echo

Note that in bash, you need both

shopt -s xpg_echo
set -o posix

to get a XSI echo. Without the latter, options are still recognised.
You can get a XSI echo without those options with:

xsi_echo() {
  local IFS=' ' -
  set +o posix
  echo -e "$*\n\c"
}

The addition of those \n\c (noop) avoids arguments being treated as
options if they start with -.

[...]
> The Austin Group also felt that standardizing bash's behavior of %q/%Q
> for outputting quoted text, while too late for Issue 8, has a good
> chance of success, even though C says %q is reserved for
> standardization by C. Our reasoning there is that lots of libc over
> the years have used %qi as a synonym for %lli, and C would be foolish
> to burn %q for anything that does not match those semantics at the C
> language level; which means it will likely never be claimed by C and
> thus free for use by shell in the way that bash has already done.
[...]

Note that %q is from ksh93, not bash, is not portable across
implementations, and with most, including bash's, gives an output that
is not safe for reinput in arbitrary locales (as it uses $'...' in some
cases). I'm not sure it's a good idea to add it to the standard, or at
least it should come with fat warnings about the risk in using it.

See also:
https://unix.stackexchange.com/questions/379181/escape-a-variable-for-use-as-content-of-another-script/600214#600214

-- Stephane
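[Editorial note: as an illustration of why that xsi_echo sets IFS, "$*" joins the positional parameters with the first character of $IFS; the join_demo name is mine.]

```shell
join_demo() ( # subshell body, so the IFS change cannot leak out
  IFS=-
  printf '%s\n' "$*"
)
join_demo a b c   # prints a-b-c, not "a b c"
```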
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-09-01 07:15:14 -0500, Eric Blake:
[...]
> > Note that in bash, you need both
> >
> > shopt -s xpg_echo
> > set -o posix
> >
> > To get a XSI echo. Without the latter, options are still
> > recognised. You can get a XSI echo without those options with:
> >
> > xsi_echo() {
> >   local IFS=' ' -
> >   set +o posix
> >   echo -e "$*\n\c"
> > }
> >
> > The addition of those \n\c (noop) avoids arguments being treated as
> > options if they start with -.
>
> As an extension, Bash (and Coreutils) happen to honor \c always, and
> not just for %b. But POSIX only requires \c handling for %b.
>
> And while Issue 8 has taken steps to allow implementations to support
> 'echo -e', it is still not standardized behavior; so your xsi_echo()
> is bash-specific (which is not necessarily a problem, as long as you
> are aware it is not portable).
[...]

Yes, none of local (from ash I believe), the posix option (several
shells have an option called posix, all used to improve POSIX
conformance; bash may have been the first), nor -e (from Research Unix
v8) is standard; that part was about bash specifically (as the thread
is also posted on gnu.bash.bug).

BTW, that xsi_echo is not strictly equivalent to a XSI echo in the case
where the last character of the last argument is an unescaped
backslash, or a character whose encoding ends in the same byte as the
encoding of backslash.

-- Stephane
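[Editorial note: the \n\c pair discussed above is a no-op for the visible output because \c stops everything that follows, including the newline echo would normally add; a quick check with bash's echo -e.]

```shell
# \c suppresses the rest of the output and the trailing newline;
# the \n just before it puts that newline back.
bash -c 'echo -e "foo\n\c"; echo END'
# prints "foo" then "END", each on its own line
```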
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-09-01 07:54:02 -0500, Eric Blake via austin-group-l at The Open Group:
[...]
> > Well in all case %b can not change semantic in the bash script, since it is
> > there for so long, even if it depart from python, perl, libc, it is
> > unfortunate but that's the way it is, nobody want a semantic change, and on
> > next routers update, see the all internet falling appart :-)
>
> How many scripts in the wild actually use %b, though? And if there
> are such scripts, anything we can do to make it easy to do a drop-in
> replacement that still preserves the old behavior (such as changing %b
> to %#s) is going to be easier to audit than the only other
> currently-portable alternative of actually analyzing the string to see
> if it uses any octal or \c escapes that have to be re-written to
> portably function as a printf format argument.
[...]

FWIW, a "printf %b" github shell code search returns ~ 29k entries
(https://github.com/search?q=printf+%25b+language%3AShell&type=code&l=Shell)

That likely returns only a small subset of the code that uses printf
with %b inside the format, and probably a few false positives, but it
gives many examples of how printf %b is used in practice.

printf %b is also what all serious literature about shell scripting has
been recommending in place of the unportable echo -e (or XSI echo, or
print without -r). That includes the POSIX standard, which has been
recommending printf instead of the non-portable echo for 30 years. So
that change will also invalidate all those.

It will take a while before %#s is supported widely enough that %b can
be safely replaced with %#s.

-- Stephane
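[Editorial note: for reference, the %b vs %s difference those scripts rely on; this is standard printf(1) behaviour.]

```shell
printf '%s\n' 'a\tb'   # %s is literal: prints a\tb
printf '%b\n' 'a\tb'   # %b expands the escape: prints "a", a TAB, "b"
```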
Re: bug#65659: RFC: changing printf(1) behavior on %b
2023-09-01 23:28:50 +0200, Steffen Nurpmeso via austin-group-l at The Open Group:
[...]
> |FWIW, a "printf %b" github shell code search returns ~ 29k
> |entries
> |(https://github.com/search?q=printf+%25b+language%3AShell&type=code&l=Shell)
> |
> |That likely returns only a small subset of the code that uses
> |printf with %b inside the format and probably a few false
> |positives, but that gives many examples of how printf %b is used
> |in practice.
>
> Actually this returns a huge amount of false positives where
> printf(1) and %b are not on the same line, let alone the same
> command, if you just scroll down a bit it starts like neovim match
[...]

You're right, I only looked at the first few results and saw that
already gave interesting ones.

Apparently, we can also search with regexps, and searching for
printf.*%b
(https://github.com/search?q=%2Fprintf.*%25b%2F+language%3AShell&type=code)
is probably a lot more accurate. It returns ~ 19k. (Still FWIW; that's
still just a sample of random code on the internet.)

[...]
> Furthermore it shows a huge amount of false use cases like
>
> printf >&2 "%b\n" "The following warnings and non-fatal errors were
> encountered during the installation process:"
[...]

Yes, I also see a lot of echo -e stuff that should have been echo -E
stuff (or echo alone in those (many) implementations that don't expand
by default), or use of the more reliable printf with %s (not %b).

> It seems people think you need this to get colours mostly, which
> then, it has to be said, is also practically mislead. (To the
> best of *my* knowledge that is.)
[...]

Incidentally, ANSI terminal colour escape sequences are somewhat
connecting those two %b's, as they are RGB (well, BGR) in binary (white
is 7 = 0b111, red 0b001, green 0b010, blue 0b100), with:

R=0 G=1 B=1
printf '%bcyan%b\n' "\033[3$(( 2#$B$G$R ))m" '\033[m'

(with Korn-like shells; also $(( 0b$B$G$R )) in zsh, though zsh has
builtin colour output support including RGB-based).
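[Editorial note: the arithmetic in that snippet assembles the SGR colour number from the three bits; spelled out below, using the bash/ksh 2# base prefix.]

```shell
# cyan = green + blue: bits B=1 G=1 R=0, read as the binary number 110
bash -c 'R=0 G=1 B=1; echo "$(( 2#$B$G$R ))"'   # prints 6, the ANSI number for cyan
```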
Speaking of stackexchange, on the June data dump of
unix.stackexchange.com:

stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf.*%b'
494

(FWIW)

Compared with %d (though that will have entries for printf(3) as well):

stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf.*%d'
3444

-- Stephane
Re: cd with multiple arguments?
2010-12-13, 12:04(+00), Marc Herbert:
[...]
>> True, bash does not come with a vast library of ready-to-use functions.
>
> Neither is there any such "reference" library available externally. Or
> is there?

Yes, they're called "commands" or "programs" or "software". People tend
to forget that, before all, a shell is a command line interpreter.

If you find yourself writing complex shell functions that don't make
use of external commands, then you're heading the wrong direction with
your shell, or you are heading in one direction with the wrong medium
(i.e. you need a programming language, not a shell).

-- Stephane
Re: backslash required for [\/]
2010-12-25, 20:30(+08), jida...@jidanni.org:
[...]
> While we are here, here's another interesting phenomenon:
> $ touch aa
> $ echo a[ab]
> aa
> $ echo a[ab/]
> a[ab/]
> $ echo a[ab\/]
> a[ab/]
> $ mkdir ab #doesn't help...

That's the same as in every other shell and as specified by POSIX. So
any other behavior would be a bug.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_03

-- Stephane
Re: read builtin does not return success code (0) in spite of it successfully read line
2010-12-24, 13:20(+05), Коренберг Марк:
[...]
> mma...@mmarkk-work:/tmp$ echo aaa > /tmp/test1
> mma...@mmarkk-work:/tmp$ read x < /tmp/test1 && echo success
> success
> mma...@mmarkk-work:/tmp$ echo $x
> aaa
>
> BUGGY CASE:
> mma...@mmarkk-work:/tmp$ echo -n bbb > /tmp/test2
> mma...@mmarkk-work:/tmp$ read y < /tmp/test2 && echo success
> mma...@mmarkk-work:/tmp$ echo $y
> bbb
>
> So, if EOF detected but something is read successfully, please do not
> report error!
[...]

Bash's behavior is the same as every other shell's, is as documented
and as specified by POSIX: if read can't read a full line (because of a
read error or EOF found before the line terminator), it should return a
non-zero exit status and fill the variable with the characters read
thus far.

-- Stephane
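[Editorial note: the reported behaviour can be reproduced non-interactively; scratch file via mktemp, nothing bash-specific.]

```shell
tmp=$(mktemp)
printf 'bbb' > "$tmp"      # no trailing newline, as in the report
if read y < "$tmp"; then status=0; else status=1; fi
echo "status=$status y=$y" # prints: status=1 y=bbb
rm -f "$tmp"
```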
Re: read not terminated with EOF/^D when -n or -N is used
2010-12-26, 18:19(-05), Isaac Good:
[...]
> read <^D> -> the EOF terminates read and it exits with a false
> exit status.
> read -n4 <^D> -> the ^D is read by read and stored in REPLY. read
> does not terminate.

It's not bash that handles the ^D, it's the terminal. When in canonical
mode, upon pressing ^D, the terminal emits "EOF" (a read(2) returns).

Though it's not documented, when using "-n", read puts the terminal in
raw mode. In that mode, the ^D is not special (it is sent as is, so
read by bash's read).

In zsh (where -n is -k), that behavior (putting the terminal in raw
mode) can be disabled by adding a "-u" flag, as in -u0, to specify the
fd to read from. It doesn't look like bash has a similar feature, so
you'd have to do:

read -n4 < <(cat)

for instance, so that read sees its stdin as a pipe and not a terminal.
Note that when in canonical mode, the terminal only sends characters to
the application (here cat) upon pressing Return (or the EOF character),
so read won't return as soon as you enter 4 characters.

-- Stephane
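[Editorial note: that read -n counts characters rather than lines can be seen with a non-terminal stdin, where no raw-mode question arises; -n is bash-specific.]

```shell
# read -n4 returns as soon as 4 characters are available, even though
# no newline was ever seen on this pipe:
echo abcdefgh | bash -c 'read -n4 v; printf "%s\n" "$v"'   # prints abcd
```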
Re: $? in the right side of a pipe
2010-12-27, 13:14(+01), :
> Hello
>
> More information on that topic:
>
> false ; false | echo $?    still prints 0.
> false ; (false) | echo $?  prints 1
>
> So.. ? $? in the right side of a pipe is randomly the exit
> status of the left side depending on the way you write it?
> Doesn't sound sane.

That's not the exit status of the left side. It couldn't be, as both
sides of the pipe are started at the same time and run concurrently.

> Doesn't that break POSIX anyway?

I suppose it does. I can't see any reason why the above shouldn't print
1 in any case.

> I think it should be fixed.

Me too. Note that $PIPESTATUS doesn't seem to be affected by that bug.

$ bash -c '(exit 2) | (exit 3) ; false | echo $?, "${PIPESTATUS[@]}"'
0, 2 3

-- Stephane
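[Editorial note: spelling out the PIPESTATUS point; PIPESTATUS is a bash-specific array.]

```shell
# $? gives the rightmost element's status; PIPESTATUS has them all:
bash -c '(exit 2) | (exit 3); echo "last=$? pipe=${PIPESTATUS[@]}"'
# prints: last=3 pipe=2 3
```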
Re: read builtin does not return success code (0) in spite of it successfully read line
2010-12-27, 09:43(-07), Eric Blake:
[...]
> On 12/26/2010 01:29 PM, Stephane CHAZELAS wrote:
>> Bash behavior is the same as every other shell, is as documented
>> and as specified by POSIX.
>
> POSIX requires that the input to read be a text file. Since you aren't
> passing a text file, the behavior is undefined. POSIX does NOT require
> bash to return failure in this case, but neither does it require bash to
> return success. You should not rely on the behavior of read when the
> input does not end in a newline.
[...]

From http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html

SUSv4> EXIT STATUS
SUSv4>
SUSv4> The following exit values shall be returned:
SUSv4>
SUSv4>  0
SUSv4>     Successful completion.
SUSv4>  >0
SUSv4>     End-of-file was detected or an error occurred.
SUSv4>

So I think you can expect a non-zero status here.

-- Stephane
Re: read builtin does not return success code (0) in spite of it successfully read line
2010-12-27, 11:53(-07), Eric Blake:
[...]
>> So I think you can expect a non-zero status here.
>
> All you can expect is that you have undefined results. Undefined
> results include zero status.
[...]

Yes, sorry, you're right. Bad reasoning of mine. The >0 on EOF is
obviously on end of text file, that is, when there's nothing left to be
read after the last newline character. On a terminal, you'd express
that EOF by entering the eof character on an empty buffer.

Having said that, all the implementations return a non-zero exit status
on an unfinished line, and it also sounds to me like the best option
(as it allows to differentiate a terminated line from an unfinished one
and can also be seen as an error condition), so it would be unwise IMO
for bash to change its behavior to return 0 in this case.

-- Stephane
Re: Bug in shell: buffer overflow.
2010-12-31, 11:33(-07), Bob Proulx:
[...]
> Your expressions above are overflowing the value of your system's
> maximum integer size. You can read the system's maximum integer size
> using getconf.
>
> $ getconf INT_MAX
> 2147483647
[...]

POSIX requires that arithmetic expansion use at least signed longs, so
getconf INT_MAX wouldn't necessarily be correct.

-- Stephane
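[Editorial note: a quick check of that point; the shown result assumes a 64-bit build, where bash's arithmetic is wider than INT_MAX.]

```shell
# No wraparound at INT_MAX on a 64-bit build:
bash -c 'echo $(( 2147483647 + 1 ))'   # prints 2147483648
```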
Re: Inconsitent treatment of * (globbing)
2011-01-31, 11:44(-05), Greg Wooledge:
> On Sat, Jan 29, 2011 at 12:58:56PM +0100, Pascal Wittmann wrote:
>> But if I use the expression ".*" in bash, I would expect from the
>> behaviour of "*" that ".*" doesn't include the file "..". But the
>> fact is that it does in bash.
>
> This is done for compatibility with all the other shells.
[...]

All the other shells... but zsh and pdksh-based ones (mksh, posh...);
and shells like ksh93 can be globally modified by setting the FIGNORE
environment variable (a bug and potential security vulnerability if you
ask me).

I agree with the OP that .* expanding to "." and ".." is not useful,
but I suspect it is required by POSIX.

-- Stephane
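[Editorial note: the behaviour in question, observed in a scratch directory; LC_ALL=C pins the glob sort order.]

```shell
dir=$(mktemp -d)
( cd "$dir" && touch .hidden && LC_ALL=C bash -c 'echo .*' )
# prints: . .. .hidden
rm -rf "$dir"
```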
Re: Document what's allowed in function names
2011-04-11, 09:51(-04), Chet Ramey:
>> Machine Type: i686-pc-linux-gnu
>>
>> Bash Version: 4.2
>> Patch Level: 8
>> Release Status: release
>>
>> Description:
>> man bash is currently lacking information on what is allowed for
>> function names. It implies name with name () compound-command
>> [redirection] and at the start of the manual there is:
>>
>> name  A word consisting only of alphanumeric characters and
>> underscores, and beginning with an alphabetic character or an
>> underscore. Also referred to as an identifier.
>>
>> In reality the rules for function names are much more loose. For
>> example hyphen and forward slash are allowed. But there are still
>> some restrictions:
>> bash: `aa\'foo': not a valid identifier
>
> It was a mistake to allow such characters in function names (`unset'
> doesn't work to unset them without forcing -f, for instance). We're
> stuck with them for backwards compatibility, but I don't have to
> encourage their use.
[...]

Why would you put any restriction on the allowed name of a function? At
calling time, it's the same namespace as an argv[0], so it could be any
string.

In zsh, any string is allowed, including the empty string (and even the
NUL character, as that is allowed in arguments to functions or built-in
commands).

$ ''() echo foo
$ ""
foo
$ /bin/ls() echo bar
$ /bin/ls
bar

-- Stephane
Re: Document what's allowed in function names
2011-05-24 17:23:20 -0400, Chet Ramey:
[...]
> > Why would you put any restriction on the allowed name of a function?
[...]
> Because Posix does, and because unset without -f has to enforce the
> variable name restrictions. (Though the language has been relaxed in
> the latest standard.)
[...]

POSIX puts restrictions on the names of functions used by applications,
but I can't see anywhere that it restricts what the shell allows.

'foo bar'() {
  baz
}

being not valid POSIX syntax, I can't imagine POSIX forcing a shell to
do this (define a "foo bar" function) or that (return an error message)
upon reading it. I may be wrong though; is there any specific section
of the standard you had in mind?

Note that all of pdksh, mksh, zsh (at least) allow

foo+bar() { w; }

for instance; ksh93 allows

foo.get() { w; }

I also find it unfortunate that bash doesn't allow

foo() bar

given that every other Bourne-like shell (including the Bourne shell)
supports it (again, it's not POSIX syntax, so bash doesn't have to
implement it).

-- Stephane
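[Editorial note: as the original report mentioned, bash itself accepts some non-identifier characters in function names in default (non-posix) mode; for instance a hyphen.]

```shell
# A hyphen is not valid in a POSIX name, yet bash accepts it here:
bash -c 'foo-bar() { echo hyphen-ok; }; foo-bar'   # prints hyphen-ok
```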
usenet to email gateway
Hiya,

Not sure who to report that to, but it looks like the usenet to
mailing-list gateway, at least for gnu.bash.bug, is not working
properly. That is, messages posted to the newsgroup are not sent to the
corresponding mailing list.

Compare for instance
http://thread.gmane.org/gmane.comp.shells.bash.bugs/16950 (mailing
list) with
http://groups.google.com/group/gnu.bash.bug/browse_thread/thread/6358da68e2c8d4e3
(usenet).

All my messages (like
http://groups.google.com/group/gnu.bash.bug/msg/a458eb453deb9fce?dmode=source)
are missing on gmane.

Best regards,
Stephane
Re: usenet to email gateway
2011-08-10 23:05:26 +0000, Karl Berry:
> not sure who to report that to
>
> I've asked the FSF sysadmins. I can see from the mailing list
> configuration that the gateway is intended to be operational, but don't
> know how to debug what is happening from there.
[...]

Thanks Karl.

My last successful post to gnu.bash.bug was on 2011-04-12 15:30:20 GMT
http://thread.gmane.org/gmane.comp.shells.bash.bugs/16603
and Chet's answer suggests it made its way to the mailing list only
months later.

I'll repost my messages to the mailing list and try to cancel them on
usenet. In the meantime, it might be worth pointing out that posting to
the gnu.bash.bug newsgroup doesn't work (well, it works, but only
usenet users see the messages, not those subscribed to the mailing list
only).

-- Stephane
Re: Bash source repository
2011-05-30, 06:34(+00), Michael Witten:
[...]
> Ben, it is generally a good idea to maintain the `Cc' list unless
> explicitly asked.

Beware that the bash mailing list also has a usenet interface
(gnu.bash.bug) which has no notion of recipients, let alone Cc.

-- Stephane
Re: bug: return doesn't accept negative numbers
2011-08-08, 13:55(-07), Linda Walsh:
[...]
> and both 'exit' and 'return' should return error "ERANGE" if "--posix"
> is set, and -1 is given. Invalid option doesn't make as much sense in
> this situation; if it was -k or -m, sure... but in this case, it's a
> fact that --posix artificially limits exit values apart from what is
> allowed in most prog langs (which accept negative, but still return
> results &0xff), so for Posix, it's a matter of disallowing a 'normal
> range', vs. it being an invalid option
[...]

POSIX doesn't prevent a shell from accepting -1 (or from doing anything
like ejecting a cd, outputting an error or turning red upon "return
-1"). It just says an *application* should not use "return -1", that
is, if one wants to write a portable script, she shouldn't use
"return -1".

Many POSIX shells accept "return -1":

$ ksh93 -c 'f() return -1; f; echo $?'
255
$ pdksh -c 'f() return -1; f; echo $?'
-1
$ zsh -c 'f() return -1; f; echo $?'
-1
$ posh -c 'f() return -1; f; echo $?'
return: invalid option -- '1'
1
$ posh -c 'f() return -- -1; f; echo $?'
-1
$ mksh -c 'f() return -1; f; echo $?'
mksh: return: -1: unknown option
1
$ mksh -c 'f() return -- -1; f; echo $?'
-1

But as you can see the result varies, so one shouldn't use "return -1"
if one wants to be portable across POSIX shells.

Also note:

$ zsh -c 'f() return -1; f; echo $?'
-1
$ zsh -c 'f() return -1; (f); echo $?'
255

That is, even in shells that support arbitrary numbers for return, as
soon as they are cast to exit statuses, they are &255ed.

-- Stephane
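[Editorial note: the &255 casting applies to exit statuses generally, not just return; with the installed bash (the -1 line assumes a bash recent enough to accept a negative operand to exit).]

```shell
bash -c 'exit 257'; echo "$?"   # prints 1 (257 & 0xff)
bash -c 'exit -1';  echo "$?"   # 255 with bash versions that accept -1
```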
Re: feature request: option to start script not from the very beginning of file
2011-08-02, 17:05(+04), Dmitry Bolshakov:
[...]
> perl has "-x" switch which makes it skip leading file contents until
> the
> #!/bin/perl
> line
>
> imho it would be good to have the same feature in bash

What about:

#! /bin/bash -
:||:<<\#__END__

whatever you like here

#__END__

your script starts here

-- Stephane
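[Editorial note: a self-contained check of the trick above, with the script written to a mktemp file. The first ":" succeeds, so the second ":" with the here-document is never run, yet the parser still consumes everything up to the #__END__ line.]

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#! /bin/bash -
:||:<<\#__END__

anything at all here is skipped

#__END__
echo script-ran
EOF
bash "$tmp"   # prints script-ran
rm -f "$tmp"
```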
Re: equivalent of Linux readlink -f in pure bash?
2011-08-9, 11:44(+10), Jon Seymour:
> Has anyone ever come across an equivalent to Linux's readlink -f that
> is implemented purely in bash?
>
> (I need readlink's function on AIX where it doesn't seem to be available).
[...]

What about:

readlink_f() (
  link=$1 max_iterations=40
  while [ "$max_iterations" -gt 0 ]; do
    max_iterations=$(($max_iterations - 1))
    dir=$(dirname -- "$link") || exit
    base=$(basename -- "$link") || exit
    dir=$(cd -P -- "$dir" && pwd -P) || exit
    link=${dir%/}/$base
    if [ ! -L "$link" ]; then
      printf '%s\n' "$link"
      exit
    fi
    link=$(ls -ld -- "$link") || exit
    link=${link#* -> }
  done
  printf >&2 'Loop detected\n'
  exit 1
)

-- Stephane
Re: [OT] Re: feature request: option to start script not from the very beginning of file
2011-08-02, 23:41(+02), mhenn:
> Am 02.08.2011 15:55, schrieb Stephane CHAZELAS:
[...]
>> What about:
>>
>> #! /bin/bash -
>> :||:<<\#__END__
>>
>> whatever you like here
>>
>> #__END__
[...]
> Why did you use :||:<< ...
> and not just :<<... ?
>
> when testing it, it doesn't make any difference:
[...]

Without ":||", it does create a temporary file, fill it, and set it as
standard input to the ":" command. With ":||:", that does not happen,
because the second ":" is not run since the first one succeeded.

$ strace bash -c $':<<\\E\nE' |& grep tmp
stat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=12288, ...}) = 0
stat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=12288, ...}) = 0
access("/tmp", W_OK) = 0
statfs("/tmp", {...}) = 0
open("/tmp/sh-thd-1312368098", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC, 0600) = 3
open("/tmp/sh-thd-1312368098", O_RDONLY) = 4
unlink("/tmp/sh-thd-1312368098") = 0
$ strace bash -c $':||:<<\\E\nE' |& grep tmp
$

-- Stephane
Re: equivalent of Linux readlink -f in pure bash?
2011-08-9, 09:24(+00), Stephane CHAZELAS:
> 2011-08-9, 11:44(+10), Jon Seymour:
>> Has anyone ever come across an equivalent to Linux's readlink -f that
>> is implemented purely in bash?
>>
>> (I need readlink's function on AIX where it doesn't seem to be available).
> [...]
>
> What about:
>
> readlink_f() (
>   link=$1 max_iterations=40
>   while [ "$max_iterations" -gt 0 ]; do
>     max_iterations=$(($max_iterations - 1))
>     dir=$(dirname -- "$link") || exit
>     base=$(basename -- "$link") || exit
>     dir=$(cd -P -- "$dir" && pwd -P) || exit
>     link=${dir%/}/$base
>     if [ ! -L "$link" ]; then
>       printf '%s\n' "$link"
>       exit
>     fi
>     link=$(ls -ld -- "$link") || exit
>     link=${link#* -> }
>   done
>   printf >&2 'Loop detected\n'
>   exit 1
> )

Sorry, it's wrong if there are relative paths in symlinks (or
trailing newlines).

fixed_cmd_subst() {
  eval '
    '"$1"'=$('"$2"'; ret=$?; echo .; exit "$ret")
    set -- "$1" "$?"
    '"$1"'=${'"$1"'%??}
  '
  return "$2"
}

readlink_f() (
  link=$1 max_iterations=40
  while [ "$max_iterations" -gt 0 ]; do
    max_iterations=$(($max_iterations - 1))
    fixed_cmd_subst dir 'dirname -- "$link"' || exit
    fixed_cmd_subst base 'basename -- "$link"' || exit
    cd -P -- "$dir" || exit
    link=${PWD%/}/$base
    if [ ! -L "$link" ]; then
      printf '%s\n' "$link"
      exit
    fi
    fixed_cmd_subst link 'ls -ld -- "$link"' || exit
    link=${link#* -> }
  done
  printf >&2 'Loop detected\n'
  exit 1
)

--
Stephane
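The core of fixed_cmd_subst above is the "append a dot" trick: $(...)
strips ALL trailing newlines, so a sentinel character is appended and
then removed together with the utility's own terminating newline. A
minimal standalone sketch (the helper name below is illustrative, not
from the thread):

```shell
# Sketch of the dot trick used by fixed_cmd_subst: preserve trailing
# newlines that a plain command substitution would discard.
subst_keep_newlines() {
  # $1: command to run; result is left in $REPLY
  REPLY=$(eval "$1"; echo .)   # echo's own newline is stripped; "." survives
  REPLY=${REPLY%??}            # drop the utility's final newline and the "."
}

# A utility printing a name that itself ends in a newline outputs
# "dir\n\n"; plain $(...) collapses that to "dir", the trick keeps "dir\n".
subst_keep_newlines 'printf "%s\n\n" dir'
```

Note the eval: the command text runs in the function's context, so it can
reference the caller's variables, as in the original post.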
Re: equivalent of Linux readlink -f in pure bash?
2011-08-09, 11:29(+02), Bernd Eggink:
> On 09.08.2011 03:44, Jon Seymour wrote:
>> Has anyone ever come across an equivalent to Linux's readlink -f that
>> is implemented purely in bash?
>
> You can find my version here:
>
> http://sudrala.de/en_d/shell-getlink.html
>
> As it contains some corrections from Greg Wooledge, it should handle
> even pathological situations. ;)
[...]

> function getlink # ([-l] path)

Why use the ksh syntax instead of the standard one?

> {
>     # Path of the file a symbolic link is pointing to.
>     # -l: follow link chain, print last target
>     # no option: print 1st target unchanged
>
>     typeset dir file last link opt oldPWD=$PWD ret=0
>
>     (( OPTIND = 1 ))
>
>     while getopts "l" opt
>     do
>         case $opt in
>             (l) last=1
>                 ;;
>         esac
>     done
>
>     shift $(( OPTIND - 1 ))
>     file=$1
>
>     if [[ $last ]]    # last link
>     then
>         while true
>         do
>             dir=$(dirname "$file")

What if $file starts with "-"?

>             [[ ! -d $dir ]] &&
>             {
>                 ret=1
>                 break
>             }
>
>             # remove slashes at end
>
>             while [[ $file == */ ]]
>             do
>                 file=${file%/}
>             done

What if $file is "/"?

>             file=${file##*/}    # file name
>             command cd -P "$dir"

What if that command fails?

>             [[ ! -h $file ]] && break

What if there are symlinks in the path components?

>             link=$(command ls -l -- "$file"; printf x)

Good point about command substitution discarding trailing
newlines. I forgot that in the solution I gave.

>             link=${link%$'\nx'}
>             remove="$file -> "
>             file=${link#*"$remove"}
>         done
>
>         printf "%s\n" "$PWD/$file"
>         command cd $oldPWD

What if $oldPWD has blanks or wildcards?

>     elif [[ ! -h $file ]]    # 1st link
>     then
>         printf "%s\n" "$file"
>     else
>         link=$(ls -l "$file")
>         printf "%s\n" "${link##*-> }"

What about trailing newlines here, and links with "-> " in their
path?

>     fi
>
>     return $ret

What if $IFS contains 0 or 1?

> }
>
> declare -fc getlink

What's the -c about?

--
Stephane
Re: equivalent of Linux readlink -f in pure bash?
2011-08-10, 12:00(+02), Bernd Eggink:
[...]
> function f
> {
>     local OPTIND=1
>
>     echo "\$1=$1"
> }
>
> while getopts "abcdefg" opt
> do
>     echo "opt=$opt"
>     f $opt
> done
>
> Calling the script like this works fine:
>     script -a -b -c
>
> But calling it like this leads to an endless loop:
>     script -abc
[...]
> However, this also loops endlessly. The reason is most likely that bash
> maintains an additional internal variable holding the index of the
> current character, relative to the current word. While this variable is
> not directly accessible by the user, it is set to 0 whenever OPTIND is
> assigned a value.
[...]

That would be a bug in bash in my opinion. If OPTIND is marked
local to the function, it shouldn't affect the behavior of
parent contexts.

Note that that bug is also in ksh93, pdksh, mksh and posh
(though slightly different in that one), but not in ash nor zsh.

Note that if you set OPTIND local, you probably want to do the
same for OPTARG (and maybe OPTERR).

--
Stephane
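The pattern under discussion can be sketched as follows; the function
and option names are illustrative only. Note the thread's caveat: in
bash (and ksh93, pdksh, mksh, posh), assigning OPTIND also resets the
internal per-word character index, so this can still disturb a caller
that is itself in the middle of parsing clustered options like -abc.

```shell
# Sketch: give a function its own getopts state so repeated calls
# (or calls from other functions) start parsing from scratch.
parse() {
  local OPTIND=1 OPTARG opt verbose=0
  while getopts "v" opt; do
    case $opt in
      (v) verbose=1 ;;
    esac
  done
  shift "$((OPTIND - 1))"
  printf '%s %s\n' "$verbose" "${1-}"
}
```

Without the `local OPTIND=1`, a second call to parse would resume from
the previous call's leftover OPTIND and silently skip options.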
Re: equivalent of Linux readlink -f in pure bash?
2011-08-09, 09:50(-04), Steven W. Orr:
[...]
> *) To remove the trailing slashes, instead of
>
>     while [[ $file == */ ]]
>     do
>         file=${file%/}
>     done
>
>     file=${file##*/}    # file name
>
> just say
>     file="${file%${file##*[!/]}}"
[...]

Rather:

file=${file%"${file##*[!/]}"}

(with the inner expansion quoted so its result is not taken as a
pattern). Same problem with "/" being changed to "" though.

--
Stephane
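The single-expansion approach above, together with the "/" edge case
the reply points out, can be sketched like this (the wrapper function
name is illustrative):

```shell
# Sketch: strip all trailing slashes in one parameter expansion.
# ${file##*[!/]} expands to the longest run of trailing slashes
# (everything after the last non-slash character); quoting it stops
# that result from being treated as a pattern when it is removed.
strip_trailing_slashes() {
  file=$1
  file=${file%"${file##*[!/]}"}
  printf '%s\n' "$file"
}
```

As noted in the thread, a path consisting only of slashes contains no
non-slash character, so the inner expansion yields the whole string and
the result collapses to the empty string.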
Re: How to match regex in bash? (any character)
2011-09-29, 13:52(-08), Roger:
[...]
> Since you're saying the regex description is found within either regex(3) or
> regex(7), couldn't there be a brief note within the Bash Manual Page be
> something to the effect:
[...]

No, it's not. I suppose bash could say:

  See your system's regex(3) implementation documentation for the
  description of the extended regular expression syntax on your
  system. That syntax should be compatible with one version or
  the other of the POSIX Extended Regular Expression syntax,
  whose specification (for the latest version as of this writing)
  can be found at:
  http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04

regex(3) points to the API (regex.h); how the system documents
the regexps covered by that API is beyond bash's knowledge.

--
Stephane
gnu.bash.bug usenet interface not working again
It worked for a while after I reported the issue and then it stopped again. That's very frustrating. -- Stephane
Re: initialisation bash variables
2011-08-15, 17:15(+02), Francky Leyn:
> Hello,
>
> if you have a variable, say VAR,
> and you don't assign it a value,
> and afterwards you test it,
> what is the value of $VAR then?
>
> random, or an empty string?
[...]

Upon startup, the shell makes one shell variable per environment
variable whose name is compatible with shell variable names. So
for instance, if bash receives "VAR=foo" in its environment,
$VAR will expand to foo. If it's passed "1=bar", $1 will not be
affected, and it's the same for a few special variables of the
shell. If passed "A+B=C" or "=D" for instance, those obviously
won't be mapped to shell variables.

Some shells do discard variables from the environment that can't
be mapped to shell variables. That's not the case of bash.

--
Stephane
Re: initialisation bash variables
2011-08-16, 22:24(+02), Francky Leyn:
[...]
> VAR=FALSE
> # some command line processing, that can set VAR to "TRUE"
> if [ $VAR = TRUE ]; then
>   ...
> fi
>
> Must I effectively write that VAR=FALSE?
> Or will the script work fine without?

Yes, you must write it, because bash may inherit a VAR variable
from the environment like I said (especially when you consider
that all-uppercase variables are by convention reserved for
environment variables).

> Also, can't I write the test as
>
> if [ $VAR ]; then
>   ...
> fi
[...]

No. That syntax is wrong. Valid syntaxes are:

if [ "$VAR" != "" ]
if [ -n "$VAR" ]
if [ "$VAR" ]

Or if you want to be extremely portable:

if [ "" != "$VAR" ]

or

if [ "x$VAR" != x ]

Personally, I prefer:

var=false
if ... var=true ...
if "$var"; then
  ...
fi

--
Stephane
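The "command as boolean" idiom suggested at the end of that post can be
sketched as below; the function and option names are illustrative, not
from the thread:

```shell
# Sketch: store the command names "true"/"false" in the variable and
# run it directly in the if, instead of string-comparing "TRUE"/"FALSE".
flag_demo() {
  var=false
  # some command line processing that can set var to true:
  [ "${1-}" = -x ] && var=true
  if "$var"; then
    echo enabled
  else
    echo disabled
  fi
}
```

This sidesteps the unset/empty-variable problem entirely: $var always
holds one of the two commands, so no quoting accident can change the
test's meaning.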
Re: initialisation bash variables
2011-08-17, 08:32(+02), Francky Leyn:
> On 8/16/2011 10:53 PM, Stephane CHAZELAS wrote:
>> 2011-08-16, 22:24(+02), Francky Leyn:
>> [...]
>>> VAR=FALSE
>>> # some command line processing, that can set VAR to "TRUE"
>>> if [ $VAR = TRUE ]; then
>>>   ...
>>> fi
>>>
>>> Must I effectively write that VAR=FALSE?
>>> Or will the script work fine without?
>>
>> Yes, you must write it, because bash may inherit a VAR variable
>> from the environment like I said (especially when you consider
>> that all uppercase variables are by convention reserved for
>> environment variables).
>
> 1) So it's a bad idea to use uppercase variables in a script?

Yes, unless you want to export them to the environment of
commands you start in that script.

> 2) If VAR coincides with an environment variable, and in the
> script I change its value, is this then propagated to outside
> the script? Is the environment variable affected?

The environment is a list of strings (by convention of the
format var=value) passed upon executing a command, in a fashion
exactly similar to the list of arguments to the command.

In other words, when you execute a command:

cmd arg1 arg2

you pass both a list of arguments ("cmd", "arg1", and "arg2")
and a list of environment variables ("PATH=/bin...",
"VAR=foo"...). The difference is that the list of arguments is
explicit on the shell command line, while the list of
environment variables comes from the remembered list of
environment variables that the shell (or any application that
uses the C library and *environ, putenv(3), setenv(3)) maintains
internally (and initialised from the environment it received
when it was executed). The shell does, though, allow the syntax:

VAR=value cmd arg1 arg2

to specify environment variables on the command line.

In no circumstance are variable definitions in one process going
to affect the environment of other processes (an exception to
that is the "fish" shell).

--
Stephane
Re: initialisation bash variables
2011-08-18, 04:10(+02), Patrick:
> On 17.08.2011 20:53, Stephane CHAZELAS wrote:
>> 2011-08-17, 08:32(+02), Francky Leyn:
>>> On 8/16/2011 10:53 PM, Stephane CHAZELAS wrote:
>>> 2) If VAR coincides with an environment variable, and in the
>>> script I change its value, is this then propagated to outside
>>> the script? Is the environment variable affected?
>>
>> The environment is a list of strings (by convention of the format
>> [...]
>>
>> In no circumstance are variable definitions in one process going
>> to affect the environment of other processes (an exception to
>> that is the "fish" shell)
>
> Could it be that the two of you are not talking about the same thing?
>
> Just for clarity: environment variables (henceforth "the environment")
> of a process are (is) inherited by its children.

Everything is inherited by children. However, upon executing a
command using the execve(2) system call, all the memory of the
process is reinitialised. What the new command gets passed along
the execve(2) system call is a list of arguments (argv) and a
list of environment variables (envp).

And as I said, by convention (and helped in that way by the C
library function wrappers around the execve(2) system call
(execv, execl, system...) that take care of propagating the
environment), that envp is built by the application from the
envp it received when it was executed (in that process or its
parents).

So yes, generally, the environment is inherited by commands
executed in children processes, but also by commands executed by
the current process.

> Therefore, what *does* happen, is that if Stephane, as in 2), changes
> VAR in script, the change gets propagated to the scripts *child* processes.

I think it brings confusion to speak of processes here.
Everything is propagated upon a fork() (the system call that
creates a child process); a fork creates an exact copy of the
current process. The environment is something that concerns
command execution.
As a side note though, that behavior didn't occur in the Bourne
shell. In the Bourne shell, you had to explicitly export a
variable (even if the shell itself received it in its
environment) for it to be exported to the commands executed by
the shell.

$ VAR=foo sh -c 'VAR=bar; env' | grep VAR
VAR=foo
$ VAR=foo sh -c 'VAR=bar; export VAR; env' | grep VAR
VAR=bar

[...]
> But what does of course not happen, is that the change would get
> propagated to the *parent* process. Or any other process.

Environment changes are propagated to children just like the
rest of the memory, and generally to commands executed by the
current process or any of those children.

> (What is the "fish" shell ???)

The friendly interactive shell.
http://en.wikipedia.org/wiki/Friendly_interactive_shell

And see http://fishshell.com/user_doc/index.html#variables for
the documentation on the scope of its variables.

--
Stephane
Re: Syntax Question...
2011-08-14, 02:43(+00), Michael Witten:
[...]
>> Please read BashFAQ/006: http://mywiki.wooledge.org/BashFAQ/006
>
> "no force in the universe can put NUL bytes into shell strings usefully"
>
> Ain't that the goddamn Truth!

No, zsh supports NUL bytes in its strings happily. It's even in
the default $IFS. Where NUL bytes can't go is in arguments to
commands, environment variables, filenames... But I can't see
why a shell variable couldn't contain NUL bytes; it's even a
good thing for those very reasons, as you can use that character
to safely separate filenames, arguments, env vars... See for
instance the -0 option of many GNU utilities.

--
Stephane
Re: initialisation bash variables
2011-08-18, 12:44(+02), Patrick:
[...]
>> $ VAR=foo sh -c 'VAR=bar; env' | grep VAR
>> VAR=foo
>> $ VAR=foo sh -c 'VAR=bar; export VAR; env' | grep VAR
>> VAR=bar
> Interesting! I do not have the Bourne shell installed. Bash tries to
> mimic it when called as "sh", but it does not produce the "correct"
> result for your first example. Not that I would mind about that though.
> Busybox btw. also yields "VAR=bar". Phew... Dash as well. There is no
> "sh" package in the ubuntu repos. Google also has no quick answer. What
> kind of "sh" are you actually using if I may ask ?

Yes, that was the behavior of the Bourne shell. It was changed
by the Korn shell, every other Bourne-like shell followed, and
the new behavior is now specified by POSIX.

The Bourne shell is a shell written by Steve Bourne in the late
seventies and is the ancestor of all of today's "Bourne-like
shells" (ash, dash, ksh88, ksh93, pdksh, posh, mksh, bash,
zsh...). There have been many variants of the Bourne shell with
modifications added by the various Unix vendors. It's still
found for backward compatibility in some commercial Unices.

Nowadays, "sh" refers to one implementation or another of a
shell that is able to interpret a POSIX script as specified
(just to avoid saying a "POSIX shell", which would be too much
of a shortcut). The Bourne shell is not one of them (for the
reason above and many others).

The code of the Bourne shell was released as open source
/recently/ as part of OpenSolaris, so you can now find ports of
it to Linux (see heirloom-sh for instance). You can also run the
Bourne shell from Unix V7 (the OS where it was first released in
1979) in a PDP11 emulator, and you'll notice a few differences
between the two.

See http://www.in-ulm.de/~mascheck/bourne/ for a reference on
the Bourne shell.

--
Stephane
Re: Syntax Question...
2011-08-17, 08:24(-04), Greg Wooledge:
> On Tue, Aug 16, 2011 at 03:41:19PM -0700, Linda Walsh wrote:
>> Ken Irving wrote:
>> > Maybe this?
>> > today_snaps=( ${snap_prefix} )
>
>> but as you mention, that will put them into an array -- sorry,
>> "imprecise terminology"; a list for me is some number of objects in a
>> string separated by some separator.
>
> This is an extremely bad idea. Legacy Bourne shell code from the
> 1980s kind of bad -- from the horrible days before we *had* arrays
> in shells. How are you going to handle filenames with spaces in them?
> With newlines in them? With commas in them? With colons in them? Tabs?
> DEL characters? Those are all valid in filenames. Any delimiter you
> can *put* in a shell string is also a valid character in a filename (or
> at least in a pathname, which eliminates the possibility of using slash).

In this code:

today_snaps=( ${snap_prefix} )

with the default value of IFS in bash and globbing not disabled,
the problematic characters are SPC, TAB, NL, *, ?, [ and
potentially more if you have extended globbing enabled.

If $snap_prefix is meant to be space delimited, then you can
make it a bit safer by doing:

IFS=" "
set -f
today_snaps=( $snap_prefix )

NL is a good delimiter because it's rare in filenames (but
newlines are allowed in them, so if the data is foreign and
security is a concern, it's not an option), and you can also
pass the list to line-based (text) utilities:

var='a1
a2
b2'
IFS='
'
set -f
set -- $var

Or:

a_vars=$(printf '%s\n' "$var" | grep '^a')

--
Stephane
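The controlled-splitting pattern described in that post can be packaged
in a subshell so that the IFS change and the -f flag don't leak into the
caller; the function name below is illustrative, not from the thread:

```shell
# Sketch: split a space-delimited string safely by fixing IFS and
# disabling filename generation before the unquoted expansion.
split_fields() (
  IFS=' '    # split on spaces only (not the default SPC/TAB/NL)
  set -f     # no globbing on the unquoted $1 below
  set -- $1
  printf '%s\n' "$#"
)
```

Because of `set -f`, a field like `*` stays a literal asterisk instead
of expanding to the files in the current directory.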
[OT] Re: accents
2011-08-25, 12:19(-07), Linda Walsh:
[...]
> Greg Wooledge wrote:
>> On Wed, Aug 24, 2011 at 06:51:32PM -0700, Linda Walsh wrote:
>>
>>> BTW, Thomas -- what is the Character that comes after 'De' in your
>>> name? I read it as hex '0xc282c2' which doesn't seem to be valid unicode.
>>
>> RFC 2822 (section 2.2) says that Header Fields in an email must be
>> composed of US-ASCII characters, so there's no telling what sort of
>> problems the multi-byte character in his From: header may be triggering
>> as it passes through various mail transfer agents.
>
> Well, on one level, I would agree,
> But on another, RFC 2822 is obviously messed up, since domain names can
> contain UTF-8 characters..
>
> So...um...how does that work?
[...]

See RFC 5335.

--
Stephane
Re: Using TMOUT as default for read bultin is unwise
2011-09-14, 09:46(+01), Wheatley, Martin R:
[...]
> Description:
> The contents of the environment variable TMOUT are used as the
> default timeout for the read builtin command when reading from
> a TTY or pipe AND as a shell command input idle time.
>
> This can lead to random failures of shell scripts
[...]
> I think TMOUT should not be overloaded, and its use as a default
> value for the read builtin - especially for sub-shell pipelines - is
> dangerous and should be discontinued; otherwise all bash scripts that
> use the read builtin need to be modified to include TMOUT=0.

That's not the only problematic variable. See also
http://groups.google.com/group/comp.unix.shell/browse_thread/thread/cf7d5147dd829cf9/ef5b5b49a676b99d#ef5b5b49a676b99d

And here is what Geoff Clare from the Austin Group (the body
behind POSIX) had to say when I raised it some time ago:
http://groups.google.com/group/comp.unix.shell/browse_thread/thread/60c3e67919c36d0a/25ab970d275ecdb7#25ab970d275ecdb7

In short: if one sets a TMOUT environment variable, the easy fix
is to tell them: DON'T!

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-16, 17:17(-07), William Park:
> 145557834293068928043467566190278008218249525830565939618481
> is awfully big number! :-)

3**2**62 is 3**(2**62), that is 3**4611686018427387904, not
(3**2)**62. That's not a number you can represent with 64 bits,
nor with any reasonable number of bits, and certainly not a
number that bash arithmetic expansion can handle, not even in
floating mode.

With zsh:

$ echo $((exp((2**62)*log(3))))
inf.
$ echo 'e((2^62)*l(3))' | bc -l
Runtime warning (func=e, adr=123): scale too large, set to 2147483647
Fatal error: Out of memory for malloc.

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-17, 13:06(+00), Stephane CHAZELAS:
> 2011-09-16, 17:17(-07), William Park:
>> 145557834293068928043467566190278008218249525830565939618481
>> is awfully big number! :-)
>
> 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you
> can represent with 64bits, nor any reasonable number of bits,
> not (3**2)**62.
[...]

Sorry, my bad: 3**2**62 is indeed (3**2)**62 in bash and in zsh,
contrary to most other places (ksh93, bc, python, gawk, perl,
ruby...).

--
Stephane
Re: How to match regex in bash? (any character)
2011-10-1, 14:39(-08), rogerx@gmail.com:
[...]
> I took some time to examine the three regex references:
>
> 1) http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04
> Written more like a technical specification of regex. Great if you're
> going to be modifying the regex code. Difficult to follow if you're new,
> looking for info.

One thing to bear in mind is that bash calls a system library to
perform the regexp matching (except that [*]), so it can't
really document how it's gonna work, because it just can't know;
it may differ from system to system. The only thing that is more
or less guaranteed is that all those various implementations
should comply with that specification.

Above is the specification of the POSIX extended regular
expression, so a bash script writer should refer to that
document if he wants to write a script for all the systems where
bash might be used.

> 2) regex(7)
> Although it looks good, upon further examination, I start to see run-on
> sentences. It's more like a reference, which is what a man file should
> be.
> At the bottom, "AUTHOR - This page was taken from Henry Spencer's regex
> package"

On the few systems where that man page is available, it may or
may not document the extended regular expressions that are used
when calling the regex(3) API (on my system, it doesn't). Those
regular expressions may or may not have extensions over the
POSIX ones, and that document may or may not point out which
ones are extensions and which ones are not, so a script writer
may be able to refer to that document if he wants his script to
work on that particular system (except that [*]).

> 3) grep(1)
> Section "REGULAR EXPRESSIONS". At about half the size of regex(7), the
> section clearly explains regex and seems to be easily understandable for a
> person new to regex.

That's another utility that may or may not use the same API, in
the same way as bash or not.
You get no warranty whatsoever that the regexps covered there
will be the same as bash's.

[*] Actually, bash does some (undocumented) preprocessing on the
regexps, so even the regex(3) reference is misleading here. For
instance, on my system the regex(3) extended REs support \1 for
backreference and \b for word boundary, but when calling
[[ aa =~ (.)\1 ]], bash changes it to [[ aa =~ (.)1 ]] (note
that (.)\1 is not a portable regex as the behavior is
unspecified), so bash won't behave as regex(3) documents on my
system.

Also (and that could be considered a bug), "[\a]" is meant to
match either "\" or "a", but in bash, because of that
preprocessing, it doesn't:

$ bash -c '[[ "\\" =~ [\a] ]]' || echo no
no
$ bash -c '[[ "\\" =~ [\^] ]]' && echo yes
yes

Once that bug is fixed, bash should probably refer to POSIX EREs
(since its preprocessing would disable any extension introduced
by system libraries) rather than regex(3), as that would be more
accurate.

The situation with zsh:

- it uses the same API as bash (unless the RE_MATCH_PCRE option
  is set, in which case it uses PCRE regexps)
- it doesn't do the same preprocessing as bash because...
- it doesn't implement that confusing business inherited from
  ksh whereby quoted RE characters are taken literally.

So, in zsh:

- [[ aa =~ '(.)\1' ]] works as documented in regex(3) on my
  system (but may work differently on other systems as the
  behavior is unspecified as per POSIX).
- [[ '\' =~ '[\a]' ]] works as POSIX specifies.
- after "setopt RE_MATCH_PCRE", one gets a more portable
  behavior as there is only one PCRE library (though different
  versions).

The situation with ksh93:

- Not POSIX either, but a bit more consistent:

  $ ksh -c '[[ "\\" =~ [\a] ]]' || echo no
  no
  $ ksh -c '[[ "\\" =~ [\^] ]]' || echo no
  no

- it implements its own regexps with its own many extensions,
  which therefore can be and are documented in its man page but
  are not common to any other regex implementation (though they
  are mostly a superset of the POSIX EREs).

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-19, 09:27(-04), Chet Ramey:
> On 9/16/11 4:39 PM, Nicolas ARGYROU wrote:
>
>> Bash Version: 4.0
>> Patch Level: 33
>> Release Status: release
>>
>> Description:
>> The algorithm used to calculate x to the power of y: x**y
>> takes O(y) time which is way too long on systems using 64 bits.
>> Calculating for example $((3**2**62)) freezes the shell at
>> argument parsing time.
>>
>> Repeat-By:
>> bash -c 'echo $((3**2**62))'
>>
>> Fix:
>> This fix uses an algorithm that takes O(log(y)) time, which is way
>> faster. But it is still about 30 times slower with random numbers
>> than a single multiplication, on 64 bits systems. The fix is written
>> as a C++ template working on any unsigned integer type, and doesn't
>> need any external resource:
>
> Thanks for the report. This looks like an independent reimplementation of
> the "exponentiation by squaring" method. I did a little looking around,
> and it's the best algorithm out there. I used a slightly different but
> equivalent implementation.
[...]

FYI, ksh93 uses pow(3). So does zsh, but only in floating point
mode. Probably better and more efficient than reinventing the
wheel.

--
Stephane
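For reference, the "exponentiation by squaring" method mentioned by Chet
can be sketched in shell arithmetic itself (an illustration of the
O(log y) idea only; bash's actual fix is in C, and integer overflow is
ignored here):

```shell
# Sketch: compute base**exp in O(log exp) multiplications by squaring
# the base and halving the exponent at each step.
ipow() {
  local base=$1 exp=$2 result=1
  while [ "$exp" -gt 0 ]; do
    if [ $((exp % 2)) -eq 1 ]; then
      result=$((result * base))   # fold in the current power of base
    fi
    base=$((base * base))         # square: base^(2^k) for the next bit
    exp=$((exp / 2))
  done
  printf '%s\n' "$result"
}
```

Each loop iteration consumes one bit of the exponent, which is where the
logarithmic bound comes from.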
Re: List of background processes in a command group, in a pipeline, executed sequentially under certain conditions.
2011-10-01, 06:54(-05), Dan Douglas:
[...]
> f() {
>     local -i x y
>     while read -rN1 "x[y++]"; do
>         printf '%d ' "${1}" >&2    # keep track of which job this is.
>     done
>     printf "${#x[@]} "    # Print the total number of reads by each job.

if you add an echo >&2 "[done $1]" here.

> }
>
> g() { # Used in ex 6
>     f 1 <${1} &
>     f 2 <${1}
> }
>
> # This works as I expect, f is backgrounded and two readers of one pipe
> # each get about half the input:
> exincr # 1
>
> read -ra x < <({ f 1 & f 2; } < <(zeros))
> printf '%b\n' "\n${x[@]}\n"
>
> # Equivalent to above, except with piped output. Now f is not
> # backgrounded. One reader consumes all the input:
> exincr # 2
>
> { f 1 & f 2; } < <(zeros) | {

You'll notice that f 1 terminates straight away. And if you do
an strace, you'll notice that bash does a
dup2(open("/dev/null"), 0), that is, it redirects "f 1"'s stdin
to /dev/null.

~$ bash -c '{ cat; } < c | cat'
test
~$ bash -c '{ cat & } < c | cat'
~$ bash -c '{ lsof -ac lsof -d0; } < c | cat'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5005 chazelas 0r REG 253,25 58785638 /home/chazelas/c
~$ bash -c '{ lsof -ac lsof -d0 & } < c | cat'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5010 chazelas 0r CHR 1,3 0t0 973 /dev/null

That behavior is required by POSIX and occurs in ash and pdksh
and its derivatives as well:

POSIX> command1 & [command2 & ... ]
POSIX>
POSIX> The standard input for an asynchronous list, before any
POSIX> explicit redirections are performed, shall be considered to
POSIX> be assigned to a file that has the same properties as
POSIX> /dev/null. If it is an interactive shell, this need not
POSIX> happen. In all cases, explicit redirection of standard input
POSIX> shall override this activity.

However, I don't know why bash does it only in the "pipe" case.
~$ ash -c '{ lsof -ac lsof -d0 & } < c'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5188 chazelas 0r CHR 1,3 0t0 973 /dev/null
~$ bash -c '{ lsof -ac lsof -d0 & } < c'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5191 chazelas 0r REG 253,25 58785638 /home/chazelas/c

To work around this, the <&0 trick seems to work:

~$ bash -c '{ lsof -ac lsof -d0 <&0 & } < c | cat'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
lsof 5247 chazelas 0r REG 253,25 58785638 /home/chazelas/c

--
Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-17, 13:39(+00), Stephane CHAZELAS:
> 2011-09-17, 13:06(+00), Stephane CHAZELAS:
>> 2011-09-16, 17:17(-07), William Park:
>>> 145557834293068928043467566190278008218249525830565939618481
>>> is awfully big number! :-)
>>
>> 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you
>> can represent with 64bits, nor any reasonable number of bits,
>> not (3**2)**62.
> [...]
>
> Sorry, my bad,
>
> 3**2**62 is indeed (3**2)**62 in bash and in zsh contrary to
> most other places (ksh93, bc, python, gawk, perl, ruby...).

Sorry again, I was right in the first place: 3**2**62 is
3**(2**62) in bash and zsh, like in other shells. I think I need
more sleep...

--
Stephane
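The associativity question settled in that exchange is easy to check
with small exponents:

```shell
# ** in bash arithmetic is right-associative, so 2**3**2 parses as
# 2**(3**2) = 2**9 = 512, not (2**3)**2 = 64.
echo "$((2**3**2))"   # 512
```

That right associativity is why 3**2**62 means 3**(2**62), the
astronomically large value discussed in the thread, rather than the
representable (3**2)**62.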
Re: How to match regex in bash? (any character)
2011-10-02, 21:51(-04), Chet Ramey:
> On 10/2/11 3:43 PM, Stephane CHAZELAS wrote:
>
>> [*] actually, bash does some (undocumented) preprocessing on the
>> regexps, so even the regex(3) reference is misleading here.
>
> Not really. The words are documented to undergo quote removal, so
> they undergo quote removal. That turns \1 into 1, for instance.
[...]

The problem and confusion here come from the fact that "\" is
overloaded and used by two different pieces of software (bash and
the system regex). It is used:

- by bash for quoting
- by regex(3) to escape regexp characters in some circumstances
  (for instance when not inside [...], but it may vary per
  implementation (think of the (?{...} type extensions))
- by some regex(3) implementations to introduce new regexp
  operators (\w, \b, \<...)

BTW, another bug:

$ bash -c '[[ "\\" =~ ["."] ]]' && echo yes
yes

And what one could consider a bug:

~$ bash -c 'chars="a]"; [[ "a" =~ ["$chars"] ]]' && echo yes
~$ bash -c 'chars="a]"; [[ "a]" =~ ["$chars"] ]]' && echo yes
yes

I was wrong in saying that the bash documentation should refer to
POSIX regexps because it disables extensions. It only disables
extensions introduced by "\", not the ones introduced by
sequences that would otherwise be invalid in POSIX EREs like
"(?", {{, **... It should still refer to POSIX regexps, though,
as those are the only ones guaranteed to work: any extension
provided by the system's regex(3) API may not work with bash.

--
Stephane
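The basic quoting effect both posts rely on can be demonstrated
directly (a minimal sketch; the echoed labels are illustrative):

```shell
# In bash's [[ =~ ]], quoting a regexp character makes bash escape it
# before handing the word to the system regex library, so it matches
# literally instead of acting as an operator.
bash -c '[[ axb =~ a.b ]]'   && echo "unquoted dot: RE operator"
bash -c '[[ axb =~ a"."b ]]' || echo "quoted dot: literal character"
```

It is the interaction of that escaping with bracket expressions (where
"\" has no escaping role in POSIX EREs) that produces the bugs shown
above.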
Re: How to match regex in bash? (any character)
2011-10-03, 13:48(+02), Andreas Schwab:
> Stephane CHAZELAS writes:
>
>> The problem and confusion here comes from the fact that "\" is
>> overloaded and used by two different pieces of software (bash
>> and the system regex).
>
> That's nothing new. The backslash is widely used as a quote character
> in several languages, which requires two levels of quoting if one of
> these languages is embedded in another one.
[...]

Yes, but in this case, bash (contrary to zsh) doesn't do two
levels of quoting. Bash quoting is meant to escape the RE
operators, and that's where the problem comes from: for it to
work fully, bash would need to implement the full RE parsing to
know where to put backslashes when characters are quoted.

Bash turns:

  "."        to \.         before calling the regex(3) API
  '[.]'      to \[\.\]     (fine)
  ['.']      to [\.]       (not fine)
  ['a]']     to [a\]]      (not fine)
  (.)\1      to (.)1       (fine or not fine depending on how you
                           want to look at it)
  (?i:test}  to (?i:test)  (assuming regex(3) is implemented with
                           PCREs: fine or not fine depending on
                           how you want to look at it)

In zsh, it's simpler, as quoting just quotes shell characters;
it doesn't try to escape regexp operators.

--
Stephane
Re: Syntax Question...
2011-10-3, 08:36(-04), Greg Wooledge:
> On Thu, Aug 18, 2011 at 11:10:17AM +0100, Stephane CHAZELAS wrote:
>>
>> Please read BashFAQ/006: http://mywiki.wooledge.org/BashFAQ/006
>>
> "no force in the universe can put NUL bytes into shell strings usefully"
>
>> No, zsh supports NUL bytes in its strings happily. It's even in
>> the default $IFS. Where NUL bytes can't go is in arguments
>> to commands, environment variables, filenames... But I can't
>> see why a shell variable couldn't contain NUL bytes; it's even a
>> good thing for both those reasons as you can use that character
>> to safely separate filenames, arguments, env vars... See for
>> instance the -0 option of many GNU utilities.
>
> zsh doesn't count, because it's not compatible with any other shell.

If zsh is not compatible with any other shell, then bash isn't
either. What do you mean?

> Since this is a bash mailing list (or newsgroup), we're concerned with
> the behavior of bash, and bash can't store NUL bytes in strings. The
> reason for this is simple enough -- bash uses native C strings for
> its variables.
[...]

I was replying to the "no force in the universe can put NUL
bytes into shell strings usefully" statement, pointing out that
at least one shell did (and is still POSIX compliant in doing
so), and that it was very useful.

bash's code is not fixed in stone; it could be changed to allow
NUL bytes. It could even reuse zsh's code, since zsh is provided
under a BSD-style license.

--
Stephane
[OT] Re: Syntax Question...
2011-10-3, 09:12(-04), Greg Wooledge:
> On Mon, Oct 03, 2011 at 12:56:43PM +0000, Stephane CHAZELAS wrote:
>> If zsh is not compatible with any other shell, then bash isn't
>> either. What do you mean?
>
> http://zsh.sourceforge.net/FAQ/zshfaq02.html#l10
>
> "because of all the options it is not safe to assume a general zsh run
> by a user will behave as if sh or ksh compatible"

Not really the best place to discuss zsh but as you insist... :)

> Actually, this is a trend I wasn't previously aware of:
>
> "invoking zsh as sh or ksh (or if either is a symbolic link to zsh) sets
> appropriate options and improves compatibility (from within zsh itself,
> calling ARGV0=sh zsh will also work)"

Strange that you would say that. zsh is about the same age as bash
and has always been a few steps ahead of bash in terms of interactive
functionality. It probably dethroned tcsh as the best interactive
shell around the mid-90s (I started using it around 97). bash and
ksh93 both borrowed features from zsh. It is more compatible with ksh
than bash is (it implements many of the ksh builtins like typeset and
print, and supports floating point arithmetic and recursive globbing
(actually, ksh borrowed that from zsh)).

> "The classic difference is word splitting, discussed in question 3.1;
> this catches out very many beginning zsh users. As explained there, this
> is actually a bug in every other shell."
>
> Um, no. Word splitting is well defined, and changing how your shell
> handles it and declaring it "a bug in every other shell" is unacceptable.
[...]

The bug is the "implicit word splitting and filename generation done
upon variable expansion in list contexts".
You may not want to call it a bug, but I don't think anyone here
would refuse to at least call it a "very unfortunate misfeature
responsible for 80% of shell script bugs, 90% of exploitable shell
script vulnerabilities, and a lot of confusion among beginners".

While that feature made sense in the Bourne shell, which didn't have
arrays, it's just a burden in all modern shells. That is why several
times a week you'll see comments on comp.unix.shell such as: "don't
leave a variable unquoted unless you have a very good reason to (and
know what you're doing (and have taken appropriate action to limit
its side-effects))".

I'm not saying zsh is perfect, nor would I recommend it to interpret
POSIX scripts. There are misfeatures of Bourne/ksh it didn't fix,
like the removal of arguments that expand to an empty string, or
command substitution removing *every* trailing newline character, and
some would argue it suffers a bit from feature creep.

I don't write zsh scripts, not any more than I write bash or ksh
scripts. I don't care what shell interprets the scripts I write (the
main reason why I follow gnu.bash.bug is for the discussions about
portability), though I would rather have them interpreted by the
leaner and faster ash or pdksh derivatives. But zsh's cleaner and
safer syntax, amazing completion system, and globbing are amongst the
main reasons why I wouldn't use any other shell interactively.

If you want a shell with a very clean syntax, you can have a look at
the plan9 "rc" shell (which has been ported to Unix).

-- Stephane
Re: Error in manual for >&word redirection
2011-10-12, 14:39(-06), Eric Blake:
> On 10/12/2011 02:07 PM, Greg Wooledge wrote:
>> Even using a space is not sufficient to force a valid file descriptor
>> number to be treated as a filename:
>>
>> imadev:~$ foo>& 1
>> stdout
>> stderr
>> imadev:~$ ls -l 1
>> 1 not found
>
> If you want 'word' treated as a filename, then express it as a filename.
> It's still possible to write to a file named '1':
>
> foo >&./1

Or

foo >&! 1
or
foo &> 1
or
foo > 1 2>&1

-- Stephane
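The contrast between the two forms is easy to demonstrate; a small sketch (the `msg` function and the temporary directory are illustrative, standing in for `foo`):

```shell
#!/bin/bash
# With ">& 1" the "1" is still taken as file descriptor 1, even with
# the space. With "> 1 2>&1" the "1" after ">" is a plain filename.
cd "$(mktemp -d)" || exit

msg() { echo out; echo err >&2; }

msg > 1 2>&1     # writes both streams to a file literally named "1"
cat 1            # shows: out, err
```

The `>&./1` trick from the quoted message works for the same reason: `./1` cannot be parsed as a file descriptor number, so it must be a filename.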
Re: How to automatically load alias from .bashrc in a bash script?
2011-10-27, 11:17(-05), Peng Yu:
> I want to use some alias in a bash script. But I have to include the
> following line in the script in order to use the alias defined in
> ~/.bashrc. Is there a way to automatically load the alias from .bashrc
> so that I don't have to explicitly include these two lines?
>
> shopt -s expand_aliases
> . ~/.bashrc

You mean you'd like every bash script you run on your system to
automatically source your ~/.bashrc. That sounds like a very unwise
thing to do to me, but that could be done with:

export BASH_ENV="$HOME/.bashrc"

-- Stephane
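The BASH_ENV mechanism can be sketched as follows (the startup-file contents here are made up; a variable is used rather than an alias, which would additionally need `shopt -s expand_aliases` in the sourced file):

```shell
# BASH_ENV names a file that every non-interactive bash sources at
# startup, before running the script or -c string.
envfile=$(mktemp)
echo 'greeting="hello from BASH_ENV"' > "$envfile"

BASH_ENV=$envfile bash -c 'echo "$greeting"'   # prints: hello from BASH_ENV

rm -f "$envfile"
```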
Re: {varname} for redirection does not work with arrays
2011-11-2, 12:01(-06), unkn...@vmw-les.eng.vmware.com:
[...]
> Description:
> If {varname} is an assoc. array in a redirection the exec will fail.
> The [] should not be confused with pathname expansion, just like ${}.
>
> Repeat-By:
> $ declare -A array
> $ exec {array[key]}
Re: What is the best to pass an array with specially characters as command line arguments?
2011-11-6, 21:02(-06), Peng Yu:
[...]
> #!/usr/bin/env bash
>
> verbatim_string=" a b c ( a'b | "
>
> args="`echo \"$verbatim_string\" | sed -f quoteverb.sed`"
>
> cmd="echo $args"
> eval "$cmd"
>
> ~$ ./main.sh
> a b c ( a'b |

Either:

#! /bin/sh
verbatim_string=" a b c ( a'b | "
cmd='(set -f; IFS=" "; echo $verbatim_string)'
eval "$cmd"

($verbatim_string expanded and split at the time $cmd is evaluated).

Or:

#! /bin/bash
verbatim_string=" a b c ( a'b | "
set -f; IFS=' '
cmd="echo $(printf '%q ' $verbatim_string)"
eval "$cmd"

($verbatim_string expanded, split and quoted initially).

(echo being a poor choice of a command for testing as it concatenates
its arguments and possibly handles "\" characters specially under
some circumstances).

By the way, does bash have any way of introducing local scope for
options, similar to zsh's

(){
  setopt localoptions noglob shwordsplit
  local IFS=' '
  args=($string)
}

Or to set the options to defaults in a local scope to be able to
write code that works whatever the context, like zsh's:

f() {
  emulate -L zsh # ensure default zsh behavior even if caller
                 # has changed some settings
  ...
}

I suppose one can add

default_settings=$(set +o)

at the start of the script, and then

f() {
  local restore_settings
  restore_settings=$(set +o)
  eval "$default_settings"
  ...
  eval "$restore_settings"
}

Or is there any more canonical way to do it?

-- Stephane
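The second approach (split once, then requote each word with `printf %q`) can be sketched end to end; `printf '%s\n'` replaces `echo` so the resulting words show up unambiguously, one per line:

```shell
#!/bin/bash
# Split $verbatim_string on spaces once, quote each resulting word
# with printf %q, and build a command string that is safe to eval.
verbatim_string=" a b c ( a'b | "

set -f            # no pathname expansion during the unquoted expansion
IFS=' '           # split on spaces only
cmd="printf '%s\n' $(printf '%q ' $verbatim_string)"

eval "$cmd"       # one word per line: a, b, c, (, a'b, |
```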
Re: feature request: printf %(%s)T
2011-11-23, 12:00(-05), Chet Ramey:
> On 11/22/11 4:53 PM, Greg Wooledge wrote:
>> This is a feature request, rather than a bug. Bash 4.2's printf command
>> has a lovely %(datefmt)T feature that allows it to print out formatted
>> timestamps using the underlying operating system's strftime(3) routine.
>> It even allows bash to print the current time, or the time the current
>> shell was invoked.
>
> I wonder if a better way to handle this is to require the %s expansion
> at configure time and use the strftime replacement in lib/sh if the C
> library's strftime doesn't implement it. What systems, if you know, do
> not handle %s?
[...]

Or just have a special variable for that like zsh's $EPOCHSECONDS.

Note that GNU strftime has more extensions than just %s. See also
http://stchaz.free.fr/wide_strftime for a POSIX shell implementation
of strftime (limited to the GMT timezone and the POSIX locale though).

-- Stephane
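For reference, the bash 4.2+ feature under discussion looks like this (`-1` means "now", `-2` means "when the shell started"); whether `%s` works depends on the system strftime(3), as Chet notes:

```shell
# printf %(datefmt)T formats a time with strftime(3) (bash >= 4.2).
printf 'today is %(%Y-%m-%d)T\n' -1

# %s (seconds since the epoch) is a common extension, not guaranteed
# everywhere -- the point of this subthread:
printf 'epoch: %(%s)T\n' -1
```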
Re: bash silently ignores additional arguments for cd
2011-12-07, 12:02(-07), Eric Blake:
[...]
> Or even nicer, behave like zsh, and treat
> cd xxx yyy
> like this (written assuming extglob syntax)
> cd xxx+($' \t')yyy
[...]

Not sure what you're referring to, but

cd xxx yyy

in zsh means

cd ${PWD/xxx/yyy}

-- Stephane
Re: bash silently ignores additional arguments for cd
2011-12-8, 10:13(+00), Stephane CHAZELAS:
> 2011-12-07, 12:02(-07), Eric Blake:
> [...]
>> Or even nicer, behave like zsh, and treat
>> cd xxx yyy
>> like this (written assuming extglob syntax)
>> cd xxx+($' \t')yyy
> [...]
>
> Not sure what you're referring to, but
>
> cd xxx yyy
>
> in zsh means
>
> cd ${PWD/xxx/yyy}

And same in AT&T ksh, pdksh and their derivatives (except posh).

-- Stephane
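Bash itself has no two-argument `cd`, but the zsh/ksh behaviour can be spelled out with the substitution written explicitly (the directory names below are made up for illustration):

```shell
# Emulate zsh/ksh "cd old new" in bash: substitute old->new in $PWD.
mkdir -p /tmp/proj_v1/src /tmp/proj_v2/src   # illustrative paths

cd /tmp/proj_v1/src
cd "${PWD/v1/v2}"     # what "cd v1 v2" does in zsh and AT&T ksh
pwd                   # /tmp/proj_v2/src
```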
Re: extglob pattern: @(/root) vs. @(root)
2011-12-9, 16:16(+08), Clark J. Wang:
> See following:
>
> # shopt extglob
> extglob        on
> # echo $BASH_VERSION
> 4.2.20(1)-release
> # ls -d /root
> /root
> # pwd
> /
> # echo @(root)
> root
> # echo @(/root)
> @(/root)    <-- ???
> # echo @(/root*)
> @(/root*)   <-- ???
> #
>
> I'm confused why @(/root) and @(/root*) do not work here.

Globbing operators (*, ?, [/], @(..)) don't match "/". "/" has to be
inserted literally. See the doc:

    When a pattern is used for filename expansion, the character `.'
    at the start of a filename or immediately following a slash must
    be matched explicitly, unless the shell option `dotglob' is set.
    When matching a file name, the slash character must always be
    matched explicitly. In other cases, the `.' character is not
    treated specially.

-- Stephane
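The rule is easy to reproduce in a scratch directory (the temporary directory is illustrative): the slash must sit outside any glob group.

```shell
# "/" can never be matched by a glob operator; it has to appear
# literally in the pattern, outside @(...) groups.
cd "$(mktemp -d)" || exit
mkdir root

bash -O extglob -c 'echo @(./root)'   # no match: prints @(./root)
bash -O extglob -c 'echo ./@(root)'   # matches:  prints ./root
```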
help-bash mailing list (Was: lseek with bash)
2011-12-9, 10:27(-05), Greg Wooledge:
[...]
> This probably belongs on help-bash rather than bug-bash, although it's a
> grey area. I'm Cc-ing both.
[...]

First time I hear about a "help-bash" mailing list. Is that new?

I saw no announcement in bash bug (I could have missed it though),
nor does it seem to be referenced on the bash homepage at gnu.org.
And there doesn't seem to be a corresponding usenet newsgroup like
for the other gnu.org mailing lists. And it doesn't seem it's been
added to gmane (yet?).

Was it announced anywhere?

-- Stephane
Re: return values of bash scripts
2011-12-20, 14:18(-08), kc123: [...] > Could someone please help me figure out why: > GNU bash, version 3.2.25(1)-release-(x86_64-redhat-linux-gnu) [...] Since you're using RedHat Linux, try ps -o comm= -C crond -- Stephane
Re: let's set the positional parameters with a simple 'read'
2012-01-04, 01:46(+08), jida...@jidanni.org:
>> "GW" == Greg Wooledge writes:
> GW> Why not just use a named array?
> GW> $ read -a myarray
> But does that let me get at my favorite array, the positional parameters?

FWIW, in zsh:

~$ read -A argv
a b c
~$ echo $1
a
~$ read 1
x
~$ echo $1
x

See also "vared argv" and "vared 1" to edit them with the zsh line
editor (zle; the bash equivalent would be readline as invoked by
read -e).

-- Stephane
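In bash, `read` can't target the positional parameters directly the way zsh's `read -A argv` does, but the quoted suggestion (`read -a`) combined with `set --` gets the same result; a small sketch:

```shell
# Read a line into a named array, then copy that array into the
# positional parameters with "set --".
read -r -a words <<< 'a b c'
set -- "${words[@]}"

echo "$1"    # a
echo "$#"    # 3
```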
Re: I think I may have found a possible dos attack vector within bash.
2012-03-20 16:47:51 +, Eamonn Smyth:
> Without sounding alarmist, I can break my machine using bash. I also have a
> fix. I shall be officially releasing the c code this weekend at the
> hackathon london.
[...]

A DOS vector often found in bash *scripts* is when a script takes
user input in a variable and that variable is left unquoted upon
expansion in list contexts. For instance this CGI script which looks
harmless at first glance:

#! /bin/bash -
q=$QUERYSTRING
printf '%s\r\n' "Content-type: text/plain" ""
if echo $q | grep ...; then
  echo "Invalid input"
fi

Try with values of $QUERYSTRING like

/*/*/*/../../../*/*/*/../../../*/*/*/../../../*/*/*

(beware it might cause your computer to crash)

-- Stephane
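The fix is simply to quote the expansion, which suppresses both word splitting and pathname expansion; a minimal sketch (the pattern is a harmless stand-in for the hostile query string above):

```shell
# Attacker-controlled input (illustrative):
q='/*/*/*'

# Unquoted, $q would undergo pathname expansion and could generate an
# enormous argument list. Quoted, it is passed as one literal string:
printf '%s\n' "$q"    # prints: /*/*/*
```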
Re: bash does filename expansion when assigning to array member in compound form
2012-08-18 10:26:22 -0500, Dan Douglas:
> This is a feature that all shells with this style of compound assignment
> have in common. If no explicit subscripts are given, the text between the
> parentheses is processed exactly as though it were arguments to a command
> including brace expansion, word-splitting, and pathname expansion (and
> consequently, quoting is just as important). This is an important feature
> because it allows storing the results of a glob in an array easily.
>
> If a subscript is given explicitly, then the right-hand side of the
> assignment is treated exactly as an ordinary scalar assignment would be,
> including all analogous behaviors for `+=' and the integer attribute.
>
> $ set -x; a=( [1]=* )
> + a=([1]=*)
[...]

Nope:

~/1$ touch '[1]=x'
~/1$ bash -c 'a=( [1]=* ); echo "${a[@]}"'
[1]=x
~/1$ bash -c 'a=( [1]=asd ); echo "${a[@]}"'
asd

That's a bug though. Just do a=("*") or a=('*') or a=(\*)

-- Stephane
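The safe forms suggested at the end all behave identically; a quick sketch of quoted versus unquoted elements (the scratch directory and file name are illustrative):

```shell
cd "$(mktemp -d)" || exit
touch afile            # a file that an unquoted "*" would match

a=("*")                # quoted: the element is a literal asterisk
b=(*)                  # unquoted: the glob expands to file names

echo "${a[0]}"         # *
echo "${b[0]}"         # afile
```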
Re: Bash 4.1 doesn't behave as I think it should: arrays and the environment
2012-08-17 17:18:12 -0400, Greg Wooledge:
> On Fri, Aug 17, 2012 at 03:19:56PM +0800, John Summerfield wrote:
>> In two cases I wish to pass an array in the environment, like so:
>> 14:28 john@Boomer$ STUFF[1]=one STUFFX=stuffx env | grep ^ST
>> STUFFX=stuffx
>> STUFF[1]=one
>
> You are creating an environment variable called "STUFF[1]". This is
> an invalid environment variable name, but env(1) receives it and
> dutifully prints it out. However, Bash recognizes that it is an
> invalid name, and does not turn it into a usable shell variable.
[...]

Well, if it were an invalid environment variable, why would bash pass
it to env? STUFF[1] is a valid environment variable name, but not a
valid shell variable name.

> Some older versions of Bash would also strip such invalid variables
> from the environment before invoking child processes. Bash 4.2
> leaves them in the environment because of complaints about build
> systems that were using them for some purpose. I don't know whether
> Bash 4.1 is one that stripped them or preserved them. In either
> case, you should not be writing programs that rely on invalid variable
> names.
[...]

It's different here. It's the shell that *sets* the STUFF[1] env var
passed to the env command, it's not about inheriting it.
Note that no other shell would put "STUFF[1]=one" in env's environ there:

$ zsh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
STUFFX=stuffx
$ ksh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
STUFFX=stuffx
$ pdksh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
STUFFX=stuffx
$ ash -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
ash: 1: STUFF[1]=one: not found
$ bourne-sh -c 'STUFF[1]=one STUFFX=stuffx env' | grep STUFF
bourne-sh: STUFF[1]=one: not found
~$ ksh -c 'STUFF[0]=one STUFFX=stuffx env' | grep STUFF
STUFF=one
STUFFX=stuffx
~$ pdksh -c 'STUFF[0]=one STUFFX=stuffx env' | grep STUFF
STUFF=one
STUFFX=stuffx
~$ bash -c 'STUFF[0]=one STUFFX=stuffx env' | grep STUFF
STUFF[0]=one
STUFFX=stuffx

(in ksh, $V is short for ${V[0]}).

-- Stephane
Re: bash does filename expansion when assigning to array member in compound form
2012-08-20 19:44:51 +0200, Roman Rakus:
[...]
> And how would you achieve to fill an array with all file names
> containing `[1]=' for example.
[...]

Another interesting question is how to fill the array with all the
file names that start with a digit followed by "=".

$ touch {3..5}=foo
$ ls
3=foo  4=foo  5=foo
$ bash -c 'a=([0-9]=*); typeset -p a'
bash: [0-9]=*: bad array subscript
declare -a a='()'
$ bash -c 'shopt -s extglob; a=(@([0-9])=*); typeset -p a'
bash: -c: line 0: syntax error near unexpected token `('
bash: -c: line 0: `shopt -s extglob; a=(@([0-9])=*); typeset -p a'
$ bash -c 'shopt -s extglob
a=(@([0-9])=*); typeset -p a'
declare -a a='([0]="3=foo" [1]="4=foo" [2]="5=foo")'

> Definitely it's good, if you want to be sure, to always quote all
> characters which mean pathname expansion - `*', `?' and `['.
[...]

Yes, the problem here is that "[" is overloaded in a conflicting
manner as a globbing operator and as that poorly designed special
type of array assignment. Quoting it will prevent both; it becomes
more tricky if you want only one or the other.

Note that in bash that also means we need to quote variables in
there, even if IFS is set to "".

$ bash -c 'a="*"; b=([1]=$a); typeset -p b'
declare -a b='([0]="[1]=bar")'

-- Stephane
expansion of \uXXXX in non utf8 locales
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall
uname output: Linux sc 3.2.0-4-amd64 #1 SMP Debian 3.2.32-1 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 36 (same with 39)
Release Status: release

Description:
There seems to be a bug with the expansion of \uXXXX where XXXX > 0xFF
in non-UTF8 locales.

Repeat-By:

$ LC_CTYPE=C bash -c "echo -n $'\u0100'" | od -tx1
0000000 df df df df df df df df df df df df df df df df
0000020 df df df df df df df df df 19
0000032
$ LC_CTYPE=C bash -c "echo -n $'\u0100 '" | od -tx1
malloc: ../bash/parse.y:4515: assertion botched
free: underflow detected; mh_nbytes out of range
Aborting...0000000
zsh: abort     LC_CTYPE=C bash -c "echo -n $'\u0100 '" |
zsh: done      od -tx1
moving fd affect outer scope
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall
uname output: Linux sc 3.7-trunk-amd64 #1 SMP Debian 3.7.3-1~experimental.1 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 37
Release Status: release

Moving a file descriptor for a builtin command or a compound command
leaves the fd closed afterwards:

$ bash -c ': <&1-; echo test'
bash: line 0: echo: write error: Bad file descriptor

expected output: "test".

ksh93 had a similar bug but it was fixed in 1994:

94-06-30  A bug which caused 2<&1- when applied to a shell built-in
          to leave standard input closed has been fixed.

(don't know if it's a typo above (s/input/output/); if not, it might
be a different bug)

In any case, ksh93u+ doesn't exhibit the same behavior as bash:

$ ksh -c ': <&1-; echo test'
test

The longer (standard) form works OK:

$ bash -c ': <&1 1<&-; echo test'
test

-- Stephane
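The "move" operator `<&N-` is defined as a duplicate followed by a close, which is why the report ends with the equivalent two-step form; a sketch of the working long form:

```shell
# "<&1-" should behave like "<&1 1<&-": duplicate fd 1 onto fd 0,
# then close fd 1 -- scoped to the one command only. The long form
# does not leak the closed fd to the following command:
bash -c ': <&1 1<&-; echo test'    # prints: test
```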
0xFF and read -N
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall
uname output: Linux sc 3.10-2-amd64 #1 SMP Debian 3.10.5-1 (2013-08-07) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 45
Release Status: release

When called with the -N option, "read" seems to consider the byte
0xFF as meaning EOF (well, not exactly since it doesn't set the exit
status to non-zero):

OK$ printf '\376' | LC_ALL=C bash -c 'IFS= read -rN1 a; printf %s "$a"' | od -tx1
0000000 fe
0000001
NOK$ printf '\377' | LC_ALL=C bash -c 'IFS= read -rN1 a; printf %s "$a"' | od -tx1
0000000
NOK$ printf 'xx\377xx' | LC_ALL=C bash -c 'IFS= read -rN5 a; printf %s "$a"' | od -tc
0000000   x   x
0000002

The problem doesn't occur with -n instead.

-- Stephane
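The healthy case from the report can be rechecked on any build (the 0xFE byte below is the control; on the affected bash 4.2 builds, repeating this with `\377` produced empty output):

```shell
# Pass a single 0xFE byte through "read -N1" and dump what came out.
printf '\376' |
  LC_ALL=C bash -c 'IFS= read -rN1 a; printf %s "$a"' |
  od -An -tx1        # shows: fe
```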
Re: Following symlinks in globstar
2014-01-21 10:19:10 -0500, Chet Ramey:
[...]
>> I am not so worried about the method used to "fix" globstar -- whether
>> we keep backwards compatibility or not -- I am more concerned that we
>> have at least *some* way of disabling what many people view as
>> undesirable behaviour.
>
> It may or may not be undesirable -- we have no data other than your
> original message indicating that it is. There have been no proposals,
> no code contributions, and nothing other than anecdotes to demonstrate
> that "many people" view it as a problem.
>
> I interpret this lack of discussion as an indication of priority. I do
> not consider this a priority, and have not to this point, and I have only
> looked at it briefly.
[...]

I agree with Chris that the bash behaviour here is not desirable.
There's a number of articles on usenet or unix.stackexchange.com
(many by me, sorry:
http://unix.stackexchange.com/search?q=bash+globstar+symlink) that
say that the bash behaviour is broken in that regard and that
globstar from other shells should be used instead.

That symlink-following behaviour plagued GNU grep for years until it
was fixed in 2.12 (breaking backward compatibility), where -r now
does not follow symlinks anymore (but -R still does).

The **/ feature comes from zsh (1990), was first copied by ksh93 in
2003 (with differences), by fish in 2005, bash in 2009 and tcsh in
2010. bash's behaviour in many respects is closer to ksh's than
zsh's. fish's is the only other implementation that follows symlinks;
zsh's, ksh's and tcsh's don't. zsh and tcsh both support ***/* to
follow symlinks.

Utilities that recurse (chmod -R, chown -R, find, cp -r...) do not
follow symlinks (some have options to allow it). When they do, at
least they try to detect loops; bash doesn't (fish seems to have some
sort of provision for it?).

IMO, it's a dangerous misfeature, prone to introducing security
vulnerabilities.
I believe most people using globstar are unaware of the problem, and
changing the behaviour to not follow symlinks would be a lot more
likely to fix the scripts that use globstar than to break them.

In over 15 years of using zsh, I've been using ** daily but I can
probably count on the fingers of my hands the times I've used ***
(certainly a lot fewer than the times I've been bitten by grep -r
following symlinks).

Now granted, */* does follow symlinks in every shell. But there's no
recursion there, so a few of the issues don't apply. Where bash could
improve on zsh here is to provide a way to avoid following symlinks
in both **/* and */*/*.

Cheers,
Stephane
Re: let's establish BASH_MINIMUM_TIME_BETWEEN_INTERACTIVE_COMMAND
2014-01-30 12:45:58 +0200, Pierre Gaston:
> On Thu, Jan 30, 2014 at 12:37 PM, Dan Jacobson wrote:
>
>> Thanks fellows but now bash has become very slow to the touch that way.
>
> Maybe try something like: PROMPT_COMMAND='read -t0 && sleep 10'

Or

PROMPT_COMMAND="perl -MPOSIX -e 'tcflush 0,0'"

which should stop you from entering anything other than at the prompt
(it flushes the terminal input buffer before displaying the prompt).

-- Stephane
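The `read -t0` in the first suggestion works as a poll: with a timeout of zero, read reports success only when input is already pending, without waiting. A minimal sketch (the here-string supplies pre-buffered input):

```shell
# With -t 0, read returns 0 when data is already available on stdin.
if read -t 0 <<< 'queued input'; then
  echo "input pending"
fi
```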
Re: Following symlinks in globstar
2014-02-04 09:23:21 -0500, Chet Ramey: > On 1/25/14 6:11 PM, Stephane Chazelas wrote: > > 2014-01-21 10:19:10 -0500, Chet Ramey: > > [...] > >>> I am not so worried about the method used to "fix" globstar -- whether > >>> we keep backwards compatibility or not -- I am more concerned that we > >>> have at least *some* way of disabling what many people view as > >>> undesirable behaviour. > [...] > > I agree with Chris that the bash behaviour here is not > > desirable. There's a number of articles on usenet or > > unix.stackexchange.com (many by me sorry, > > http://unix.stackexchange.com/search?q=bash+globstar+symlink) > > that say that the bash behaviour is broken in that regard and > > that globstar from other shells should be used instead. > > I put in a tentative change; we will see how it works. [...] That looks a lot better indeed. Thanks Chet. regards, Stephane
regression: extglobs expand hidden files in 4.3
With bash-4.3 as found in Debian's 4.3-2 package:

$ bash -cO extglob 'echo *(.)'
. ..
$ bash -cO extglob 'echo *(foo)*'
. .. a
$ bash -cO extglob 'echo @(|foo)*'
. .. a

It looks like the regression was introduced by 4.3, as 4.2 doesn't
exhibit the same problem.

This one's OK:

$ bash -cO extglob 'echo ?(foo)*'
a

regards,
Stephane
Re: ctrl-c does not send INT to all processes under foreground job
2014-03-19 04:07:06 +, Ryan Ruan:
[...]
> It is said that "ctrl-C sends INT to ALL processes under foreground
> job", but i found a weird phenomenon.
> I have not read the source code yet, but it does not seem to conform
> to what the specification says.
>
> Test code is like:
>
> 1  trap "echo hello world $1" 2
> 2  sleep 1
> 3  /bin/bash $0 $(( $1 + 1 ))
> 4  echo "$1 go to sleep"
> 5  sleep 1000
> 6  echo "$1 exit"
>
> When I run ./test.sh on the console, the process (/bin/bash test.sh)
> is stuck at line 2.
> Then I input Ctrl-C; the result is that the code is interrupted at
> line 2 and goes to execute line 3, thus generating a new process
> (/bin/bash test.sh 1).
> At the same time, the first process (/bin/bash test.sh) is stuck at
> line 3 waiting for process '/bin/bash test.sh 1' to finish.
> At this time, I input Ctrl-C again, and generate process '/bin/bash
> test.sh 2', and i think process '/bin/bash test.sh 1' SHOULD BE SENT
> signal INT now.
> So I kill process '/bin/bash test.sh 2' by doing 'kill -9 '.
> What amazes me is that process '/bin/bash test.sh 1' did not trap INT
> this time, because "hello world 1" is not printed out. So it seems the
> process did not receive INT before.
> How can this be? Is it a bug?
[...]

I can't reproduce with any of bashs 3.2.48, 4.0, 4.1, 4.2, 4.3 on
Linux 3.12 amd64, which all exhibit the same behavior as any other
shell I tried.

What system are you on? Have you tried other shells? Do you possibly
have a SHELLOPTS, BASHOPTS or BASH_ENV in your environment?

What should happen is that on the first ^C:

zsh
└─bash ./a
  └─sleep 1

zsh being my interactive shell has put the process group running the
script in the foreground, so all processes in that group should
receive the SIGINT. bash blocks the SIGINT to be able to handle it
*after* sleep returns.
sleep dies because of the SIGINT, then bash receives and handles the
SIGINT so we see "hello world", and it goes on to run the next
command:

zsh
└─bash ./a
  └─bash ./a 1
    └─sleep 1

Now, three processes are in that process group. ^C is sent to all of
them but the bash processes block it. Upon termination of the sleep,
"hello world 1" is printed, but the first bash doesn't print its
"hello world" because the second bash has not returned. And we now
have:

zsh
└─bash ./a
  └─bash ./a 1
    └─bash ./a 2
      └─sleep 1

If you kill "bash ./a 2", then "bash ./a 1" runs its echo "go to
sleep 1" and, after finishing, we should see the first bash handling
its second SIGINT at last and display "hello world" again before
going to sleep.

-- Stephane.
Re: ctrl-c does not send INT to all processes under foreground job
2014-03-19 09:51:06 -0400, Chet Ramey: [...] > This is another instance of the question "what does bash do with a SIGINT > it receives while waiting for a foreground process?" The answer is that it > allows the child to decide what to do and responds accordingly: if the > child exits due to being killed by the SIGINT, bash acts as if it received > the SIGINT; if the child decides to trap it, do something else and not > resend the SIGINT to itself, bash will effectively discard it. In any > case, bash doesn't do anything until the child exits. > > There is a fairly long explanation of the issue at > http://www.cons.org/cracauer/sigint.html. [...] But that's about the bash -c 'sh -c "trap : INT; sleep 10"; echo seen; sleep 10' where bash and AT&T ksh (contrary to all other shells I tried) let you see "seen" above if you press Ctrl-C during the first sleep. But in the OP's case, we've got a "trap" in the outer shell. bash -c 'trap "echo ouch" INT; sh -c "trap : INT; sleep 10"; echo seen; sleep 10' Where you see "ouch" and "seen" in all shells (regardless of whether the inner sh traps the SIGINT or not). So it looks like a different issue.
minor: "read line" may read several lines if stdin is a tty
If you run (here testing on Linux):

bash -c 'read a; echo "<$a>"; tr b c'

and type "aaa" and "bbb" on two lines ahead of time (before read is
called), you see "<aaa>", but not "ccc". That's because "read" reads
up to 128 bytes of data in one read(2) invocation instead of reading
one byte at a time like on other types of non-seekable files.

Probably not a big deal as one is unlikely to type ahead like that.
On the other hand, when input is from the terminal, there's not much
point optimising, so you might as well read one byte at a time.

See also:

bash -c 'read a; echo "1: $a"; head -n 1; read b; echo "3: $b"'

If typing "a", "b" and "c" on three lines, you see:

1: a
c
3: b

Instead of:

1: a
b
3: c

It's probably a bigger concern though if reading from a serial device
(/dev/ttyS0) in raw mode for instance.

-- Stephane
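On a pipe, bash's read does do the byte-at-a-time dance precisely so it does not steal input meant for the next command, which is easy to observe:

```shell
# On a pipe (non-seekable, not a tty), read consumes exactly one line
# byte by byte, leaving the rest of the stream for the next reader.
printf 'one\ntwo\n' | {
  read -r first
  echo "read got: $first"   # read got: one
  cat                       # still sees: two
}
```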
[doc] confusion over $0 and positional parameters
In info -f bash -n 'Invoking Bash' we find:

> `-c'
>      Read and execute commands from the first non-option ARGUMENT after
>      processing the options, then exit. Any remaining arguments are
>      assigned to the positional parameters, starting with `$0'.

$0 is generally *not* considered a positional parameter. Both POSIX
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_01)
and "info -f bash -n 'Positional Parameters'" have:

> A positional parameter is a parameter denoted by the decimal
> value represented by one or more digits, other than the single
> digit 0

explicitly excluding $0. That seems to be causing some confusion:
https://unix.stackexchange.com/questions/152391

Maybe a better wording would be:

> `-c'
>      Read and execute commands from the first non-option ARGUMENT after
>      processing the options, then exit. The first argument
>      after that is assigned to $0, which is used in error
>      messages for instance (in the absence of arguments, $0
>      is assigned the argv[0] that the interpreter received
>      (usually bash or sh)), and the following arguments are
>      assigned to the positional parameters.

The note about $0 being used for error messages would also be useful,
as we see people using a place-holder like _ or -- there, causing
more confusion. See for instance:
http://mywiki.wooledge.org/BashFAQ/012

-- Stephane
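The distinction is easy to see from the command line (the names below are arbitrary):

```shell
# With -c, the first argument after the command string becomes $0
# (used in error messages), and the rest become $1, $2, ...
bash -c 'echo "\$0=$0 \$1=$1 \$#=$#"' myscript alpha beta
# prints: $0=myscript $1=alpha $#=2
```

Note that `$#` counts only the positional parameters, so it reports 2 here, not 3; that is exactly the sense in which $0 is not positional.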
Re: minor: "read line" may read several lines if stdin is a tty
2014-08-27 08:10:01 -0400, Chet Ramey:
> On 8/25/14, 4:24 PM, Stephane Chazelas wrote:
>> If you run (here testing on Linux):
>>
>> bash -c 'read a; echo "<$a>"; tr b c'
>>
>> [...] That's because "read" reads up
>> to 128 bytes of data in one read(2) invocation instead of
>> reading one byte at a time like on other types of non-seekable
>> files.
>
> While this is true, it's not complete. Bash relies on the terminal
> driver returning newline-delimited lines when reading from the
> terminal. If you try to fool it, you can probably succeed.
[...]

I agree it's clearly an attempt at fooling it, and it's reasonable to
expect a read on a terminal in canonical mode to return a single
line.

However, one could imagine using bash's read to get data off a tty
device not in canonical mode (a serial device used as just a serial
device), or any other non-terminal character device for that matter
(a "strace bash -c 'read < /dev/urandom'" reveals there's an issue
there as well), and that would be more of a problem.

Again, not a big one (hence the [minor] tag) but still a POSIX
non-conformance, and bash seems to be the only shell with the problem
(though ksh93 has its own bugs as usual).

-- Stephane
Re: minor: "read line" may read several lines if stdin is a tty
2014-08-27 15:35:06 -0400, Chet Ramey: > On 8/27/14, 3:20 PM, Stephane Chazelas wrote: > > > However, one could imagine using bash's read to get data off a > > tty device not in canonical mode (a serial device used as just a > > serial device), or any other non-terminal character device for > > that matter (a "strace bash -c 'read < /dev/urandom'" reveals > > there's an issue there as well), and that would be more of a > > problem. > > > > Again, not a big one (hence the [minor] tag) but still a > > non-POSIX conformance and bash seems to be the only shell with > > the problem (though ksh93 has its own bugs as usual). > > Well, again you have to rely on the system telling you what's happening. > Bash tests whether a file descriptor is seekable using, logically > enough, lseek. If it doesn't return -1/ESPIPE, bash assumes the fd is > seekable. If it does, bash does unbuffered reads. [...] OK, sorry, I hadn't realised /dev/random was seekable (though the seek is a no-op) on Linux. > (If isatty() returns true, bash does assume that read(2) will return > newline-delimited records.) [...] That assumption is only valid (to some extent, because of lnext) if the terminal is in icanon mode though. Cheers, Stephane
Re: make 'tr' (or something like it) a bash builtin ?
On Sun, Feb 05, 2006 at 07:02:10PM -0500, Chris F.A. Johnson wrote: > On Sun, 5 Feb 2006, Felipe Kellermann wrote: > > >On Thu, 2 Feb 2006 11:13pm -0500, Mike Frysinger wrote: > > > >>upper case or lower case ... if 'tr' was a bash builtin, then that'd work > >>with some `echo | tr` magic, but it something like > >>${foo//[[:lower:]]/[[:upper:]]} would be nice > >>-mike > > > >typeset -u (ksh, zsh) supported in bash would be nice... > > > > orchid% typeset -u fuu > >So long as it uses declare, not typeset :) > > > orchid% fuu=bar > > orchid% echo $fuu > > BAR > >I'd prefer variable expansion, perhaps the way Mike suggested, or, >e.g.: > > $ foo=bar > $ echo ${foo^} ## Convert first character > Bar > $ echo ${foo^^} ## Convert all characters > BAR > $ echo ${foo^[a-m]} ## Convert first character that matches pattern > Bar > $ echo ${foo^^[a-m]} ## Convert all characters that match pattern > BAr [...] Also note zsh parameter expansion flags that can be easily combined: $ foo=bar $ echo ${(U)foo} BAR $ echo ${(Ul:20::.:)foo} .BAR $ echo ${(Uqq)foo} 'BAR' -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
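These proposals were in fact adopted: bash 4 implements the `^`/`^^` (and `,`/`,,`) expansion operators as well as `declare -u`/`-l`. A quick check, run in a child bash (>= 4 assumed) so it works from any sh:

```shell
# ${foo^} uppercases the first char, ${foo^^} all chars,
# ${foo^^[a-m]} only the chars matching the pattern.
out=$(bash -c 'foo=bar; echo "${foo^} ${foo^^} ${foo^^[a-m]}"')
# declare -u converts on assignment, like ksh/zsh typeset -u.
auto=$(bash -c 'declare -u v; v=bar; echo "$v"')
echo "$out / $auto"
```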
Re: bash shell parser bug
On Thu, Mar 16, 2006 at 08:28:28PM -0800, laura fairhead wrote: > > > Hello, Hi Laura, > I just found a bug that affects a number of shells (pressumably the > code there is from the same roots) in the parser. > > The following code; > > l='eval "$l"' > eval "$l" > > Which sets off an infinite recursion on 'eval', should result in an > infinite loop to be terminated by INT (doesnt' work) or at least > end gracefully with an error "bash: out of memory". Instead the > system has to kill the shell process because of SEGV fault. [...] It's the same with f() { f; }; f actually. ~$ bash -c 'f() { f; }; f' zsh: segmentation fault bash -c 'f() { f; }; f' (139)~$ zsh -c 'f() { f; }; f' zsh: segmentation fault zsh -c 'f() { f; }; f' (139)~$ ksh -c 'f() { f; }; f' zsh: segmentation fault ksh -c 'f() { f; }; f' (139)~$ ash -c 'f() { f; }; f' zsh: segmentation fault ash -c 'f() { f; }; f' (139)~$ gawk 'function f() { f() } BEGIN { f() }' zsh: segmentation fault awk 'function f() { f() } BEGIN { f() }' (139)~$ perl -le 'sub f { f(); }; f' zsh: killed perl -le 'sub f { f(); }; f' (137)~$ For perl, it was probably the OOM killer, my Xserver was killed as well. TCL: ~$ expect -c 'proc f {} { f }; f' 2>&1 | tail "f " (procedure "f" line 1) invoked from within "f " (procedure "f" line 1) invoked from within "f " (procedure "f" line 1) invoked from within "f" The recursion limit seems to be 1000, there. Same for: expect -c 'set f {eval $f}; eval $f' Same thing in python (limit 1000 as well) and ruby (limit seems higher). ~$ perl -le '$f = "eval \$f"; eval $f' is still running but is slowly grabbing all the memory. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
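Later bash versions (4.2 and up) added a guard for exactly this: setting FUNCNEST bounds function-call nesting, so runaway recursion fails with a reported error instead of exhausting the stack. A sketch, run in a child bash so the parent survives either way:

```shell
# FUNCNEST limits function nesting depth; exceeding it aborts the
# call chain with "maximum function nesting level exceeded".
FUNCNEST=50 bash -c 'f() { f; }; f' 2>/dev/null
status=$?
echo "exit status: $status"   # non-zero either way (error, or SEGV on old bash)
```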
Re: check_dev_tty - What does it do?
On Fri, Jun 23, 2006 at 02:54:04PM +1200, John Carter wrote: > > Here is the bash-3.1 code from general.c for check_dev_tty. > > It is invoked shortly after main... > > void > check_dev_tty () > { > int tty_fd; > char *tty; > > tty_fd = open ("/dev/tty", O_RDWR|O_NONBLOCK); > > if (tty_fd < 0) > { > tty = (char *)ttyname (fileno (stdin)); > if (tty == 0) > return; > tty_fd = open (tty, O_RDWR|O_NONBLOCK); > } > close (tty_fd); > } > > It seems to open and close /dev/tty and return nothing. > > This seems very strange to me. > > Why is it doing this? [...] A rough guess: It may be that, if bash is the session leader and no session is attached to the terminal open on stdin, the open would attach the terminal to the session. Bash would become the controlling process. I guess it might happen when you boot Linux with init=/bin/bash for instance. It may not be the reason for it, though. And I'm not sure it is right for bash to do so. -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: PATH and $0
On Tue, Jul 11, 2006 at 08:19:34PM -0400, Dave Rutherford wrote: > On 7/11/06, Cai Qian <[EMAIL PROTECTED]> wrote: > >I want its full pathname using 'dirname', but it will give me > >unexpected result on some Linux or Bash versions. > > Well, 'dirname' certainly won't do what you want, but I'm sorry, > I can't think of a way to get what you need. (It would be relatively > easy in 'c'.) Even /proc/self/* doesn't contain the script's full > pathname. Perhaps somebody else knows a better way. [...] $0 will always contain the file path, unless the script was started as: bash script.sh And there's no script.sh in the current directory (in which case sh/bash will have looked up script.sh in $PATH). So: #! /bin/sh - dir=$( cmd=$0 [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit dir=$(dirname -- "$cmd") cd -P -- "$dir" && pwd -P ) || exit # untested should give you the absolute path of the directory portion of the script path (unless that directory ends in newline characters). -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: PATH and $0
On Thu, Jul 13, 2006 at 11:53:00PM -0600, Bob Proulx wrote: > Stephane Chazelas wrote: > > $0 will always contain the file path, unless the script was > > started as: > > > > bash script.sh > > > > And there's no script.sh in the current directory (in which case > > sh/bash will have looked up script.sh in $PATH). > > Of course using command as you have done will work for bash. "command" is a POSIX command, so it will work with every POSIX and therefore every Unix conformant sh. Note that on some systems, the Unix conformant sh is not in /bin; you may need to adapt the she-bang line (note that the she-bang line is a non-standard Unix feature). > But I > always feel better about using portable shell as much as possible. > here is one way. > > pathfind() { > OLDIFS="$IFS" > IFS=: > for p in $PATH; do > if [ -x "$p/$*" ]; then > IFS="$OLDIFS" > echo "$p/$*" > fi > done > IFS="$OLDIFS" > } On the contrary, the code above has many problems: 1- If IFS was unset before, it becomes set to the empty string afterwards, which has a different meaning. 2- word splitting when IFS contains non-blank characters varies from one shell to the next, and in most implementations (including bash and AT&T ksh), it differs from the way the shell splits $PATH internally to look up a command ("/bin:" is "/bin" and "" as far as PATH look up is concerned, while bash word splitting splits "/bin:" into only "/bin"). 3- you forgot to disable filename generation, which means wildcards will be expanded in "for p in $PATH" 4- there's a "break" (or return 0) missing, and a "return 1" missing if there's no match. 5- echo is not a portable command, use printf instead as POSIX suggests. > > > #! /bin/sh - > > dir=$( > > cmd=$0 > > [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit > > dir=$(dirname -- "$cmd") > > cd -P -- "$dir" && pwd -P > > ) || exit > > # untested > > > > should give you the absolute path of the directory portion of > > the script path (unless that directory ends in newline > > characters). 
> > One thing to note about this script is that it canonicalizes path with > respect to symlinks. You do say that but without much > fanfare. That was intended. You can probably find as many caveats with one approach or the other. Note that if the binary file is a symlink itself, it doesn't canonicalise it. [...] > I think it is best to accept whatever path the user has provided > verbatim. I would not try to canonicalize the path in any way. If > the user has provided a path with /foo/bar/../../zoo/zam for example > then I would just go with it because that path may actually be > necessary just like that for some reason as yet unknown at this moment > to us writing the code to be used in the future. The reason I think > this way is that I have tried to do things like canonicalizing > previously myself and gotten burned by it. I have learned it is > better to avoid doing such things. Then: dir=$( cmd=$0 [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit dir=$(dirname -- "$cmd") case $dir in /*) ;; *) dir=$(pwd -P)/$dir || exit;; esac printf '%s\n' "$dir" ) || exit -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
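Going back to the pathfind critique earlier in this thread: a sketch of a PATH lookup that sidesteps the IFS, globbing, empty-component and early-return pitfalls entirely, using only prefix-stripping expansions (`pathfind` is just an illustrative helper name):

```shell
pathfind() {
    _cmd=$1
    _rest=$PATH:          # trailing ":" so the loop sees every component
    while [ -n "$_rest" ]; do
        _p=${_rest%%:*}
        _rest=${_rest#*:}
        # an empty PATH component means the current directory
        if [ -x "${_p:-.}/$_cmd" ]; then
            printf '%s\n' "${_p:-.}/$_cmd"
            return 0
        fi
    done
    return 1
}

# demo: put a throwaway executable on PATH and look it up
d=$(mktemp -d)
printf '#!/bin/sh\n' > "$d/mycmd"
chmod +x "$d/mycmd"
PATH=$d:$PATH
found=$(pathfind mycmd)
printf '%s\n' "$found"
```

No IFS fiddling and no word splitting means nothing to restore and nothing to glob; empty components are handled the way PATH lookup actually treats them.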
Re: PATH and $0
On Sat, Aug 12, 2006 at 01:51:12AM -0600, Bob Proulx wrote: > Stephane Chazelas wrote: > > [ -e "$cmd" ] || cmd=$(command -v -- "$cmd") || exit > > Bob Proulx wrote: > > > Of course using command as you have done will work for bash. > > > > "command" is a POSIX command. So will work with every POSIX and > > therefore every Unix conformant sh. > > That is not strictly true. While 'command' is POSIX 'command -v' > is actually an optional extension. A system only claiming POSIX > conformance need not implement 'command -v'. See the online standards > documents for more information. > > http://www.opengroup.org/onlinepubs/009695399/utilities/command.html You're right. command -v is not required by POSIX, but required by Unix (XSI). A lot of shells (like bash) are only POSIX conformant. > And in fact the FreeBSD /bin/sh (among others) implements 'command' > but not 'command -v'. So in practice it is not as portable as I would > like it to be. According to the man page, it does at least in 6.1-RELEASE. See: http://www.freebsd.org/cgi/man.cgi?query=sh&sektion=1&apropos=0&manpath=FreeBSD+6.1-RELEASE [...] > > 1- If IFS was unset before, it becomes set to the empty string > > after which has a different meaning. > > Your posted script did not unset IFS. Therefore that is not a concern > there. I would not prefer 'command -v' over this because someone > might unset IFS. There are lots of worse things that are possible. I was speaking of the general case: oIFS=$IFS ... IFS=$oIFS doesn't restore IFS in every case. > > 2- word splitting when IFS contains non-blank characters varies > > from one shell to the next, and in most implementations > > (including bash and AT&T ksh, it differs from the way the shell > > splits $PATH internally to look up a command ("/bin:" is "/bin" > > and "" as far as PATH look up is concerned > > Negative. In AT&T ksh (and others) "/bin:" is "/bin" and "." for PATH > lookup. No, it's "/bin" and "". 
the shell tries to exec "/bin/foo" then "foo", not "/bin/foo" and "./foo". But anyway, what I meant was "/bin:" means /bin and the current directory for PATH lookup, but not for bash word splitting. > > while bash word splitting splits "/bin:" into only "/bin"). > > Negative. In bash "/bin:" is "/bin" and "." for PATH lookup. I was speaking of "bash word splitting", not PATH lookup; that was my whole point: you can't use word splitting that way because it doesn't match the way the shell does the PATH lookup. > > > 3- you forgot to disable filename generation, which means > > wildcards will be expanded in "for p in $PATH" > > Negative. Wildcards will not be expanded there. Yes, they will: Try: foo='a:*:b' IFS=: for i in $foo; do echo "$i"; done regards, Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
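The globbing pitfall is easy to reproduce, and `set -f` shows the fix the original loop was missing (a minimal sketch):

```shell
foo='a:*:b'
IFS=:
set -f            # disable filename generation, which the original loop forgot
result=
for i in $foo; do
    result="$result<$i>"
done
set +f
unset IFS
echo "$result"    # the * survives unexpanded
```

Without `set -f`, the `*` field would be replaced by the names of whatever files happen to be in the current directory.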
Re: ...Limitation?
On Tue, Sep 26, 2006 at 11:45:42AM -0400, Paul Jarc wrote: > mwoehlke <[EMAIL PROTECTED]> wrote: > > I am trying to figure out how to run a command and pipe the output > > through tee, and then check the status of the original command. > > This uses a bash-specific feature: > cmd > >(tee file); status=$? This is a ksh feature that can also be found in zsh. But it's true it's not standard. I may be wrong but I don't think bash has any significant feature of its own. What is bash specific is the $PIPESTATUS array. zsh also has it but it's called $pipestatus there as zsh arrays are traditionally lowercase (so that they are not confused with scalar variables (all bash variables are both array and scalar at the same time as in ksh)). With cmd | tee file, cmd's exit status can be found in ${PIPESTATUS[0]}. > This should work on any sh: > exec 3>&1 && status=`exec 4>&1 && { cmd; echo $? >&4; } | tee file >&3` You may want to write it: exec 3>&1 && status=`exec 4>&1 && { cmd 4>&-; echo $? >&4; } | tee file >&3` because otherwise, if cmd spawns a process, and that process doesn't close its fd 4, you'd have to wait for it to finish (you may have the same problem if it doesn't close its fd 1 because of the pipe to tee, though). > Or, if you don't want to clobber any descriptors, in case they might > be in use for something else: [...] Then do it in a subshell (unless cmd makes use of those file descriptors) or do it this way instead of using exec (bash will take care of restoring the fds): { status=$( { { cmd 4>&-; echo "$?" >&4; } 3>&- | tee file >&3 3>&-; } 4>&1 ); } 3>&1 This should work in any Unix or POSIX conformant sh, not in the Bourne shell. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
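A minimal check of the $PIPESTATUS approach, run in a child bash since the array is bash-specific:

```shell
# ${PIPESTATUS[0]} holds the exit status of the left-hand side
# of the last pipeline, here 3, even though tee itself succeeded.
status=$(bash -c '(exit 3) | tee /dev/null >/dev/null; echo "${PIPESTATUS[0]}"')
echo "left-hand exit status: $status"
```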
Re: Fatal bug with redirection
On Wed, Oct 04, 2006 at 03:38:12PM -0400, Yang wrote: > On my FC5 box, when I try certain things in bash 3.1.7(1)-release > involving redirection, such as the following, bash crashes > (immediately exits): > > { echo a >&3 ; } 3> >( echo b ) > > I've only tried this from an interactive shell. Unfortunately I don't > have bashbug or bashversion, as I'm using the FC5 RPM version, but > here is some info: [...] All the shells I could try where echo is a builtin behave the same in that case (they die). Above, echo is writing to a pipe whose other end is closed (echo b has terminated already by the time "echo a" starts to write on its standard output), so echo will get a SIGPIPE signal. The problem is that, echo being a builtin, it's the shell that is doing the write to the dead pipe, and so it's the shell that is getting the SIGPIPE. SIGPIPE is a fatal signal. However, it's true that a builtin echo is not meant to behave differently from an external echo command. So bash (and other shells) should try and work around that by only causing the echo command to terminate with a 141 exit status. The problem is not only for echo. test \( ! 2> >(:) read % 2> >(:) -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: Curly braces expansion not always works as expected.
On Thu, Oct 05, 2006 at 12:23:41AM +0200, Reuti wrote: [...] > Curly braces expansion only works with at least one comma > (,). This also works, if the appears at least once > inside some apostrophes and has no function. > > Repeat-By: > [EMAIL PROTECTED]:~> bash --version > GNU bash, version 2.05b.0(1)-release (powerpc-apple-darwin8.0) > Copyright (C) 2002 Free Software Foundation, Inc. > [EMAIL PROTECTED]:~> echo {"x x"} # No {} in output expected. > {x x} > [EMAIL PROTECTED]:~> echo {"xx"} # No {} in output expected. > {xx} > [EMAIL PROTECTED]:~> echo {"xx","yy"} # Works fine. > xx yy > [EMAIL PROTECTED]:~> echo {"x,x"} # Add a comma to get it > working, i.e. remove the {}. > x,x > [EMAIL PROTECTED]:~> > > Also this seems to be strange: > > [EMAIL PROTECTED]:~> echo {"klklkl"} {1,2,3} # Only second {} > works correctly. > {klklkl} 1 2 3 > [EMAIL PROTECTED]:~> echo {"klklkl"}{1,2,3} # Completely messed up. > {klklkl}{1,2,3} Only that one is a bug, it should output {klklkl}1 {klklkl}2 {klklkl}3 The other ones work as documented. From info -f bash -n 'Brace Expansion' | A correctly-formed brace expansion must contain unquoted | opening and closing braces, and at least one unquoted comma | or a valid sequence expression. Any incorrectly formed brace | expansion is left unchanged. > [EMAIL PROTECTED]:~> echo {"kl,klkl"}{1,2,3} # Add a comma to > get it working > kl,klkl1 kl,klkl2 kl,klkl3 > [EMAIL PROTECTED]:~> [...] -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
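The three cases of the documented rule, checked in a child bash (a current bash is assumed, so the `{1..3}` sequence form is available):

```shell
a=$(bash -c 'echo {xx}')      # no unquoted comma, no sequence: left unchanged
b=$(bash -c 'echo {xx,yy}')   # unquoted comma: expands
c=$(bash -c 'echo {1..3}')    # valid sequence expression: also expands
echo "$a / $b / $c"
```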
Re: case statement breaks $( ) substitution
On Fri, Oct 06, 2006 at 11:55:27AM +0200, [EMAIL PROTECTED] wrote: [...] > Repeat-By: > > $ echo $( > > case a in > > 1) :;; > bash: syntax error near unexpected token `;;' > > The closing `)' is misinterpreted as being the end of the `$(' > substitution, instead of being recognized as element of the > case block which it is part of. [...] It's a known limitation also in zsh and some versions of ksh. Use $( case a in (1) ;; esac ) instead. -- Stephane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
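The workaround is easy to verify; the leading `(` before each pattern is valid POSIX sh syntax and keeps the parentheses balanced inside `$(...)`:

```shell
# With balanced (pattern) parens, the command-substitution parser
# no longer mistakes the closing paren of the case branch for the
# end of the $( ... ) substitution.
x=$(
  case a in
    (a) echo matched;;
    (*) echo no;;
  esac
)
echo "$x"
```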
Re: How to detect bash?
On Tue, Oct 10, 2006 at 05:12:07PM -0500, mwoehlke wrote: > Dave Rutherford wrote: > >On 10/10/06, mwoehlke <[EMAIL PROTECTED]> wrote: > >>Anyone have any clever, VERY reliable tricks for detecting if the > >>current shell is bash? > > > >Well, I don't know if it's clever, but how about: > > Oh, my... Where do I *start*? > > >$ if [ "${SHELL//*/bash}" = "bash" ]; then echo y; fi > > $ echo $SHELL > /bin/sh > $ echo $BASH > /bin/bash > $ foo > bash: foo: command not found > > There is *ABSOLUTELY* no guarantee that $SHELL correctly points to bash, > or that $SHELL is even remotely correct for that matter. This is /worse/ > than relying on $BASH. > > But it does bring up an interesting possibility: > [ "`/dev/null 2>&1`" = "bash: /dev/null: Permission denied" ] [...] Your quest looks a bit pointless to me. What prevents the user from editing a copy of your script to remove the check anyway? $ zsh -c 'echo "`/dev/null 2>&1`"' bash bash: /dev/null: Permission denied $ zsh $ ARGV0=bash ash -c 'echo "`/dev/null 2>&1`"; echo $BASH' bash: /dev/null: Permission denied $ echo '/dev/null(){echo "bash: /dev/null: Permission denied"}' \ > >> ~/.zshenv $ zsh -c 'echo "`/dev/null 2>&1`"' bash: /dev/null: Permission denied And whatever check you do can be worked around in one way or another. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: How to detect bash?
On Tue, Oct 10, 2006 at 06:14:22PM -0500, mwoehlke wrote: [...] > >$ zsh -c 'echo "`/dev/null 2>&1`"' bash > >bash: /dev/null: Permission denied > > > >$ zsh > >$ ARGV0=bash ash -c 'echo "`/dev/null 2>&1`"; echo $BASH' > >bash: /dev/null: Permission denied > > Eh? I get: > > $ zsh -c 'echo "`/dev/null 2>&1`"' bash > zsh:1: permission denied: /dev/null Well, I do get what I said with zsh 4.3.2 > $ ARGV0=bash ash -c 'echo "`/dev/null 2>&1`"; echo $BASH' > /dev/null: permission denied (note that this command must be run from zsh, which uses ARGV0=... to set ash's argv[0]). You must have an older /ash/ than I have; mine is one of the OpenBSD-based sh ones (found on Debian). > > So neither of your counter-examples is working for me (although both > look like they *should*; go figure). Though I'd bet the third one with .zshenv worked. > But since you didn't counter > BASH_SUBSHELL (and since I'm too lazy to change it now) I guess I'll > stick with that. :-) BASH_SUBSHELL is a relatively recent addition to bash. Most systems still don't have a bash 3. $ bash -c 'echo "$BASH_VERSION, <$BASH_SUBSHELL>"' 2.05b.0(2)-release, <> [...] > True, but the main point of the exercise is to go with a check that's > unlikely to be worked around "by accident". If someone intentionally > circumvents the check (and you're right, editing the script would be > easy), well then they deserve whatever happens. But I *am* paranoid > enough to not trust that $BASH is never set - other than by bash - for > some reason. Or that it hasn't been *unset* (since that seems to kill it > forever), because we have 'clean environment' scripts that would do this > sort of thing. [...] Then I'd go with $BASH_VERSION, as someone else suggested, which is set in every bash version. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
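The $BASH_VERSION check suggested above is a one-liner (with the same caveat: a determined user can still unset it):

```shell
# $BASH_VERSION is set by every bash version; other shells leave it unset.
if [ -n "$BASH_VERSION" ]; then
    running=bash
else
    running=other
fi
echo "running under: $running"
```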
Re: ${//}: say how to non-destructively prefix/suffix
On Mon, Nov 06, 2006 at 02:54:25AM +0800, Dan Jacobson wrote: > Regarding the >${parameter/pattern/string} >${parameter//pattern/string} > discussion on the man page: > You don't mention what if pattern is null. > Seems that will never match. > Anyway, no way to > $ set aa bb cc > $ echo [EMAIL PROTECTED] > so that one gets all parameters to have a string appended, > e.g. aa.t bb.t cc.t > One must sacrifice at least one character in "pattern". No way to > destroy less than one character. [...] echo "${@/%/.t}" -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
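Checking the suggestion in a child bash (the `${@/pattern/string}` form is bash-specific):

```shell
# An empty pattern anchored at the end (%) matches the empty string
# at the end of each positional parameter, so the replacement is
# effectively appended to each one.
out=$(bash -c 'set -- aa bb cc; echo "${@/%/.t}"')
echo "$out"
```

The `#` anchor works symmetrically for prefixing: `"${@/#/x.}"` would yield `x.aa x.bb x.cc`.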
Re: strange expansion of <(cat p.main.{optional,extra})
On Tue, Apr 17, 2007 at 08:16:53PM -0400, Chet Ramey wrote: [...] > > The expansion of echo <(cat /etc/{passwd,motd}) is rather surprising: > > $ echo <(cat p.main.{optional,extra}) > > ++ cat p.main.optional > > cat: p.main.optional: No such file or directory > > ++ cat p.main.extra > > cat: p.main.extra: No such file or directory > > + echo /dev/fd/63 /dev/fd/62 > > /dev/fd/63 /dev/fd/62 > > > > I would have expected to be expanded to <(cat /etc/passwd /etc/motd) first. > > That's not how it works. Brace expansion is the first expansion performed. > The manual page says: > > "Brace expansion is performed before any other expansions, and any char- > acters special to other expansions are preserved in the result. It is > strictly textual. Bash does not apply any syntactic interpretation to > the context of the expansion or the text between the braces." [...] How does it get expanded to echo <(cat p.main.optional) <(cat p.main.extra) though? If it were textual, I would expect: echo <(cat p.main.optional) p.main.extra) And what about: echo $(cat p.main.{optional,extra}) This resolves to echo $(cat p.main.optional p.main.extra) I can't see any reason why $(...) and <(...) should be treated differently. bash-3.1$ echo <(echo {a,b}) ++ echo a ++ echo b + echo /dev/fd/63 /dev/fd/62 /dev/fd/63 /dev/fd/62 bash-3.1$ echo $(echo {a,b}) ++ echo a b + echo a b a b All the other shells treat them the same in that regard. Any expansion inside command or process substitution should be deferred to the subshell, I think. % echo <(echo {a,b}) +zsh:1> echo a b +zsh:1> echo /proc/self/fd/11 /proc/self/fd/11 % echo $(echo {a,b}) +zsh:1> echo a b +zsh:2> echo a b a b -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: Bash is broken
On Sun, May 13, 2007 at 12:31:48AM -0700, Overdorf, Sam wrote: > The following does not work on my computers: [...] > If [ "1" > "2" ] [...] > Bash thinks this is I/O redirection and creates a file by the name of > "2" in my directory. > > This should be a string compare. [...] Hi. No, this shouldn't. "[" is a command just like any other. And on a command line, the redirection operators can be put anywhere. echo 1 > 2 ] is the same as > 2 echo 1 ] or echo 1 ] > 2 If you want, you can use the [[ ... ]] syntax instead. [[ is not a command; [[ ... ]] as a whole forms a single command, and what's inside is treated as a test expression, not as arguments and redirections to a command. [[ 1 > 2 ]] If you want to use "[" (aka test), you need to make sure ">" is passed as an argument to "[". [ 1 '>' 2 ] or [ 1 \> 2 ] [[ ... ]] is not a standard shell feature, but the > test operator is not either. Portably (as in Unix shell scripts), you'd use expr: if expr "x$1" \> "x$2" > /dev/null or awk. if awk 'BEGIN{exit !(ARGV[1]"" > ARGV[2]"")}' "$1" "$2" Finally, it should be noted that string comparison behavior depends on the current locale. So if you don't want your script behavior to depend on who is running it, you may want to fix the locale: LC_ALL=C expr ... -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
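Both behaviours described above can be demonstrated side by side (a sketch; the child bash runs in a throwaway directory so the stray file "2" is harmless):

```shell
cd "$(mktemp -d)" || exit
# Unescaped: ">" is a redirection, so this runs [ 1 ] (true)
# and creates a file named "2" as a side effect.
bash -c '[ 1 > 2 ]'
[ -e 2 ] && made=yes
# Escaped: ">" reaches test as an argument; string "1" > "2" is false.
bash -c '[ 1 \> 2 ]' && cmp=yes || cmp=no
echo "$made $cmp"
```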
Re: Bash is broken
On Sun, May 13, 2007 at 08:19:13AM -0600, Eric Blake wrote: [...] > if [[ 1 > 2 ]] > > > This should be a string compare. > > Actually, when quoted properly, it should be a numeric comparison, not a > string comparison. [...] It *is* a string comparison: $ bash -c '[[ 02 > 1 ]]' || echo "02 is not greater than 1" 02 is not greater than 1 [[ a > b ]], [ a '>' b ] are string comparisons. [ a -gt b ], [[ a -gt b ]], (( a > b )) are numeric comparisons. expr's > can be either, depending on whether the operands are recognised as integers or not. In awk, it depends on the type of the arguments. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
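The "02" example makes the distinction concrete, run in a child bash since [[ ... ]] is not standard sh:

```shell
# String comparison: "02" sorts before "1" ('0' < '1'), so > is false.
str=$(bash -c '[[ 02 > 1 ]] && echo true || echo false')
# Numeric comparison: 2 > 1, so -gt is true.
num=$(bash -c '[[ 02 -gt 1 ]] && echo true || echo false')
echo "$str $num"
```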
Re: wrong logical evaluation of expressions involving true or false commands
On Mon, Jun 25, 2007 at 02:07:56PM -0600, Bob Proulx wrote: [...] > following. These are the same. > > [ foo -a bar ] > [ -n foo -a -n bar ] > > This is defined by standard by the number of arguments. > > http://www.opengroup.org/onlinepubs/009695399/utilities/test.html [...] And it should be noted (and it's noted as well in the page you're referring to) that while the above is true for strings such as "foo" and "bar", it is not true in the general case. [ -n "$foo" -a -n "$bar" ] is not the expression to test whether both "$foo" and "$bar" are non-empty, as it would fail for some specific values of $foo or $bar (try it when $foo contains "=" for instance). That's why it's useful to remember those rules: - don't call "[" with more than 3 arguments (beside the "[" and "]" ones). That is, use [ -f "$f" ] && ! [ -h "$f" ] instead of [ -f "$f" -a ! -h "$f" ] for instance. That at least would keep you out of trouble with POSIX conformant shells. For shells that are still not POSIX conformant (such as the Bourne shell, the BSD shs (ash) or some old kshs): - prefer [ -n "$string" ] over [ "$string" ] - [ foo = "$var" ] over [ "$var" = foo ] - [ "x$a" = "x$b" ] over [ "$a" = "$b" ] - [ whatever != whatever ] over [ ! whatever = whatever ] Note that the [[ ... ]] and (( ... )) constructs don't have such issues, but those are not standard constructs. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
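The safe two-invocation form is worth seeing with exactly the kind of values that break the -a version (how a given [ implementation mis-parses the 5-argument form is unspecified, so only the safe form is demonstrated):

```shell
# Values that look like test operators to a naive parser.
foo='=' bar='('
# Two invocations, each with at most one operator: no ambiguity.
if [ -n "$foo" ] && [ -n "$bar" ]; then
    both=yes
else
    both=no
fi
echo "$both"
```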
Re: logical XOR
On Fri, Jun 29, 2007 at 10:21:01AM +0200, Robert Millan [ackstorm] wrote: [...] > > > There's bitwise AND and bitwise OR, and logical AND and logical OR, but > > > for XOR there's only the bitwise version. Would be nice if the logical > > > XOR would also be present (^^). > > > > (!a) != (!b) > > > > should work acceptably for the situations such an operator would be used. > > I'm not sure what you mean. The following: > > (true && ! false) || (false && ! true) > echo $? > > works, whereas the following: > > (! true) != (! false) > echo $? > > shows syntax error. > > Is there any way to avoid redundancy in the shell script that needs this? The > code in question is already quite complex, so adding this redundancy makes it > less readable. [...] Do you mean that for instance, you'd like a third command to be executed if either but not both of two commands succeed? cmd1; a=$? cmd2; b=$? (( !a != !b )) && cmd3 should do. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
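The suggested idiom, checked in a child bash since (( ... )) is not standard sh:

```shell
# (( !a != !b )) is true iff exactly one of the two statuses is zero,
# i.e. exactly one of the two commands succeeded.
xor=$(bash -c 'true; a=$?; false; b=$?
               if (( !a != !b )); then echo yes; else echo no; fi')
echo "$xor"
```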
Re: logical XOR
On Fri, Jun 29, 2007 at 12:08:22PM +0200, Robert Millan [ackstorm] wrote: > On Fri, Jun 29, 2007 at 10:16:18AM +0100, Stephane Chazelas wrote: > > > > Do you mean that for instance, you'd like a third command to be > > executed if either but not both of two commands succeed? > > > > cmd1; a=$? > > cmd2; b=$? > > (( !a != !b )) && cmd3 > > > > should do. > > Thank you, that works. Although I think ^^ would be more intuitive. Let me > know if you want a patch for that. [...] You could do something like: xor() { previous_status=$? "$@" return "$((!$previous_status == !$?))" } cmd1; xor cmd2 && cmd3 -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
Re: printf '\x' exit status is 0
On Tue, Jul 10, 2007 at 09:35:56PM -0400, Chet Ramey wrote: > Poor Yorick wrote: > > (on my debian system, bashbug reports for bash 2.05) > > > > printf status is 0 but /usr/bin/printf is 1. Is this the expected behavior? > > > > [EMAIL PROTECTED]:~$ bash3 > > [EMAIL PROTECTED]:~$ bash3 --version > > GNU bash, version 3.00.16(1)-release (i386-pc-linux-gnu) > > Copyright (C) 2004 Free Software Foundation, Inc. > > [EMAIL PROTECTED]:~$ printf '\x' > > bash3: printf: missing hex digit for \x > > [EMAIL PROTECTED]:~$ echo $? > > 0 > > Bash treats unrecognized backslash escapes as characters to be printed > and does not consider them an error. \x without any hex digits doesn't > have any special meaning, though, since it's probably not what's > intended, bash prints a warning. [...] Note that both AT&T ksh and zsh output a NUL byte and return success there. pdksh derivatives (pdksh, mksh, posh) output an error message and return 1. ash derivatives output "\x" and return success. The behavior of printf upon \x is /unspecified/ as per POSIX, so all the behaviors observed are valid. A portable application shouldn't use printf '\x'. The octal representation (\o, \oo or \ooo) is standard. -- Stéphane ___ Bug-bash mailing list Bug-bash@gnu.org http://lists.gnu.org/mailman/listinfo/bug-bash
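The portable octal form mentioned at the end behaves the same across printf implementations, unlike \x:

```shell
# Octal escapes in the printf format string are standard:
# \101 \102 \103 are A, B, C.
a=$(printf '\101\102\103')
echo "$a"   # prints ABC
```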