Re: Function definition syntax inconsistencies

2008-06-05 Thread Stephane Chazelas
On Thu, Jun 05, 2008 at 06:31:36AM +0200, Jan Schampera wrote:
[...]
> The function definition is allowed in 3 forms:
> 1. NAME() ...

That's the Bourne and POSIX syntax

> 3. function NAME ...

That's the ksh syntax. It's probably only there for
compatibility with ksh (note that bash doesn't make the same
distinctions as AT&T ksh does when using that syntax). ksh
doesn't support function foo (echo), so it's fine with bash not
to either, I'd say.

> 2. function NAME() ...
[...]

Not sure where that syntax comes from. It could be that it works
by accident or it could be that someone thought that it was a
good idea to have yet another syntax. It happens to be the same
syntax as awk.

In any case, given that those syntaxes are not functionally
different (contrary to ksh), I think it would be a good
idea to have the manual say that only "1." should be used as
it's the standard one.
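
For reference, here is a minimal sketch of the three spellings side by side, numbered as in the quoted list. The function names f1, f2 and f3 and the variable forms_out are made up for illustration, and the snippet assumes a bash binary on PATH:

```shell
# Define one function per spelling and call each one in turn.
# f1, f2, f3 and forms_out are illustrative names only.
forms_out=$(bash -c '
f1() { echo "form 1"; }            # 1. Bourne/POSIX syntax
function f2() { echo "form 2"; }   # 2. mixed syntax
function f3 { echo "form 3"; }     # 3. ksh syntax
f1; f2; f3
')
printf '%s\n' "$forms_out"
```

In bash all three should define ordinary functions that behave the same way when called.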

I suspect this:
$ sh -c 'function() { echo a; }; function'
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `function() { echo a; }; function'

breaks POSIX conformance, BTW (but then it's the same for AT&T
ksh, pdksh, zsh).

-- 
Stéphane




Re: Function definition syntax inconsistencies

2008-06-05 Thread Andreas Schwab
Stephane Chazelas <[EMAIL PROTECTED]> writes:

> I suspect this:
> $ sh -c 'function() { echo a; }; function'
> sh: -c: line 0: syntax error near unexpected token `('
> sh: -c: line 0: `function() { echo a; }; function'
>
> breaks POSIX conformance, BTW (but then it's the same for AT&T
> ksh, pdksh, zsh).

POSIX allows function to be treated as a reserved word, even if it does
not attach any meaning to it.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Re: Function definition syntax inconsistencies

2008-06-05 Thread Stephane Chazelas
On Thu, Jun 05, 2008 at 10:07:51AM +0200, Andreas Schwab wrote:
> Stephane Chazelas <[EMAIL PROTECTED]> writes:
> 
> > I suspect this:
> > $ sh -c 'function() { echo a; }; function'
> > sh: -c: line 0: syntax error near unexpected token `('
> > sh: -c: line 0: `function() { echo a; }; function'
> >
> > breaks POSIX conformance, BTW (but then it's the same for AT&T
> > ksh, pdksh, zsh).
> 
> POSIX allows function to be treated as a reserved word, even if it does
> not attach any meaning to it.
[...]

You're right. Shame on me, I had said the exact same thing (that
POSIX allowed "function" to be a keyword) less than 1 week ago
in a bug report to posh
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483860#20).

Looks like my skull is getting more and more porous...

Sorry about that.
Stéphane




[doc] read -t and sockets, devices...

2008-06-05 Thread Stephane Chazelas
Configuration Information [Automatically generated, do not change]:
Machine: i486
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i486' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i486-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include -I../bash/lib   -g -O2 -Wall
uname output: Linux sc.homeunix.net 2.6.25-rc8 #1 PREEMPT Fri Apr 4 08:56:07 
BST 2008 i686 GNU/Linux
Machine Type: i486-pc-linux-gnu

Bash Version: 3.2
Patch Level: 39
Release Status: release

Description:
Hiya,

the doc says:

-t timeout Cause read to time out and return failure if
a complete line of input is not read within timeout
seconds. This option has no effect if read is not
reading input from the terminal or a pipe.

However, I find that it does have an effect on Unix or
TCP sockets, on /dev/random, and on terminals other than
"the" terminal.


So maybe a better wording could be: "This option has no
effect on regular files"?

Also, it may be good to specify that, if the timeout is
reached, bash will consume the input but will not put
that consumed input into the variable:

$ (printf ; sleep 2; printf '\n') | bash -c 'read -t1 a; echo "<$a>"; cat'
<>


Also, is there a way to differentiate between "timeout"
and "eof"? Looks like the exit status is 1 in both
cases.

Like in:

sleep 1 | bash -c 'read -t2'
and
sleep 2 | bash -c 'read -t1'
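
A small sketch that captures the two exit statuses so they can be compared directly. The variable names eof_status and timeout_status are illustrative. On the 3.2 release reported here both are expected to be 1; later bash manuals describe a status greater than 128 on timeout, so the second value may depend on the bash version at hand:

```shell
# EOF case: the writer exits after 1s, before the 2s timeout expires.
eof_status=$(sleep 1 | bash -c 'read -t 2 line; echo "$?"')

# Timeout case: the 1s timeout expires while the writer is still running.
timeout_status=$(sleep 2 | bash -c 'read -t 1 line; echo "$?"')

echo "eof: $eof_status, timeout: $timeout_status"
```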


Best regards,
Stephane




range match works incorrect

2008-06-05 Thread Roman Rakus
When trying to match files with [a-z], bash also finds files A-Z,
depending on LC_COLLATE.
The mistake is in the use of strcoll()/wcscoll(). They have nothing to do
with ranges. fnmatch() should be used instead.
I can try to change this behavior. Or is this planned/done for the next
bash release?



Re: range match works incorrect

2008-06-05 Thread Matthew Woehlke

Roman Rakus wrote:
> When trying to match files with [a-z], bash also finds files A-Z,
> depending on LC_COLLATE.
> The mistake is in the use of strcoll()/wcscoll(). They have nothing to do
> with ranges. fnmatch() should be used instead.
> I can try to change this behavior. Or is this planned/done for the next
> bash release?


How is that incorrect? Depending on LC_COLLATE, the range '[a-d]' might 
mean 'abcd', 'aAbBcCd', 'aAàÀáÁâÂãÃäÄåÅæÆbBcCçÇd', etc.


If you want C semantics, use 'LC_COLLATE=C'.
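
As a rough illustration of the C-locale behaviour, the sketch below tests single characters against [a-d] with a case pattern. in_range, c_lower and c_upper are hypothetical names, and LC_ALL is used instead of LC_COLLATE only because it also overrides any LC_ALL already present in the environment:

```shell
# in_range is a hypothetical helper: it reports whether its argument
# (a single character) matches the bracket range [a-d].
in_range() {
    case $1 in
        [a-d]) echo yes ;;
        *)     echo no ;;
    esac
}

# Run each check in a subshell so the locale change does not leak out.
c_lower=$(LC_ALL=C; export LC_ALL; in_range b)
c_upper=$(LC_ALL=C; export LC_ALL; in_range B)
echo "C locale: b -> $c_lower, B -> $c_upper"
```

In the C locale the lowercase letter matches and the uppercase one does not. In other locales the answer for B depends on the collation order and on the shell, so no result is claimed here.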

--
Matthew
Somewhere there is a pun so egregious that uttering it will incite 
murderous rage in all who hear it. If anyone knows where it is, please 
let me know.






Re: range match works incorrect

2008-06-05 Thread Roman Rakus

Matthew Woehlke wrote:
> Roman Rakus wrote:
> > When trying to match files with [a-z], bash also finds files A-Z,
> > depending on LC_COLLATE.
> > The mistake is in the use of strcoll()/wcscoll(). They have nothing to
> > do with ranges. fnmatch() should be used instead.
> > I can try to change this behavior. Or is this planned/done for the next
> > bash release?
>
> How is that incorrect? Depending on LC_COLLATE, the range '[a-d]'
> might mean 'abcd', 'aAbBcCd', 'aAàÀáÁâÂãÃäÄåÅæÆbBcCçÇd', etc.
>
> If you want C semantics, use 'LC_COLLATE=C'.

I think the range [a-d] shouldn't include any uppercase characters.
Collation order and range matching are two different things. Try using
glob and you will see the difference.

This is really not good behavior for bash.



$IFS and "${@:offset}"

2008-06-05 Thread Stephane Chazelas
Configuration Information [Automatically generated, do not change]:
Machine: i486
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i486' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i486-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include -I../bash/lib   -g -O2 -Wall
uname output: Linux sc.homeunix.net 2.6.25-rc8 #1 PREEMPT Fri Apr 4 08:56:07 
BST 2008 i686 GNU/Linux
Machine Type: i486-pc-linux-gnu

Bash Version: 3.2
Patch Level: 39
Release Status: release

Hiya,

$ bash -c 'printf "%s\n" "${@:2}"' x 1 2 "3 4" 5
2
3 4
5
$ bash -c 'IFS=a; printf "%s\n" "${@:2}"' 0 1 2 "3 4" 5
2 3 4 5

I don't understand why $IFS would have any influence here. The
behavior differs from ksh.

It seems that you need to have " " in IFS or IFS being unset for
it to work as I would expect.
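
One possible way to sidestep the issue, sketched under the assumption that only "everything after the first N arguments" is needed: drop the leading arguments with shift and expand plain "$@", which keeps each argument as a separate word whatever IFS contains. The variable name shifted_out is illustrative:

```shell
# Drop $1 with shift, then expand "$@" instead of "${@:2}".
# IFS is set to "a" on purpose, as in the report above.
shifted_out=$(bash -c 'IFS=a; shift; printf "%s\n" "$@"' x 1 2 "3 4" 5)
printf '%s\n' "$shifted_out"
```

This should print 2, "3 4" and 5 on separate lines, the same result as the first command in this report.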

Also, this:

$ bash -c 'printf "%s\n" "${@:1}"' x 1 2 "3 4" 5
1
2
3 4
5
$ bash -c 'a=("$@"); printf "%s\n" "${a[@]:1}"' x 1 2 "3 4" 5
2
3 4
5

I find is quite confusing.

The behavior is the same in ksh, but in ksh ${@:0:1} expands to
$0 which makes it more understandable ($0 has its meaning in
functions as well in ksh which makes it somehow consistent).

In bash, ${@:0:1} and ${@:1:1} expand to the same thing ($1). Is
all that documented (I couldn't find it via a quick scan of the
man page)?
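
The off-by-one between the two expansions can be shown in a single command. This is only a sketch: the array name a mirrors the example above and offset_out is an illustrative variable. It deliberately avoids offset 0, since what ${@:0:1} yields is the open question here:

```shell
# Same offset and length, different element: positional parameters
# count from 1, while the array copy of them counts from 0.
offset_out=$(bash -c 'a=("$@"); echo "${@:1:1} ${a[@]:1:1}"' x 1 2 "3 4" 5)
echo "$offset_out"
```

The first expansion yields $1 and the second yields the array element at index 1, which holds the value of $2.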

Best regards,
Stephane




Re: range match works incorrect

2008-06-05 Thread Roman Rakus

Roman Rakus wrote:
> Matthew Woehlke wrote:
> > Roman Rakus wrote:
> > > When trying to match files with [a-z], bash also finds files A-Z,
> > > depending on LC_COLLATE.
> > > The mistake is in the use of strcoll()/wcscoll(). They have nothing
> > > to do with ranges. fnmatch() should be used instead.
> > > I can try to change this behavior. Or is this planned/done for the
> > > next bash release?
> >
> > How is that incorrect? Depending on LC_COLLATE, the range '[a-d]'
> > might mean 'abcd', 'aAbBcCd', 'aAàÀáÁâÂãÃäÄåÅæÆbBcCçÇd', etc.
> >
> > If you want C semantics, use 'LC_COLLATE=C'.
>
> I think the range [a-d] shouldn't include any uppercase characters.
> Collation order and range matching are two different things. Try using
> glob and you will see the difference.
>
> This is really not good behavior for bash.

I have looked at
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05
which says:
7. In the POSIX locale, a range expression represents the set of 
collating elements that fall between two elements in the collation 
sequence, inclusive. In other locales, a range expression has 
unspecified behavior: strictly conforming applications shall not rely on 
whether the range expression is valid, or on the set of collating 
elements matched. A range expression shall be expressed as the starting 
point and the ending point separated by a hyphen ( '-' ).



So if I understand it correctly, bash's behavior isn't wrong. But the
question remains whether we would like to change this behavior or not.
I think glob()-like behavior is more predictable than strcoll().



Re: Function definition syntax inconsistencies

2008-06-05 Thread Jan Schampera
Thanks for your replies. I didn't want to raise a portability and POSIX
discussion. I know about that. I was referring to either a bug in the
Bash manual or a bug in the Bash code, nothing else.

Let's try to rephrase that a bit (the topic is not POSIX nor portability):

The Bash manual allows the syntax of the function definition as follows:

-
[ function ] name () compound-command [redirection]
This defines a function named name. The reserved word function is
optional. If the function reserved word is supplied, the parentheses
   are optional. [...]
-

As I read it, that gives me the following options:

1. name ()
2. function name ()
3. function name

They all work fine, EXCEPT:

When you use the third form shown above and use the subshell-grouping
compound command '(...)' as function body, then it doesn't pass the parser:

-
$ function name (echo)
bash: syntax error near unexpected token `echo'
-

That's the reason I think there's a bug, either in the description or in
the code. Since the rest of the behaviour of function definitions is
very consistent as described in the manual, I assume it's a bug in the code.

That's all about this bugreport, it wasn't about Bash allowing/not
allowing any standard/nonstandard features, it simply was about Bash not
being consistent in this one case.

J.




Re: range match works incorrect

2008-06-05 Thread Jan Schampera
Roman Rakus wrote:

> I think the range [a-d] shouldn't include any uppercase characters.
> Collation order and range matching are two different things. Try using
> glob and you will see the difference.
> This is really not good behavior for bash.

This is and always has been locale-dependent. What matters is the
collation sequence between the left and right bounds you give with [x-y].
It's the very same behaviour you have with BRE/ERE.

Not to mention the POSIX requirement.

J.




Re: range match works incorrect

2008-06-05 Thread Jan Schampera
Roman Rakus wrote:

> sequence, inclusive. In other locales, a range expression has
> unspecified behavior: strictly conforming applications shall not rely on
> whether the range expression is valid, or on the set of collating
> elements matched. A range expression shall be expressed as the starting
> point and the ending point separated by a hyphen ( '-' ).
>
>
> So if I understand it correctly, bash's behavior isn't wrong. But the
> question remains whether we would like to change this behavior or not.
> I think glob()-like behavior is more predictable than strcoll().

Okay, I have to say sorry, I didn't see that exception about the
behaviour in non-POSIX locales.

I think the rationale about it is worth reading:
http://www.opengroup.org/onlinepubs/009695399/xrat/xbd_chap09.html
(scroll down to "RE Bracket Expression")

IMHO this exception has clear historical reasons.

J.






Re: range match works incorrect

2008-06-05 Thread Chet Ramey

Jan Schampera wrote:

> I think the rationale about it is worth reading:
> http://www.opengroup.org/onlinepubs/009695399/xrat/xbd_chap09.html
> (scroll down to "RE Bracket Expression")
>
> IMHO this exception has clear historical reasons.


The discussion in the rationale captures the essentials of the dispute
the last time this came up as a bash bug.  The invented `collating
element order' and the Posix requirement to use it are useless without
a standard API to query it.  Without that, one is left with either the
standard APIs that understand locales (strcoll, etc.), or native order.
Bash and glibc made different choices here.

Roman is arriving late to this fight. :-)

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer

Chet Ramey, ITS, CWRU    [EMAIL PROTECTED]    http://cnswww.cns.cwru.edu/~chet/