Re: excess braces ignored: bug or feature ?

2012-02-20 Thread Dan Douglas
On Sunday, February 19, 2012 04:25:46 PM Chet Ramey wrote:

> I assume you mean the first one.  It doesn't matter whether or not the
> variable is set as a side effect of the redirection -- it's in a
> subshell and disappears.
> 
> Chet

Forgot to mention, though: it's possible that in ksh no subshell is created,
if you consider this:

$ : "$(&2;})"
1
$ : $(: $( echo ${.sh.subshell} >&2))
2

It even works with the subshell-less command substitution, but there's no 
typeset output, so either x is automatically unset, it's never set to begin 
with, or ${ &2 | :
0
 ~ $ : | { echo $BASH_SUBSHELL >&2; } | :
1
 ~ $ : | ( echo $BASH_SUBSHELL >&2; ) | :
1
 ~ $ : | ( ( echo $BASH_SUBSHELL >&2; ) ) | :
2
 ~ $ : | { ( echo $BASH_SUBSHELL >&2; ) } | :
2
 ~ $ : | { { echo $BASH_SUBSHELL >&2; } } | :
1

-- 
Dan Douglas



Process group id of first command in command substitution (bash4 vs bash3)

2012-02-20 Thread Roman Rakus
I'm not sure if it's a bug or not, but there is a change between old bash
3.2 and bash 4.2.

When you run a script:
set -m
$(sleep 1; sleep 2)

in bash 4.2 the first sleep has the same process group id as the parent
shell. However, in bash 3.2 it has a different process group id.


Is it a bug or not? I'm not able to find documentation for this change,
and it seems that POSIX says nothing about it.


RR



Re: Process group id of first command in command substitution (bash4 vs bash3)

2012-02-20 Thread Chet Ramey
> I'm not sure if it's a bug or not, but there is a change between old bash
> 3.2 and bash 4.2.
> When you run a script:
> set -m
> $(sleep 1; sleep 2)
>
> in bash 4.2 the first sleep has the same process group id as the parent
> shell. However, in bash 3.2 it has a different process group id.
>
> Is it a bug or not? I'm not able to find documentation for this change,
> and it seems that POSIX says nothing about it.

How could this possibly matter?

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-20 Thread Chet Ramey
On 2/18/12 5:39 AM, John Kearney wrote:

> Bash Version: 4.2
> Patch Level: 10
> Release Status: release
> 
> Description:
>   Current u32toutf8 only encodes values up to 0xFFFF correctly.
> wchar_t can be of ambiguous size; better, in my opinion, to use
> unsigned long, or uint32_t, or something clearer.

Thanks for the patch.  It's good to have a complete implementation,
though as a practical matter you won't see UTF-8 characters longer
than four bytes.  I agree with you about the unsigned 32-bit int
type; wchar_t is signed, even if it's 32 bits, on several systems
I use.

Chet
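
For readers following the thread, here is a minimal sketch of the UTF-8
encoding under discussion. It is not the patch and not bash's u32toutf8;
utf8_hex is a hypothetical helper name. The sketch stops at the four-byte
form (code points up to 0x10FFFF), the longest Chet expects to see in
practice, and it does not reject surrogate code points.

```shell
# Hypothetical helper: print the UTF-8 encoding of a code point as
# space-separated hex bytes. Returns 1 for values above 0x10FFFF.
utf8_hex() {
  local cp
  cp=$(( $1 ))                       # accepts decimal or 0x... input
  if [ "$cp" -lt 128 ]; then         # 1 byte:  0xxxxxxx
    printf '%02x\n' "$cp"
  elif [ "$cp" -lt 2048 ]; then      # 2 bytes: 110xxxxx 10xxxxxx
    printf '%02x %02x\n' \
      $(( 0xC0 | (cp >> 6) )) \
      $(( 0x80 | (cp & 0x3F) ))
  elif [ "$cp" -lt 65536 ]; then     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    printf '%02x %02x %02x\n' \
      $(( 0xE0 | (cp >> 12) )) \
      $(( 0x80 | ((cp >> 6) & 0x3F) )) \
      $(( 0x80 | (cp & 0x3F) ))
  elif [ "$cp" -lt 1114112 ]; then   # 4 bytes: values above 0xFFFF
    printf '%02x %02x %02x %02x\n' \
      $(( 0xF0 | (cp >> 18) )) \
      $(( 0x80 | ((cp >> 12) & 0x3F) )) \
      $(( 0x80 | ((cp >> 6) & 0x3F) )) \
      $(( 0x80 | (cp & 0x3F) ))
  else
    return 1
  fi
}
```

Calling `utf8_hex 0x1F600` exercises the four-byte branch, which is the
range the subject line says was previously encoded incorrectly.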




Re: Questionable code behavior in u32cconv?

2012-02-20 Thread Chet Ramey
On 2/18/12 7:07 AM, John Kearney wrote:
> Configuration Information [Automatically generated, do not change]:
> Machine: x86_64
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
> -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
> -DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
> -I../bash/lib   -g -O2 -Wall
> uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
> 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
> 
> Bash Version: 4.2
> Patch Level: 10
> Release Status: release
> 
> Description:
> Now I may be misreading the code, but it looks like the code relating
> to iconv only checks the destination charset the first time it is
> executed, which breaks the following:
> LC_CTYPE=C printf '\uff'
> LC_CTYPE=C.UTF-8 printf '\uff'
> 
> Repeat-By:
>   haven't seen the problem.

I can't reproduce it, even using C, zh_CN, and en_US.UTF-8, but I agree
that the static data should be reset when the locale, or at least
LC_CTYPE, changes.

Chet



Re: Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?

2012-02-20 Thread Chet Ramey
On 2/19/12 5:07 PM, John Kearney wrote:
> Can somebody explain to me what u32tochar is trying to do?
> 
> It seems like dangerous code?
> 
> From the context I'm guessing it's trying to make a hail mary pass at
> converting utf-32 to mb (not utf-8 mb)

Pretty much.  It's a big-endian representation of a 32-bit integer
as a character string.  It's what you get when you don't have iconv
or iconv fails and the locale isn't UTF-8.  It may not be useful,
but it's predictable.  If we have a locale the system doesn't know
about or can't translate, there's not a lot we can do.

Chet
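
To make "a big-endian representation of a 32-bit integer as a character
string" concrete, here is a small sketch. u32_be_hex is a hypothetical name,
not a function from unicode.c, and it always prints four bytes, which is a
simplification: the thread does not say how the real function treats small
values.

```shell
# Hypothetical illustration: print the four bytes of a 32-bit value,
# most significant byte first, as space-separated hex.
u32_be_hex() {
  local v
  v=$(( $1 & 0xFFFFFFFF ))           # keep only the low 32 bits
  printf '%02x %02x %02x %02x\n' \
    $(( (v >> 24) & 0xFF )) \
    $(( (v >> 16) & 0xFF )) \
    $(( (v >> 8) & 0xFF )) \
    $(( v & 0xFF ))
}
```

For U+20AC this yields the bytes 00 00 20 ac. They are predictable, as Chet
says, but they are not a valid encoding of the character in any particular
charset.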



bug in stub_charset rollup diff of changes to unicode code.

2012-02-20 Thread John Kearney
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
  stub_charset currently does:
    if locale == '\0'
      return ASCII
    else if locale =~ m/.*\.(.*)(@.*)/
      return $1
    else if locale == UTF-8
      return UTF-8
    else
      return ASCII

  It should be:
    if locale == '\0'
      return ASCII
    else if locale =~ m/.*\.(.*)(@.*)/
      return $1
    else
      return locale

  because its output is only used by iconv, so let iconv decide whether the
  locale makes sense.
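
The proposed behaviour can be sketched in shell as follows. This only
illustrates the pseudocode above and is not the C code from the patch;
charset_from_locale is a hypothetical name. The "@modifier" part is treated
as optional, an assumption that follows the strrchr/strchr logic in the diff
rather than the regex as literally written.

```shell
# Hypothetical sketch of the proposed stub_charset logic:
#   empty locale       -> ASCII
#   lang.CHARSET[@mod] -> CHARSET
#   anything else      -> the locale string itself (iconv decides)
charset_from_locale() {
  local locale="$1" cs
  case $locale in
    '')
      echo ASCII
      ;;
    *.*)
      cs=${locale##*.}               # text after the last '.'
      echo "${cs%%@*}"               # drop any @modifier
      ;;
    *)
      echo "$locale"
      ;;
  esac
}
```

With this logic a bare "UTF-8" is still returned as "UTF-8", so the removed
special case loses nothing. A locale such as "C" is passed through as "C",
and whether iconv accepts that name is left to iconv.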




   I've attached a diff of all my changes to the unicode code, including:
   renamed u32cconv to utf32tomb
   moved the special handling of ASCII characters to the start of the
function and removed the related call wrapper code
   tried to rationalize the code in utf32tomb so it's easier to read and
understand what is happening
   added utf32toutf16
   use utf32toutf16 with wctomb when wchar_t is 2 bytes wide

   removed dangerous code that was using iconv_open (charset, "ASCII")
as a fallback; it was pointless anyway, as we already assign an ASCII value
if possible

   added a warning message if the encode fails

   always terminate the mb output string
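
As an illustration of what a UTF-32 to UTF-16 step involves, here is a
minimal sketch of the standard surrogate-pair arithmetic. It is not the
utf32toutf16 from the attached diff; utf16_hex is a hypothetical name and
the output format (hex code units) is chosen only for readability.

```shell
# Hypothetical helper: print the UTF-16 code unit(s) for a code point as hex.
# Values above 0xFFFF become a high/low surrogate pair.
utf16_hex() {
  local cp v
  cp=$(( $1 ))
  if [ "$cp" -lt 65536 ]; then       # fits in one 16-bit unit
    printf '%04x\n' "$cp"
  elif [ "$cp" -lt 1114112 ]; then   # 0x10000..0x10FFFF: surrogate pair
    v=$(( cp - 0x10000 ))            # 20 bits, split 10 high / 10 low
    printf '%04x %04x\n' \
      $(( 0xD800 | (v >> 10) )) \
      $(( 0xDC00 | (v & 0x3FF) ))
  else
    return 1
  fi
}
```

A 2-byte wchar_t can only hold one such unit at a time, which is why code
points above 0xFFFF need the pair.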


I haven't started to test these changes yet. First I would like to know
whether they are acceptable; any observations are welcome. I'm still
reviewing the diff myself for consistency.

Also, can somebody tell me how this was tested originally? I've got
some ideas myself, but I would like to know what has already been done in
that direction.

Repeat-By:
  .

Fix:
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..3680419 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
*cp = '\\';
return 0;
  }
- if (uvalue <= UCHAR_MAX)
-   *cp = uvalue;
- else
-   {
- temp = u32cconv (uvalue, cp);
- cp[temp] = '\0';
- if (lenp)
-   *lenp = temp;
-   }
+   temp = utf32tomb (uvalue, cp);
+   if (lenp)
+ *lenp = temp;
break;
 #endif

diff --git a/externs.h b/externs.h
index 09244fa..8868b55 100644
--- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@ extern unsigned int falarm __P((unsigned int, unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));

 /* declarations for functions defined in lib/sh/unicode.c */
-extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((unsigned long, char *));

 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..495d9c4 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,16 +144,10 @@ ansicstr (string, len, flags, sawc, rlen)
  *r++ = '\\';  /* c remains unchanged */
  break;
}
-   else if (v <= UCHAR_MAX)
- {
-   c = v;
-   break;
- }
  else
{
-   temp = u32cconv (v, r);
-   r += temp;
-   continue;
+ r += utf32tomb (v, r);
+ break;
}
 #endif
case '\\':
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..9a557a9 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,14 +36,6 @@

 #include 

-#ifndef USHORT_MAX
-#  ifdef USHRT_MAX
-#define USHORT_MAX USHRT_MAX
-#  else
-#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
-#  endif
-#endif
-
 #if !defined (STREQ)
 #  define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
 #endif /* !STREQ */
@@ -54,13 +46,14 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif

 #ifndef HAVE_LOCALE_CHARSET
+static char CType[40]={0};
 static char *
 stub_charset ()
 {
@@ -69,6 +62,7 @@ stub_charset ()
   locale = get_locale_var ("LC_CTYPE");
   if (locale == 0 || *locale == 0)
 return "ASCII";
+  strcpy(CType, locale);
   s = strrchr (locale, '.');
   if (s)
 {
@@ -77,159 +71,230 @@ stub_charset ()
*t = 0;
   return ++s;
 }
-  else if (STREQ (locale, "UTF-8"))
-return "UTF-8";
   else
-return "ASCII";
+return CType;
 }
 #endif

-/* u32toascii ? */
 int
-u32tochar (wc, s)
- wchar_t wc;
+utf32_2_utf8 (c, s)
+ unsigned lon