Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i686' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i686-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/local/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -g -O2
uname output: Linux shan 2.6.24.5-smp #2 SMP Wed Apr 30 13:41:38 CDT 2008 i686 Intel(R) Celeron(R) M processor 1.40GHz GenuineIntel GNU/Linux
Machine Type: i686-pc-linux-gnu
Bash Version: 3.2
Patch Level: 0
Release Status: release

Description:
	When an array element contains multibyte characters, the
	result of ${#name[subscript]} is incorrect. It expands to the
	number of bytes rather than the number of characters.

Repeat-By:
	This can be reproduced by:

	1. a[0]=你好
	2. echo ${#a[0]}

	There are only two Chinese characters, but the result is 6.

Fix:
	The following patch may be helpful.

--- subst.c	2008-07-06 15:47:14.000000000 +0800
+++ bash-3.2/subst.c	2008-07-06 15:47:39.000000000 +0800
@@ -4763,7 +4763,7 @@
   else
     t = (ind == 0) ? value_cell (var) : (char *)NULL;
 
-  len = STRLEN (t);
+  len = MB_STRLEN (t);
   return (len);
 }
 #endif /* ARRAY_VARS */