Improper UTF-8 combining character handling

2007-06-10 Thread Sean Burke
Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: linux-gnu
Compiler: i686-pc-linux-gnu-gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i686'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i686-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -O2
-march=prescott -fomit-frame-pointer -pipe
uname output: Linux morrigan 2.6.20-gentoo-r8-mactel #4 SMP PREEMPT Sat
May 12 10:35:03 MDT 2007 i686 Genuine Intel(R) CPU1400  @ 1.83GHz
GenuineIntel GNU/Linux
Machine Type: i686-pc-linux-gnu

Bash Version: 3.2
Patch Level: 15
Release Status: release

Description:
When using a UTF-8 combining character sequence, there is a disparity
between what is considered a character for display and for editing.
The entire sequence is treated as a single character for the purpose
of editing, but each glyph that is part of the sequence is treated
separately for display. As a result, some glyphs are not removed when
deleting characters, or the cursor ends up visually in the wrong place.

Repeat-By:
The Unicode normalization test data at
http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt contains
many sequences of this sort. The first character sequence, LATIN
CAPITAL LETTER D WITH DOT ABOVE, does produce this problem. Paste it
into the commandline, then backspace through it. The problem should be
reproduced immediately.
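
For anyone without the test file at hand, the sequence can be generated
directly. This is only a sketch: it assumes the decomposed form (D followed
by COMBINING DOT ABOVE, U+0307) is the one that triggers the problem, and
the helper name hex_bytes is made up for illustration. It prints the UTF-8
bytes of the decomposed and precomposed forms so they can be told apart.

```shell
# Decomposed form: LATIN CAPITAL LETTER D (U+0044) followed by
# COMBINING DOT ABOVE (U+0307), written as UTF-8 octal escapes.
decomposed=$(printf 'D\314\207')

# Precomposed form: LATIN CAPITAL LETTER D WITH DOT ABOVE (U+1E0A).
precomposed=$(printf '\341\270\212')

# Hypothetical helper: hex dump of a string's bytes, spaces removed.
hex_bytes() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

printf 'decomposed:  %s\n' "$(hex_bytes "$decomposed")"
printf 'precomposed: %s\n' "$(hex_bytes "$precomposed")"
```

Both forms occupy one column on the terminal, but the decomposed form is
two code points, which is where display and editing can disagree.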

Fix:
Glyphs and character sequences should be treated consistently. With
combining character sequences, it would most likely be preferable to
treat each character in the sequence separately to allow for more
precise editing, though there may be other issues I'm unaware of.


___
Bug-bash mailing list
Bug-bash@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-bash


Re: Improper UTF-8 combining character handling

2007-06-10 Thread Benno Schulenberg
Sean Burke wrote:
> The Unicode normalization test data at
> http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt 
> contains many sequences of this sort. 
> The first character sequence, LATIN CAPITAL LETTER D WITH DOT 
> ABOVE, does produce this problem.
> Paste it into the commandline, then backspace through it. The
> problem should be reproduced immediately.

Cannot reproduce it with bash-3.2-17.  Please retry with patch level 
17.  Patch 16 specifically addresses multibyte characters.

Benno




Redisplay bug with wrapping prompt

2007-06-10 Thread schwab
Configuration Information [Automatically generated, do not change]:
Machine: ia64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='ia64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='ia64-unknown-linux-gnu' 
-DCONF_VENDOR='unknown' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' 
-DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -O2 -g
uname output: Linux sykes 2.6.18.2-33-default #1 SMP Mon Nov 27 11:46:27 UTC 
2006 ia64 ia64 ia64 GNU/Linux
Machine Type: ia64-unknown-linux-gnu

Bash Version: 3.2
Patch Level: 17
Release Status: release

Description:
If the prompt contains invisible characters and its rendered size
is wider than the terminal, then readline fails to redisplay
correctly.  That is especially visible when scrolling through
the history.

Repeat-By:
$ mkdir /tmp/12345678901234567890123456789012345678901234567890123456789012345678901234567890
$ cd /tmp/12345678901234567890123456789012345678901234567890123456789012345678901234567890
$ PS1="\[$(tput bold)\]\w\\\$\[$(tput sgr0)\] "
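
To see why this prompt wraps, the visible width can be worked out by hand.
The sketch below makes two assumptions the report does not state: an
80-column terminal (the variable cols) and a non-root user, so that \$
renders as "$". The bold and sgr0 sequences inside \[ \] are left out of
the count because they occupy no columns.

```shell
# Directory name as in the steps above: /tmp/ followed by 80 digits.
digits=$(printf '1234567890%.0s' 1 2 3 4 5 6 7 8)
dir="/tmp/$digits"

# What \w\$ plus the trailing space renders as inside that directory.
visible="$dir\$ "

# Assumed terminal width; not given in the report.
cols=80

# Number of terminal rows the visible prompt occupies (ceiling division).
wrapped_rows=$(( (${#visible} + cols - 1) / cols ))

printf 'visible prompt width: %s\n' "${#visible}"
printf 'rows occupied:        %s\n' "$wrapped_rows"
```

Under these assumptions the prompt is 87 columns wide and spills onto a
second row, which is the situation described above.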

