Re: Improper UTF-8 combining character handling

2007-06-12 Thread Andreas Schwab
Sean Burke <[EMAIL PROTECTED]> writes:
> I've retried with 3.2-17 with the same results. Notably, the issue isn't
> (and has not been) that all multibyte characters are handled properly.
> Instead, sequences which contain combining characters seem to treat the
> sequence inconsistently. For exampl

Re: Improper UTF-8 combining character handling

2007-06-12 Thread Sean Burke
I've retried with 3.2-17 with the same results. Notably, the issue isn't (and has not been) that all multibyte characters are handled properly. Instead, sequences which contain combining characters seem to treat the sequence inconsistently. For example, the character that represents D WITH DOT ABOV

Re: Improper UTF-8 combining character handling

2007-06-10 Thread Benno Schulenberg
Sean Burke wrote:
> The Unicode normalization test data at
> http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt
> contains many sequences of this sort.
> The first character sequence, LATIN CAPITAL LETTER D WITH DOT
> ABOVE, does produce this problem.
> Paste it into the comma
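
For readers who want to see what kind of sequence is under discussion, here is a minimal Python sketch of the two encodings of LATIN CAPITAL LETTER D WITH DOT ABOVE: the precomposed code point U+1E0A and the decomposed pair "D" followed by U+0307 COMBINING DOT ABOVE. It is only an illustration of the Unicode data, not a reproduction of the reported behaviour, and the helper name `describe` is made up for this example.

```python
import unicodedata

# Precomposed form: a single code point.
precomposed = "\u1e0a"  # LATIN CAPITAL LETTER D WITH DOT ABOVE

# Decomposed form (NFD): base letter plus a combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)  # "D" + U+0307


def describe(text):
    """Hypothetical helper: list (code point, name, combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
        encoded = text.encode("utf-8")
        print(label, describe(text), "UTF-8 bytes:", encoded.hex(" "))
```

Both forms are canonically equivalent and both happen to occupy three bytes in UTF-8, but the decomposed form is two code points, the second of which has a non-zero combining class and is meant to take up no column of its own on screen.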