Re: commit: abi: UTF8String class

Martin Sevior Sun, 21 Apr 2002 07:35:03 -0700

> > 
> > UTF-8 is great for communicating between the
> > piecetable and the widgets. I
> > think we should definitely do this. What I don't
> > want is for us to store
> > our text as UTF-8 in the piecetable. We have a *LOT*
> > of code that expects
> > that every position in the piecetable corresponds to
> > an extra letter of text. 
> 
> How is this going to work for languages that need
> combining characters?  Isn't it going to need to be
> changed anyway?  Isn't now the time to do this
> re-design?


I don't understand this. Doesn't every glyph have a unique Unicode code
point? If so, we still have a one-to-one mapping of glyph to text location.

> 
> > What I think we should do is store our Unicode as
> > UT_uint32 in the
> > piecetable which can then be randomly accessed the
> > same way we do things now.
> 
> To randomly access what the user sees as a character
> or to randomly access what is internally one codepoint?

OK, I don't understand. Are you saying that two code points in a row map to
a different glyph? If so, why not just insert the code point for this glyph?

> These are not the same.  But I don't know the
> piecetable either so maybe it is the right thing to
> do.
> As long as we are thinking about it.

Certainly the structure of the code assumes in many places that one
PT_DocPosition is one glyph. If Unicode were at all sane this should not be
a problem. Are you telling me that Unicode is not sane and that certain
glyphs can only be generated if two 32-bit numbers are presented
consecutively?
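
To make the random-access point concrete, here is a minimal sketch. It is
not AbiWord code: CodePoint is a hypothetical stand-in for UT_uint32, and
both function names are made up for illustration. It only shows why
fixed-width 32-bit storage lets a position index the text directly, while
UTF-8 storage needs a scan to find the same position. It assumes
well-formed UTF-8 and says nothing about how glyphs are chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for UT_uint32; not the real AbiWord typedef.
typedef std::uint32_t CodePoint;

// Fixed-width storage: position pos is simply element pos, in O(1).
CodePoint charAtFixedWidth(const std::vector<CodePoint>& text,
                           std::size_t pos)
{
    assert(pos < text.size());
    return text[pos];
}

// UTF-8 storage: find the byte offset where the pos-th code point
// starts. This takes a linear scan because code points occupy 1 to 4
// bytes. Returns utf8.size() if pos is past the end.
std::size_t byteOffsetUtf8(const std::string& utf8, std::size_t pos)
{
    std::size_t seen = 0;
    for (std::size_t offset = 0; offset < utf8.size(); ++offset) {
        unsigned char byte = static_cast<unsigned char>(utf8[offset]);
        if ((byte & 0xC0) == 0x80)
            continue;           // continuation byte, not a new code point
        if (seen == pos)
            return offset;      // first byte of the requested code point
        ++seen;
    }
    return utf8.size();
}
```

With UTF-8 in the piecetable, every lookup by position would need a scan
like byteOffsetUtf8 or an extra index kept alongside the text.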

Cheers

Martin
