> This is why we need to represent the text with something
> more like a linked-list of objects where the top-level
> object represents an "on-screen character" which can be
> made up of one or more "codepoints" which in turn can be
> made up of one or more bytes.

We should not need to do this. We hold the raw string of Unicode
values and pass it to the shaping engine, which returns the shaped
string plus some additional information about the relationship
between the rendered glyphs and the original codepoints. We use the
rendered string to draw on screen and the extra info to navigate.
This would not be difficult to do: we already have the raw <->
rendered string mechanism in fp_TextRun, so all we need is to add
the extra positional info to be able to navigate strings where
multiple codepoints map to a single character (we already handle
the case where multiple characters form a single glyph, e.g., Arabic
ligatures).
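
To make the "extra positional info" concrete, here is a rough sketch
in Python rather than our actual C++. It is not the fp_TextRun code
and it does not call a real shaping engine: the function names are
made up for illustration, and I am assuming that a codepoint with a
non-zero Unicode combining class simply attaches to the preceding
base character, which is much cruder than what a shaper would report.
The point is only that a flat list of offsets into the raw string is
enough to navigate by on-screen character.

```python
import unicodedata
from bisect import bisect_left, bisect_right


def cluster_starts(raw):
    """Return raw-string offsets where each on-screen character begins.

    Hypothetical stand-in for the positional info a shaping engine
    would return: combining marks are attached to the preceding base.
    """
    starts = []
    for i, ch in enumerate(raw):
        if i == 0 or unicodedata.combining(ch) == 0:
            starts.append(i)
    return starts


def next_caret(raw, starts, pos):
    """Move the caret right by one on-screen character."""
    k = bisect_right(starts, pos)
    return starts[k] if k < len(starts) else len(raw)


def prev_caret(starts, pos):
    """Move the caret left by one on-screen character."""
    k = bisect_left(starts, pos)
    return starts[k - 1] if k > 0 else 0


# "e" + COMBINING ACUTE ACCENT + "a": three codepoints, two characters
raw = "e\u0301a"
starts = cluster_starts(raw)
```

The raw string stays a plain array of codepoints; the offsets are a
side table that can be rebuilt whenever the run is reshaped, so no
linked list of per-character objects is required.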

Tomas
