> This is why we need to represent the text with something
> more like a linked-list of objects where the top-level
> object represents an "on-screen character" which can be
> made up of one or more "codepoints" which in turn can be
> made up of one or more bytes.

We should not need to do this. We hold the raw string of Unicode values and pass it to the shaping engine, which returns the shaped string plus some additional information about the relationship between the rendered glyphs and the original code points; we use the rendered string to draw on screen and the extra info to navigate. This would not be difficult to do: we already have the raw <-> rendered string mechanism in fp_TextRun, so all we need is to add the extra positional info to be able to navigate strings where multiple codepoints map to a single character (we already handle the case where multiple characters form a single glyph, i.e., Arabic ligatures).

Tomas
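
To make the idea concrete, here is a rough Python sketch of navigating the raw string using positional info returned alongside the rendered string. Everything in it is hypothetical: toy_shape() is only a stand-in for a real shaping engine (it merely attaches combining marks to the preceding base character), and next_caret()/prev_caret() are illustrative names, not anything that exists in fp_TextRun.

```python
import unicodedata


def toy_shape(raw):
    """Toy stand-in for a shaping engine (an assumption, not a real API).

    Returns (rendered, cluster_starts): rendered[i] is the text of the i-th
    on-screen character, cluster_starts[i] is its first index in raw.
    """
    rendered = []
    cluster_starts = []
    for i, ch in enumerate(raw):
        if rendered and unicodedata.combining(ch):
            # A combining mark joins the on-screen character before it.
            rendered[-1] += ch
        else:
            rendered.append(ch)
            cluster_starts.append(i)
    return rendered, cluster_starts


def next_caret(cluster_starts, raw_len, offset):
    """Raw offset of the next caret stop to the right of offset."""
    for start in cluster_starts:
        if start > offset:
            return start
    return raw_len


def prev_caret(cluster_starts, offset):
    """Raw offset of the nearest caret stop to the left of offset."""
    for start in reversed(cluster_starts):
        if start < offset:
            return start
    return 0


raw = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT + 'a': 3 codepoints, 2 characters
rendered, starts = toy_shape(raw)
caret = next_caret(starts, len(raw), 0)  # steps over both codepoints of the first character
```

The raw string stays a flat array; only the list of cluster start offsets is added, so no linked list of per-character objects is required.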
