On Tuesday, 23 May 2023, 17:19:46 BST, Craig White <[email protected]> wrote:
> I was looking into how freetype maps character codes to glyph indices, and
> learned that there are many different formats the character map can be in,
> not to mention the one-to-many and many-to-one mappings that Werner mentioned.
> Will it be necessary to implement the reverse mapping separately for every
> cmap format?
I am not sure why you need or want to implement it in FreeType. A glyph id is
unique per glyph, but some glyphs are not mapped in any character encoding.
There is even a name for one such case: "symbol fonts with custom encoding
vectors".
Perhaps it is best to STOP thinking about (Unicode) characters. Glyphs are
shaped drawings with a glyph id. Some of them, for example ligatures ("combo
characters" like "ff", etc.), correspond to two (Unicode) characters.
And in Arabic, almost every character has 2 to 4 glyph shapes, called the
isolated form and the init/medi/fini forms.
I think I actually have a Python program which does the reverse map (for the
purpose of dropping some glyphs in the many-to-one scenario):
examples/cjk-multi-fix.py in my freetype-py fork
(https://github.com/HinTak/freetype-py/; you might need to switch to the
font-diag branch to see it if it is not the default branch).
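
For illustration only (this is not the code in cjk-multi-fix.py), here is a
minimal sketch of the general idea: walk the forward map once and invert it.
The function name reverse_cmap and the sample pairs are made up for this
example. With a real font, the (charcode, glyph index) pairs would come from
iterating the face's active charmap; I believe freetype-py exposes that as
Face.get_chars(), but treat that as an assumption and check your version.

```python
from collections import defaultdict


def reverse_cmap(pairs):
    """Invert (charcode, glyph_index) pairs into {glyph_index: [charcodes]}.

    `pairs` is any iterable of (charcode, glyph_index) tuples taken from
    whichever charmap is currently selected on the face.
    """
    reverse = defaultdict(list)
    for charcode, glyph_index in pairs:
        if glyph_index == 0:
            # Glyph index 0 is the "missing glyph"; it is not a real mapping.
            continue
        reverse[glyph_index].append(charcode)
    # Sort each list so the result does not depend on iteration order.
    return {gid: sorted(codes) for gid, codes in reverse.items()}


# Hypothetical sample data: the glyph ids are invented, not from a real font.
sample = [(0x41, 36), (0x391, 36), (0x42, 37), (0x20, 3), (0xA0, 3)]

rmap = reverse_cmap(sample)

# Glyphs reached from more than one character code (the many-to-one case).
shared = {gid: codes for gid, codes in rmap.items() if len(codes) > 1}
print(shared)
```

Note that glyphs which are not in the charmap at all (ligatures, alternate
forms, unencoded symbols) never show up in the result, which is exactly the
limitation described above.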
The OpenType spec, and font technology in general, was created to make lookup
in the most frequently used direction (from character encoding to glyph id)
fast and easy.