Hi,

Deri has already followed up on the conversation that was prompted by Tom's questions regarding Tibetan.
I'll attempt to steer the conversation away from Tibetan and towards a more generic technical issue: processing ligatures (that is what Tom's problems boil down to).

If we take the Tibetan syllable རྒྱ, romanized as rgya, with the components superscript r, base form ga, and subscript y, then what *looks* like a single glyph is in reality a sequence of three (!) elements:

1. U+0F62 "RA" (with the ability to change shape when combined, in contrast to U+0F6A, which looks exactly the same in the character table but does *not* enter into ligatures),
2. U+0F92 "-GA", i.e. the subjoined form of the base letter U+0F42, and finally
3. U+0FB1 "-YA", the subjoined form of U+0F61 YA.

All three are stacked vertically in one place.

The same "TTT" (tiny Tibetan tower) can have an additional layer on top (for the vowels e, i, o) or below (for the vowel u). Likewise, there is a base vowel sign for each of these four (absent any of them, the vowel a is assumed), but the correct height of the vowel glyph is taken care of by the font. It is also possible to have one canonical vowel in the character table but a whole series of vowel glyphs of different heights in a private area of the font, not necessarily user-accessible.

I haven't inspected the internal structure of the Tibetan fonts I use on my machine, but the syllable rgya is displayed properly when copied into a shell prompt, and e.g. in vim the key sequence g a reveals the composition and the code points. So I assume the font does all the shaping work via its lookup tables.

Now the question, which is not language-specific: to what extent can groff access these font-internal lookup tables? It appears that the "naive" approach does not trigger the ligature mechanism in the font, as demonstrated by Tom's and Deri's examples. Is it possible that every \[u0Fxx] is (perhaps invisibly) isolated, akin to writing {f}{f}{l} in TeX, with every character in its own group, when you want to make sure that no ligature springs into action?
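(As an aside, the composition of rgya described above can be double-checked outside of vim. The following is only a minimal sketch, assuming Python 3 is at hand; the helper names describe and to_groff_escapes are made up for illustration and are not part of groff or of any font tool. It lists the code points with their Unicode names and spells the syllable in \[uXXXX] notation.)

```python
import unicodedata

# The syllable rgya, written as the three code points listed above.
RGYA = "\u0f62\u0f92\u0fb1"


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def to_groff_escapes(text):
    """Spell each character in groff's \\[uXXXX] notation (hypothetical helper)."""
    return "".join(f"\\[u{ord(ch):04X}]" for ch in text)


for code, name in describe(RGYA):
    print(code, name)

print(to_groff_escapes(RGYA))
```

This only confirms what the input sequence is; it says nothing about whether groff or the font ends up doing the stacking.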
I tried to test this hypothesis by making a minimal document, ff.roff:

    .P
    ff \" generates ligature in PDF file
    \[u0066]\[u0066] \" I hoped to see something like ff, but get an error message

Yet instead of producing the letter "f", \[u0066] generates an error message:

    "warning: special character '\f' not defined"

Where is my mistake? I then tried other letters in the basic Latin range, like \[u0041], but get the message:

    "warning: special character '\A' not defined"

This looks as if the character code is translated correctly but the backslash "special character" component is newly introduced. Or is there a lower bound for the \[uxxxx] notation which I am not aware of?

So, when typesetting "ff" or "ffi" in groff, does groff itself build the ligature and request the glyph [ff] or [ffi] from the font, or could the font do that based on its own knowledge of ligatures via the appropriate lookup table?

In other words, for a working implementation of Tibetan in groff, should I write a series of conditional character substitutions, or is there a way to send the characters to the device so that the device and the font know that a ligature is coming?

Either way is fine with me:

a) accessing the font lookup table, or
b) implementing a comprehensive set of ligatures in groff.

Best regards,
Oliver.

-- 
Dr. Oliver Corff
mailto:oliver.co...@email.de