Hi Dave,
yes, I did have a look at that section of the groff documentation, and I
must confess that I read the text as non-exhaustive, meaning the five
specific ligatures are built-in, with the option to increase the
repertoire of ligatures.
Never mind, character and string substitutions/interpolations are very
powerful in groff, certainly powerful enough to solve the Tibetan
ligature problem.
I'll inspect the font description files, thank you for the hint.
Best regards,
Oliver.
On 22/01/2024 02:47, Dave Kemper wrote:
On 1/21/24, Oliver Corff via <groff@gnu.org> wrote:
Now the question that is not language-specific: To what extent can groff
access these font-internal lookup tables? It appears that the "naive"
approach does not trigger the ligature mechanism in the font, as
demonstrated by Tom's and Deri's examples.
Is it possible that every \[u0Fxx] is (perhaps invisibly) isolated, akin
to putting every character in {f}{f}{l} if you want to make sure in TeX
that no ligature will spring into action?
It's much simpler than that: groff supports only five specific
ligatures: fi, fl, ff, ffi, and ffl. See section 5.19.8 (Ligatures
and Kerning) of the 1.23 version of the info manual. (Curiously, a
more recent revision of this section downplays the significance of
this limitation by citing two specific ligatures that aren't supported
and calling them "archaic.")
There's a feature request open (http://savannah.gnu.org/bugs/?64344)
to remove this limitation, but no one is currently working on it.
The mildly good news is that groff can access any glyph in a font,
whether or not groff recognizes it as a ligature. For instance, the
Linux Libertine font defines a ligature for "Qu". Groff won't invoke
it automatically, but looking in the font description file reveals
that this character is named u0051_0075, so groff can access it with
the escape \[u0051_0075].
Some glyphs in the font description file may not have names (indicated
by the first column of the entry being "---"), but groff can produce
even unnamed glyphs in a font with its \N escape.
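As a rough illustration of "looking in the font description file", the following Python sketch lists the glyphs in the charset section of such a file and suggests an escape for each. It assumes the charset line layout documented in groff_font(5) (name, metrics, type, code, then optional fields) and the `name "` alias form. The helper names list_glyphs and escape_for and all sample values are made up for this example; only the glyph name u0051_0075 comes from the message above.

```python
# Hypothetical excerpt of a font description file; the metrics and codes
# are invented for illustration and are not taken from a real font.
SAMPLE = """name TestR
spacewidth 250
charset
Q 722,676,18 2 81 Q
--- 900,700 2 3000 Q_u
u0051_0075 900,700 2 3001
fi 556,683 2 174 fi
f_i "
kernpairs
Q u -10
"""


def list_glyphs(font_text):
    """Return (name, code) pairs from the charset section of font_text."""
    glyphs = []
    in_charset = False
    for line in font_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if not in_charset:
            # Ignore everything before the "charset" keyword.
            in_charset = fields[0] == "charset"
            continue
        if fields[0] == "kernpairs":
            break  # the charset section has ended
        if len(fields) >= 2 and fields[1] == '"':
            # 'name "' declares another name for the preceding glyph.
            if glyphs:
                glyphs.append((fields[0], glyphs[-1][1]))
        elif len(fields) >= 4:
            # The fourth field is the code that \N accepts.
            glyphs.append((fields[0], fields[3]))
    return glyphs


def escape_for(name, code):
    """Suggest a groff escape that selects the glyph."""
    if name == "---":
        return "\\N'" + code + "'"   # unnamed glyph: address it by code
    if len(name) == 1 or name.startswith("\\"):
        return name                  # ordinary character or \c form
    return "\\[" + name + "]"        # named special character


if __name__ == "__main__":
    for name, code in list_glyphs(SAMPLE):
        print(name, code, escape_for(name, code))
```

The code field is kept as a string rather than converted to an integer, so whatever notation the font file uses is passed through to \N unchanged.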
Groff's .char request can make the syntax less clunky (e.g., for the
Qu ligature cited above, you could say ".char \[Qu] \[u0051_0075]"), but
until its native ligature handling is expanded beyond its current
five, you'll still want a custom preprocessor (e.g., to change every
"Qu" in your input text to "\[Qu]" for that .char definition to work).
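A custom preprocessor of the kind mentioned here could be sketched in Python as follows. This is a deliberately naive illustration, not an existing groff tool: it assumes the document defines the glyph with something like ".char \[Qu] \[u0051_0075]", and the names LIGATURES and preprocess_line are hypothetical. It skips control lines (those starting with "." or "'") so that requests such as the .char definition itself are not rewritten, but it does not understand escapes or string interpolations inside text lines.

```python
import re
import sys

# Hypothetical mapping from input letter sequences to the special
# characters defined with .char in the document.
LIGATURES = {"Qu": "\\[Qu]"}


def preprocess_line(line, ligatures=LIGATURES):
    """Replace ligature candidates in one line of groff input."""
    # Leave requests and macro calls untouched.
    if line.startswith((".", "'")):
        return line
    # Try longer sequences first so a longer key is never shadowed by a
    # shorter one that is its prefix.
    keys = sorted(ligatures, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: ligatures[m.group(0)], line)


if __name__ == "__main__":
    # Filter standard input to standard output, like other preprocessors.
    for raw in sys.stdin:
        sys.stdout.write(preprocess_line(raw))
```

Saved under a hypothetical name such as quligs.py, it would run ahead of groff in a pipeline, for example "python3 quligs.py < input.ms | groff -ms". For the Tibetan case the mapping would hold sequences of code points instead of "Qu".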
Yet instead of producing the letter "f", \[u0066] generates an error
message: "warning: special character '\f' not defined"
Where is my mistake?
This seems to be a groff bug: I reported it in
http://savannah.gnu.org/bugs/?63334 but it's not a high priority.
The reason it's not a high priority is that groff does not claim to
support representing ASCII characters in \[u00xx] format. Even so,
groff isn't correctly parsing here, because there should be no way for
the sequence "\[u0066]" to translate to "\f": the entire string
"\[u0066]" should either translate to "f", or be undefined.
--
Dr. Oliver Corff
Wittelsbacherstr. 5A
10707 Berlin
G E R M A N Y
Tel.: +49-30-85727260
Mail:oliver.co...@email.de