>> when there is a unicode character for e.g. "not equal" (U+2260)
>> why there is a combination of characters in groff_char(7)
>> instead of unicode? Is it intended for ASCII output?
>
> 3. In case you are talking about the third column "Unicode"
>    in said table, which contains "u003D_0338" even though
>    groff actually produces U+2260:
>    That looks like a documentation bug to me.  I'm not
>    sending a patch because there are many such composite
>    Unicode names in that column, so i suspect this is not
>    the only one mismatching reality.

It's rather a documentation bug.  From groff's Info manual, section
`Using Symbols':

   * A glyph representing more than a single input character is named

          'u' COMPONENT1 '_' COMPONENT2 '_' COMPONENT3 ...

     Example: 'u0045_0302_0301'.

     For simplicity, all Unicode characters that are composites must
     be decomposed maximally (this is normalization form D in the
     Unicode standard); for example, 'u00CA_0301' is not a valid
     glyph name since U+00CA (LATIN CAPITAL LETTER E WITH CIRCUMFLEX)
     can be further decomposed into U+0045 (LATIN CAPITAL LETTER E)
     and U+0302 (COMBINING CIRCUMFLEX ACCENT).  'u0045_0302_0301' is
     thus the glyph name for U+1EBE, LATIN CAPITAL LETTER E WITH
     CIRCUMFLEX AND ACUTE.

   * groff maintains a table to decompose all algorithmically derived
     glyph names that are composites itself.  For example, 'u0100'
     (LATIN LETTER A WITH MACRON) is automatically decomposed into
     'u0041_0304'.  Additionally, a glyph name of the GGL is
     preferred to an algorithmically derived glyph name; groff also
     automatically does the mapping.  Example: The glyph 'u0045_0302'
     is mapped to '^E'.

From `groff_char.man', section REFERENCE, which explains the table
fields:

   Unicode is the glyph name used in composite glyph names.

   The names in the Unicode column look like u0021 or u0041_0300.
   In groff, the corresponding Unicode characters can be constructed
   by adding a backslash and a pair of square brackets, for example
   \[u0021] or \[u0041_0300].

The important bit is *glyph name*.  I've decided to always use
Unicode normalization form D for glyph names, unless a groff entity
name is available, like \[!=] in this particular case, which is then
preferred.

Patches are welcome to make this easier to understand in both
`groff.info' and `groff_char.man'.


    Werner
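
As an illustration of the naming rule quoted above, here is a minimal
Python sketch that derives a 'uXXXX_YYYY' style name by decomposing a
character with normalization form D.  The function name
groff_glyph_name is hypothetical and not part of groff.  The sketch
relies on Python's own Unicode tables, which may not match groff's
internal decomposition table exactly, and it does not perform the
second step of preferring a GGL name such as '!=' or '^E'.

```python
import unicodedata


def groff_glyph_name(ch: str) -> str:
    """Build a groff-style composite glyph name for one character.

    Assumption: the name is 'u' followed by the NFD code points in
    uppercase hex (at least four digits each), joined with '_'.
    The GGL lookup that groff does afterwards is not modelled here.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    return "u" + "_".join(f"{ord(c):04X}" for c in decomposed)


if __name__ == "__main__":
    # NOT EQUAL TO, the character from the original question
    print(groff_glyph_name("\u2260"))
    # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
    print(groff_glyph_name("\u1EBE"))
    # LATIN CAPITAL LETTER A WITH MACRON
    print(groff_glyph_name("\u0100"))
```

For U+2260 this yields 'u003D_0338', which is the entry in the
"Unicode" column of groff_char(7); groff then prefers the entity name
\[!=] for that glyph.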