Hi Erich,

> When I enter unicode, like:
>
>     ÄÖÜ SS ÒÓÔÕŎŌ Ç äöü ß òóôõŏō ç
>
> ...and process them with pdfmom, they show up perfectly.  But if I
> include the same characters in a file with the .INCLUDE macro, they
> disappear.
Those are Unicode codepoints, but what encoding are you using to
represent them as bytes in a file?  Is it UTF-8?  Only `Ŏ', U+014E,
`Ō', U+014C, and their lower-case forms aren't in ISO 8859-1, AKA
Latin-1.

> Processed with -P-bcu -Tutf8, they show up like wrong encoded
> strings.

troff(1) reads its input as ISO 8859-1.  It sounds like, in this
particular test, you're giving it bytes of UTF-8 that it's trying to
interpret as ISO 8859-1.  U+00A3 is a `£'.  In UTF-8, it's two bytes;
the 0a is the linefeed.

    $ hd <<<£
    00000000  c2 a3 0a                                          |...|

iso-8859-1(7) shows c2 is `Â' and a3 is `£', and that's how groff
interprets these bytes.

    $ groff -Tutf8 <<<£ | grep .
    Â£

> I tried, in vain, the following pipe:
>
>     soelim example.mom | preconv -eutf8 |
>         groff -mom -Tutf8 -P-bcu > example.txt

As Denis said, soelim(1) looks for `.so' lines; `.INCLUDE' means
nothing to it.

    http://git.savannah.gnu.org/cgit/groff.git/tree/src/preproc/soelim/soelim.cpp#n169

You could try replacing `.INCLUDE' with `.so'.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
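[Editor's sketch, building on the suggestion above.]  Assuming the
`.INCLUDE' lines take a bare, unquoted filename and that the included
files are UTF-8 like the main one, a throwaway sed(1) substitution
(illustrative only) lets you try the `.so' idea without editing the
files, reusing the pipe quoted above:

    $ sed 's/^\.INCLUDE /.so /' example.mom | soelim | preconv -e utf8 |
          groff -mom -Tutf8 -P-bcu > example.txt

The inclusion has to happen before preconv(1) because preconv only
converts the bytes that flow through it; with `.INCLUDE', the file is
only pulled in later, at troff time, so its raw UTF-8 bytes get read
as Latin-1, which matches the symptom you're seeing.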