Hi Erich,
> When I enter unicode, like:
>
> ÄÖÜ SS ÒÓÔÕŎŌ Ç äöü ß òóôõŏō ç
>
> ...and process them with pdfmom, they show up perfectly. But if I
> include the same characters in a file with the .INCLUDE macro, they
> disappear.
Those are Unicode code points, but what encoding are you using to
represent them as bytes in the file? Is it UTF-8? Only the O's with
macron and breve, `Ōō' and `Ŏŏ' (U+014C to U+014F), fall outside
ISO 8859-1, AKA Latin-1.
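If you're not sure, file(1) will usually tell you; `example.mom' here
just stands in for whichever of your files holds that text, and the
exact wording of the output varies between versions.
$ file -i example.mom
example.mom: text/plain; charset=utf-8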
> Processed with -P-bcu -Tutf8, they show up like wrong encoded strings.
troff(1) expects its input to be encoded in ISO 8859-1. It sounds
like, in this particular test, you're giving it UTF-8 bytes that it
then tries to interpret as ISO 8859-1.
U+00A3 is `£'. In UTF-8 it's the two bytes c2 a3; the trailing 0a
below is the linefeed the here-string adds.
$ hd <<<£
00000000  c2 a3 0a                                          |...|
iso-8859-1(7) shows c2 is `Â' and a3 is `£', and that's how groff
interprets those bytes.
$ groff -Tutf8 <<<£ | grep .
Â£
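That's where preconv(1) comes in: it rewrites anything outside ASCII
as groff's \[uXXXX] escapes before troff sees the bytes. Roughly like
this; the leading `.lf' line is just preconv's line-number
bookkeeping.
$ preconv -eutf8 <<<£
.lf 1 -
\[u00A3]
You can also let groff run it for you: -k runs preconv, and -Kutf8
forces the encoding if its guess is wrong.
$ groff -k -Tutf8 <<<£ | grep .
£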
> I tried, in vain, the following pipe:
>
> soelim example.mom | preconv -eutf8 |
> groff -mom -Tutf8 -P-bcu > example.txt
As Denis said, soelim(1) looks for `.so' lines. `.INCLUDE' means
nothing to it.
http://git.savannah.gnu.org/cgit/groff.git/tree/src/preproc/soelim/soelim.cpp#n169
You could try replacing `.INCLUDE' with `.so'.
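For example, with a hypothetical part.mom holding the accented text,
soelim inlines the `.so' line but passes the `.INCLUDE' line through
unchanged, so that file's bytes never reach preconv (-r just
suppresses soelim's `.lf' bookkeeping lines).
$ cat part.mom
ÄÖÜ äöü ß
$ printf '.so part.mom\n' | soelim -r
ÄÖÜ äöü ß
$ printf '.INCLUDE part.mom\n' | soelim -r
.INCLUDE part.mom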
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy