Interesting.

On Wed, Aug 5, 2020 at 8:55 AM Richard Morse <[email protected]> wrote:
> Hi! The issue arises before it even gets to the PostScript.
>
> If you run the following commands:
>
> .do xflag 3
> .lc_ctype UTF-8
> .fp 5 Symbola Symbola ttf
> .ft Symbola
> ❊ works
> .sp
> 🂡 char
> .sp
> \U'1F0A1' uesc
> .sp
> \[u1F0A1] name
> .sp
>
> Through Heirloom as `troff test.roff | less`, you can see that the output
> is (in part, once the heading is all set up):
>
> H72000
> V12000
> CPSspoked8teardroppropellerstar
> wh11510cw
> h7670co
> h5140cr
> h4010ck
> h5560cs
> n12000 0
> H72000
> V36000
> h6660cc
> h4490ch
> h5760ca
> h5220cr
> n12000 0
> H72000
> V60000
> h6660cu
> h5760ce
> h4550cs
> h3920cc
> n12000 0
> H72000
> V84000
> CPSu1F0A1
> wh11270cn
> h5760ca
> h5220cm
> h8660ce
> n12000 0
>
> You’ll notice that the star character, which works in the PDF, and the
> named character (remember that, inside the font file, u1F0A1 is the
> character name) both show up in ‘CPS’ statements. But the two other places
> you would expect to see something (from the actual character and the \U
> escape), it is entirely missing. You have the ‘H72000’ command, the ‘V’
> command (with the vertical offset), and then it goes immediately into the
> latin text (seemingly without even including the space that should exist?).
>
> So for whatever reason, it isn’t seeing the character as something that
> should be output.
>
> Ricky
>
> > On Aug 5, 2020, at 1:30 AM, T. Kurt Bond <[email protected]> wrote:
> >
> > Looking at the postscript output there is a "/uni1F0A1 9429 def" and a
> > "/uni1F10A" in a "/Encoding-@15@36 [...] def"; is that part of the font
> > machinery? (I'm sadly ignorant of PostScript, alas.)
> >
> > Looking at troff/troff.d/otf.c I see that there is a struct WGL that
> > contains female and male entries. At the beginning of the struct is a
> > comment that consists of "/* WGL4 */". Googling that led to Windows Glyph
> > List 4. Taking a leap, I added the unicode characters FEMALE SIGN and MALE
> > SIGN to my test document.
> > Those show up fine in the final PDF output. Maybe this is connected? At
> > this point I suspect without much evidence that characters that are not
> > in the StandardStrings array, the MacintoshStrings array, or the WGL
> > array don't get output. Maybe. I'll have to investigate some more.
> >
> > On Tue, Aug 4, 2020 at 11:10 PM Richard Morse <[email protected]> wrote:
> > Hm. Just for my edification, I tried a few things.
> >
> > I’m on a Mac, and I don’t know when I compiled Heirloom troff, but it
> > was a year or two ago, so some things may be different.
> >
> > I downloaded the Symbola font from fontlibrary.org. The version I got
> > was .ttf, not .otf.
> >
> > The various things that you tried did not work for me either. \[u1F0A1]
> > did work, but that’s because (according to fret, at least), that’s the
> > font’s internal name for the symbol, which is not guaranteed to be true
> > across all fonts, so you can’t really use that for a “fallback” system.
> >
> > Looking at the output of troff without going through dpost, it looks
> > like it is completely ignoring the character. I tried explicitly setting
> > LC_CTYPE to ‘en_US.UTF-8’ and ‘UTF-8’ (both in the terminal, and using the
> > .lc_ctype command), but that had no effect.
> >
> > I wonder if troff has a compiled-in list of Unicode characters that it
> > understands, and if you try to use one it deems invalid it just ignores it?
> > (This may be borne out by
> > https://github.com/n-t-roff/heirloom-doctools/blob/master/troff/troff.d/unimap.c
> > , but I don’t really know enough about the code to be certain.)
> >
> > Ricky
> >
> > > On Aug 4, 2020, at 10:14 PM, T. Kurt Bond <[email protected]> wrote:
> > >
> > > In Emacs M-x describe-coding-system tells me the coding system for
> > > saving the buffer is utf-8-unix. I don't have any LC_* environment
> > > variables set, but LANG=en_US.UTF-8.
> > >
> > > I'm not very knowledgeable about the insides of Unicode fonts,
> > > unfortunately.
> > >
> > > On Tue, Aug 4, 2020 at 4:27 PM Richard Morse <[email protected]> wrote:
> > > Huh. I’m afraid I’m out of my depth then; you might check and see if
> > > your LC_* environment variables are set to something incompatible with
> > > UTF-8 (or, maybe, check and make sure the file is in UTF-8, not UTF-16 or
> > > something if you’re on Windows), but hopefully someone with more
> > > experience and knowledge will speak up…
> > >
> > > Ricky
> > >
> > > > On Aug 4, 2020, at 3:59 PM, T. Kurt Bond <[email protected]> wrote:
> > > >
> > > > And if I add "and explicit unicode character reference \U'1F0A1'" to
> > > > the file, that character doesn't show up either.
> > > >
> > > > On Tue, Aug 4, 2020 at 2:47 PM Richard Morse <[email protected]> wrote:
> > > >
> > > >> According to the Heirloom Troff manual, I think that you cannot just
> > > >> insert Unicode characters (although maybe if your LC_* environment
> > > >> variables are set correctly, you can?). It says:
> > > >>
> > > >>> Both nroff and troff allow references to specific Unicode characters
> > > >>> with the \U'X' escape sequence; it causes the character at position
> > > >>> U+X to be printed (X is a hexadecimal number). For troff, it is
> > > >>> required that this character is available in one of the fonts
> > > >>> mounted at this point. As an example, \U'20AC' prints the Euro
> > > >>> character €. When register .g is set to 1 Unicode characters can
> > > >>> also be accessed with \[uXXXX] where XXXX is a four digit
> > > >>> hexadecimal number.
> > > >>
> > > >> So I think you would need to use `\U'1F0A1'` for the character to
> > > >> show up?
> > > >>
> > > >> Ricky
> > > >>
> > > >>
> > > >>> On Aug 4, 2020, at 12:28 PM, T. Kurt Bond <[email protected]> wrote:
> > > >>>
> > > >>> (The heirloom-doctools README.md
> > > >>> <https://github.com/n-t-roff/heirloom-doctools/blob/master/README.md>
> > > >>> says to ask Heirloom doctools questions on this list.)
> > > >>>
> > > >>> I'd like to use the Symbola font in Heirloom troff. I tried the
> > > >>> following:
> > > >>>
> > > >>> .do xflag 3
> > > >>> .\" fp 5 Optima Optima-Regular ttf
> > > >>> .fp 5 Symbola Symbola otf
> > > >>> .LP
> > > >>> Here is some normal text.
> > > >>> .\" PLAYING CARD ACE OF SPADES is Unicode 0x1F0A1
> > > >>> .ft Symbola
> > > >>> 🂡 And some normal text. ❊
> > > >>> .ft P
> > > >>> More normal text.
> > > >>>
> > > >>> That's a literal PLAYING CARD ACE OF SPADES Unicode character at the
> > > >>> start of the line between the two .ft requests. That character does
> > > >>> not show up in the troff output, even though the EIGHT
> > > >>> TEARDROP-SPOKED PROPELLER ASTERISK Unicode character at the end of
> > > >>> the line *does* show up, as CPSuni274A where the CPS<name> outputs
> > > >>> the character of that name. The Symbola font is embedded in the PDF
> > > >>> output (created from the PostScript output), and the text "And some
> > > >>> normal text" and the EIGHT TEARDROP-SPOKED PROPELLER ASTERISK Unicode
> > > >>> character are in the Symbola font in the troff output.
> > > >>>
> > > >>> However, if I manually add a CPSuni1F0A1 to the troff output, *that*
> > > >>> character *does* show up.
> > > >>>
> > > >>> Any ideas as to why the literal PLAYING CARD ACE OF SPADES Unicode
> > > >>> character in the document source is being ignored and not written to
> > > >>> the troff output?
> > > >>>
> > > >>> I actually have a document that needs to use the PLAYING CARD ACE OF
> > > >>> SPADES Unicode character. The ultimate goal is to have the Symbola
> > > >>> font used as a fallback font, which should happen automatically in
> > > >>> Heirloom troff, since it searches all the fonts when a font is
> > > >>> missing a character, but I made the example use the Symbola font
> > > >>> directly because that shows the problem directly.
> > > >>>
> > > >>> --
> > > >>> T. Kurt Bond, [email protected], https://tkurtbond.github.io

--
T. Kurt Bond, [email protected], https://tkurtbond.github.io
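
For anyone who wants to repeat Ricky's check without reading the raw troff output by eye, here is a rough Python sketch that condenses the intermediate output into one string per output line, so a dropped character shows up as a gap. It only understands the few commands that appear in the excerpt quoted above. The meanings given to them are assumptions based on that excerpt and on the usual troff output conventions: `CPS<name>` is a character requested by name, `h<N>c<X>` is a horizontal move followed by character X, a leading `w` marks a word break, and `n<a> <b>` ends the line. Everything else (`H`, `V`, the header) is skipped. The name `summarize_lines` is made up for this example and is not part of Heirloom.

```python
import re

# Assumed shapes of the commands seen in the excerpt above. Real troff
# output contains many more commands; this sketch silently skips them.
NAMED = re.compile(r"CPS(\S+)")        # character requested by name
GLYPH = re.compile(r"(w?)h\d+c(.)")    # optional word break, motion, one character
END_OF_LINE = re.compile(r"n\d+ \d+")  # end of an output line


def summarize_lines(troff_output):
    """Return one string per output line, showing named characters as [name]."""
    lines, current = [], []
    for raw in troff_output.splitlines():
        cmd = raw.strip()
        named = NAMED.fullmatch(cmd)
        if named:
            current.append("[" + named.group(1) + "]")
        elif END_OF_LINE.fullmatch(cmd):
            lines.append("".join(current))
            current = []
        else:
            glyph = GLYPH.fullmatch(cmd)
            if glyph:
                # A leading "w" is treated as the space between words.
                current.append((" " if glyph.group(1) else "") + glyph.group(2))
    return lines
```

Fed the excerpt from Ricky's message, this gives `[spoked8teardroppropellerstar] works`, `char`, `uesc` and `[u1F0A1] name`: the second and third lines start directly with the Latin text, with neither a named character nor the word break in front of it, which matches what he described.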
