At 2022-07-10T16:55:41-0500, Dave Kemper wrote:
> Why are tmac/tty-char.tmac and tmac/tty.tmac separate files? They
> both seem to serve essentially the same purpose: providing fallback
> character sequences for characters that may not exist in the terminal
> environment's encoding (especially the limited sets available to ASCII
> and Latin-1).

I can't speak to this comprehensively since I'm still the new guy, but
I can distinguish them in a few ways.

First of all, there is always a macro file named for the output driver
that gets run via troffrc. (In fact, that's what most of the lines in
the shipping troffrc are concerned with. There's some indirection
involved to support rendering with gxditview.) So there's a ps.tmac,
pdf.tmac, html.tmac, and so on.

tty.tmac does a few things that _aren't_ character/glyph
definition-related. It sets the page offset to zero, which makes sense
for video terminals and their emulators. It defines some color names,
which other output device macro files also do. Finally, it
conditionally loads some macro files particular to the output character
encoding. Those in turn deal with the translation of input character
code points. (I find these facts to be in some tension, and it may be
part of what Werner flagged years ago as an undesirable coupling
between character encodings and output devices.)

> But they are called from different places: grotty's man page specifies
> that troffrc loads tty.tmac automatically for any terminal device;
> thus, these two commands, using a character defined in tty.tmac, both
> produce the same output:
>
> $ echo '\[lh]' | nroff | cat -s
> $ echo '\[lh]' | groff -Tascii | cat -s
>
> But, as its man page says, the nroff script specifies tty-char.tmac,
> which means that whether this file is loaded depends on which command
> is used. This example uses a character defined in tty-char.tmac:
>
> $ echo '\[dg]' | nroff | cat -s
> <*>
>
> $ echo '\[dg]' | groff -Tascii | cat -s
> troff: <standard input>:1: warning: can't find special character 'dg'
>
> $
>
> There may be a valid reason for this separation, but I'm not sure what
> it is. What are the use cases where a user wants one set of
> definitions but not the other?
>
> A comment in tty-char.tmac (reiterated on the nroff man page) gives
> some small hint: "the optical appearance of the definitions contained
> in this file is inferior... to those of the replacement characters
> defined in the file tty.tmac." But this must be a statement in
> general terms, as only one character is defined in both files:
>
> $ egrep -h '^\.(f|tty-)char ' tmac/tty*.tmac | cut -f2 -d\ > /tmp/sym
> $ diff <(sort /tmp/sym) <(sort -u /tmp/sym)
> 232d231
> < \[sd]
> $ rm /tmp/sym
> $ fgrep '\[sd]' tmac/tty*
> tmac/tty-char.tmac:.tty-char \[sd] ''
> tmac/tty.tmac:.fchar \[sd] \[dq]
>
> (And in this case, which fallback definition of the arc-second sign is
> "superior" is a judgment call, but whichever one it is ought to be
> used across the board. Both are pure ASCII; \[dq] is ASCII 34, the
> straight double-quote character.)

I think that definition in tty-char.tmac should be dropped. I added
fallbacks in tty.tmac for \[fm] and \[sd] (both CSTR #54 glyphs) in May
2021. I seem to remember that Ingo followed suit at least for the
latter in mdoc.

> So the upshot isn't that using only tty.tmac gives "better" results;
> the upshot is that using only tty.tmac leaves some character escapes
> undefined for terminals, provoking warnings and omitting information
> from the output.
>
> And in a couple of cases, tty.tmac (the file that's always loaded)
> seems to presume the loading of tty-char.tmac (the file that's NOT
> always loaded): tty.tmac defines two glyphs in terms of \[rn], but
> \[rn] is defined for terminal devices in tty-char.tmac.

What I was trying to do was shut up warnings from the groff_char(7) man
page. We simply can't be sure (without testing "\*[.T]") in tty.tmac
that we're going to have an output device that can do anything
approximating an extension for a radical sign. Its most useful
application requires overstriking capability, which we don't actually
expect any of our terminal output devices to have.
Nevertheless, rendering the character in isolation is possible on every
such device _except_ ascii.

> Does these files being separate serve some purpose, or is it an
> evolutionary accident?

If I tilt my head far enough to one side I can imagine a world where
you use tty.tmac when you're trying to test a document in some kind of
device-independent way, and you use tty-char.tmac when you need
"practical" rendering. But for that scenario to hold, practically all
fallback character definitions need to move from tty.tmac to
tty-char.tmac.

Doing that would also be in tension with what I think was a historical
effort (in groff) to characterize "nroff" as being training wheels for
AT&T *roff refugees. The idea was, I think, to steer everyone to the
groff(1) front end for one-stop shopping. Perhaps the notion was to
someday deprecate nroff in favor of groff's grand unification of data
flow. And man-db's man(1) seems to honor that intention.

However, adding features like output character encoding detection to
nroff(1) stymied that objective, and nroff's unconditional loading of
tty-char.tmac compounded the...stymieing.

I personally find groff's nroff command to be useful and a
keystroke-saver.

There also remains the matter of "fallbacks.tmac", which at least
documents its purpose as "generic", by which I think we are to assume
output-driver independent.

So that's a lot of words to tell you that I don't have a good answer.
The comments at the tops of fallbacks.tmac and tty-char.tmac do appear
to articulate coherent objectives, but I'm not sure we (and I include
myself in that pronoun since I've mucked about with both) have hewed to
them as reliably as we should.

Colin Watson's feedback as man-db maintainer might be helpful here.

Regards,
Branden
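
For anyone who wants to experiment with the "\*[.T]" test mentioned
earlier, here is a minimal sketch. It is not anything groff ships: it
assumes a fallback for \[rn] is wanted only when the output device is
"ascii", and the underscore used as the replacement is purely a
placeholder chosen for illustration.

```roff
.\" Sketch only; not taken from tty.tmac or tty-char.tmac.
.\" \*[.T] interpolates the name of the output device selected with -T.
.\" The underscore is a placeholder replacement, chosen for illustration.
.if '\*[.T]'ascii' .fchar \[rn] _
```

Because an .fchar definition is consulted only when the current font
lacks the glyph, the device comparison is the part that limits the
definition to the ascii device.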