Hi Russ,

At 2022-12-23T10:03:13-0800, Russ Allbery wrote:
> "G. Branden Robinson" <g.branden.robin...@gmail.com> writes:
> > That's fair, and it isn't the first time I've heard capable people
> > express the opinion that having a document translator produce
> > idiomatic man(7) font alternation macro calls rather than chains of
> > font selection escape sequences was Just Too Damned Hard. If I
> > could show people how to do it, I might do so with a swagger, but I
> > confess I can't cash that check at present.
>
> Yeah, the difficulty lies mostly in the layering, because people can
> write POD source that is nonsensical in a man page context but that I
> still have to do something with. Stuff like
> C<<< B<< L<foo(1)> >> >>>.
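
To make the layering problem concrete, here is a minimal Python sketch of what a translator has to do when it emits chains of font selection escape sequences for nested markup. It is an illustration only, not Pod::Man's actual code (which is Perl). The node model and the name `render` are hypothetical. The sketch assumes one-letter font names (R, B, I), lets the innermost markup win rather than combining styles, and does no escaping of the text itself. The formatter recalls only a single previous font via \fP, so the generator names the enclosing font explicitly each time a nested span ends.

```python
# Hypothetical mini-model of nested markup: a node is either a plain
# string or a (font, children) tuple, where font is a one-letter roff
# font name such as "B" or "I".  This is an assumption for illustration.

def render(node, enclosing="R"):
    """Return roff text for node, restoring `enclosing` afterwards.

    The Python call stack stands in for the font stack that the
    formatter does not keep: each level knows which font to return to.
    """
    if isinstance(node, str):
        return node  # no escaping of special characters is attempted
    font, children = node
    body = "".join(render(child, font) for child in children)
    # Select our font, emit the content, then name the enclosing font
    # explicitly instead of relying on \fP.
    return "\\f" + font + body + "\\f" + enclosing


print(render(("B", ["bold ", ("I", ["italic"]), " bold again"])))
# prints: \fBbold \fIitalic\fB bold again\fR
```

Producing idiomatic `.BR foo (1)` style macro calls instead would require the generator to look at neighboring spans as a group, which is the harder problem discussed above.
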

The *roff language does not maintain a stack of typeface changes. How
radical a change to POD would it be to reject constructions like the
above?

> It makes no sense to make the man page reference, which one could
> otherwise nicely represent as:
>
>   .BR foo (1)

Right.

> also bold and fixed-width, but if that's what someone wrote in the POD
> source, I have to do *something* with it. And that means either
> trying to analyze global state or having to parse the *roff that I
> output in an earlier stage.

Not a fate I would wish on you.

> > Here, I know your pain. I took it upon myself to document this shit.
>
> Thanks for this, I should have thought to look at the groff manual
> about it.

The groff 1.22.4 and earlier docs covered this subject, but not in the
detail I quoted. That's a recent rewrite for the forthcoming groff
1.23.0.

I have at some point seen a pithy, one-sentence description of the macro
quotation rules that I _think_ covered all of the cases in my own
lengthy presentation, but to correctly parse it, the reader would have
to engage maximum standards-lawyer brain. I felt that a slower-paced
presentation, with examples, was a better approach.

No \(dq, no peace; know \(dq, know peace. And it's not even a groffism.

> That corrected a few of my misconceptions about macro arguments.
> (It's very easy for this stuff to all become cargo-cult.

Oh yes indeed. I've plowed over some of groff's own man pages of their
ersatz airstrips.

> I refer to CSTR 54 all the time, but of course that's limited in its
> detail.)

Some people say that document is all you need to decide any question at
any level of detail, and assemble pyres for the burning of witches who
catalog its errata.

Unix is like Catholicism. Every aspect has a patron saint with a
devotional cult.

> > I sure hope the reason this was done the way it was because any more
> > accessible approach ran the PDP-11 out of memory.
> > Murray Hill's agonizingly slow adoption of 'aq' and 'dq' special
> > character identifiers I find difficult to explain given that they
> > bought and paid for a font that included these glyphs on their very
> > first typesetting device.

I should clarify this.[1]

> Yes, it's frustrating that one can't portably just use the special
> character escapes everywhere.

You just about could, if the maintainers of descendants of
device-independent troffs would spend less than five minutes of effort.
But they won't. They are the natural partners of the "sola scriptura"
party I mentioned above.

> The additional problem that Pod::Man has is that I want to add double
> quotes around literal text if and only if I'm rendering with nroff.
> With troff, the font change is sufficient and I don't want to add
> quotes.

A lot of man pages use bold for literals, even on terminal devices. I
tend to in groff's own pages, but I _also_ quote multi-word or
potentially ambiguous literals in case the man page is viewed in a
context that strips the typeface (like copying and pasting into an
email).

> The simplest way to do that normally is with a string that's
> defined to either the empty string or the quote mark depending on
> whether rendering is with nroff or troff, but this causes no end of
> hassles when it's inside macro arguments, not to mention the need to
> work around Solaris bugs with font changes.

mdoc(7) "solved" this with a bespoke recursive approach that interpreted
macro arguments as macro names and called them.

I recently proposed adding a `Q` macro to a future groff man(7), but (1)
it is meant only for simple/common cases since the problem I perceive is
that man page writers struggle to use quotation at all and (2) since it
would be a groff feature, Pod::Man either can't use it or would need to
define its own fallback.

I can't think of a way to cut this knot in a Solaris troff-compatible
way. The combination of the "what was my previous font again?"
bug and refusal to define a special character for "dq" may make it
intractable. But that reinforces the point that the problem with Solaris
troff is not that it is inherently incapable, but that it is frozen and
unmaintained. If someone red-teamed it and found a half dozen security
vulnerabilities, what could we expect Oracle to do about it?

> I'm fairly sure there's some better way of handling this than what I'm
> currently doing, but my brain has not managed to come up with it yet.

Maybe we can put our heads together on this when 1.23.0 is behind me.

> > Whither this antipathy for the neutral apostrophe?
>
> This has been an interesting long-term struggle. It was the GNU
> coding style for years to use `' as matched quotes. I think they've
> finally switched to Unicode quotes instead.

Sort of. I'd say more that it finally acknowledged the existence of ISO
8859 (free ECMA-94 copy here[2]). So at long last they advise people to
simply use ' and ", each paired with themselves.[3]

> Technically, of course, the English apostrophe isn't neutral; it's
> curved to the left.

Right. In formal typography that is true. The idea in *roff is, and as
I understand it always has been, to express the glyph you _mean_, and
the output device will do its best to honor your requirements. So, in a
roff document, when people type "can't", they want whatever constitutes
a typographical apostrophe in the output. When they say

  char c = \[aq]\[rs]\[aq]\[aq];

they want the characters that the C language definition identifies as
having special meaning.

> But the ASCII character is used and abused for a bunch of different
> things that aren't really apostrophes.

Yes.
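
As a small illustration of "express the glyph you mean" from a generator's point of view, here is a Python sketch that rewrites literal code text so the neutral apostrophe, double quote, and backslash are requested by name. The function name `escape_literal` and the table are hypothetical. The bracketed `\[xx]` form is groff syntax, so this is an assumption that the output targets groff; as noted above, some traditional troffs do not define `dq` at all. Only these three characters are handled.

```python
# Characters in literal (code) text that should come out exactly as
# typed, mapped to special character escapes.  Assumes groff's \[xx]
# syntax; a traditional troff would need \(aq and \(rs instead, and may
# lack dq entirely.
SPECIALS = {
    "'": r"\[aq]",   # neutral apostrophe
    '"': r"\[dq]",   # neutral double quote
    "\\": r"\[rs]",  # reverse solidus (backslash)
}


def escape_literal(text):
    """Escape literal text in a single pass.

    A single pass matters: the backslashes introduced by the escapes
    themselves must not be escaped a second time.
    """
    return "".join(SPECIALS.get(ch, ch) for ch in text)


print(escape_literal("char c = '\\'';"))
# prints: char c = \[aq]\[rs]\[aq]\[aq];
```

Ordinary prose such as "can't" would not go through this function, since there the typographical apostrophe is what the writer means.
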
It has been a painful process for ISO 8859 and then Unicode to get
people to think more abstractly about the "characters" that they mean to
write instead of the "glyphs" that appear before them in their own
composition environment while carrying an assumption that it is the
programming system's responsibility to do what they mean, and make
anyone who reads their output see the same thing.

It has been a decades-long process to pull people up from a
point-and-grunt mentality of typography, and there is still a way to go.

> > With the last proprietary Unixes finally retiring to their coffins
> > or at least throwing in the towel on any delusions of troff
> > maintenance, maybe people will take up some of these conveniences at
> > last.
>
> Speaking as someone maintaining a generator, it's very difficult to
> know when I can drop support for old Unixes. It's also very painful
> to be wrong; if I delete a bunch of compatibility code, and then later
> someone really wants it back, adding it back in is awful.

Does that mean you're not hopeful that you will be dropping support for
Solaris troff soon after Oracle does? I learned the following from Paul
Eggert on this list just last month.[4]

PE> Solaris 10 is no longer supported after January 2024, so if it and
PE> all the other traditional troffs die out by 2024 we can stop
PE> worrying about this then.
PE>
PE> Solaris 11.4, the only Oracle Solaris version that is planned to be
PE> supported after January 2024, is based on groff 1.22.3 instead of on
PE> traditional troff. See:
PE>
PE> https://docs.oracle.com/cd/E88353_01/html/E37839/troff-1.html
PE> https://www.illumos.org/issues/12692

This could buy you a lot of elbow room. (groff 1.22.3 is 8 years old,
but...one dose of Geritol at a time.)

Regards,
Branden

[1] As we can see from the 1976 edition of CSTR #54,[5] the C/A/T's
    "ASCII apostrophe" was not a "neutral apostrophe" as groff
    documentation today describes it; it was mirror symmetric with the
    grave accent and so what you probably do is alias it with \(aa.

    No roff can guarantee what the glyphs formatted by a typesetter or
    terminal will _look like_. That is why we more properly call the
    "R", "B", and "S" files _font descriptions_.

    As I explain in groff_char(7), ASCII outright encouraged the
    semantic ambiguity of some code points. It was not until ISO 8859
    that code point 39 acquired unambiguous "neutral" semantics.

[2] https://www.ecma-international.org/wp-content/uploads/ECMA-94_2nd_edition_june_1986.pdf
[3] https://www.gnu.org/prep/standards/standards.html#Quote-Characters
[4] https://lists.gnu.org/archive/html/groff/2022-11/msg00179.html
[5] https://www.dropbox.com/s/qpk9id0b3w5hu5g/CSTR_54_1976.pdf?dl=0
    (that URL might not work forever)