Hi Chet,

At 2023-10-11T10:22:44-0400, Chet Ramey wrote:
> On 10/11/23 5:08 AM, G. Branden Robinson wrote:
> > Please consider reverting the following recent changes to the bash
> > man page. Bjarni should have run them by the groff list first,
> > because some of them are ill-considered.
>
> OK. I'm trying to understand them myself; please take my comments in
> that spirit.

No worries. My concern with some of the changes is that they risk
mystifying people who encounter them ("`kern`? `ss`? `lg`? What are
those?") without delivering concrete benefit, typographical or
otherwise. groff_man_style(7), as of groff 1.23.0, attempts to document
all of the *roff syntax a man page author is ever likely to need, and
strives _not_ to introduce any other *roff features or typesetting
concepts.[0]

> > +.\" suggested by Bjarni Ingi Gislason <bjarn...@simnet.is>
> > +.if n \{\
> > +.kern 0
> > +.ss 12 0
> > +.\}
> >
> > The above change is half pointless and half intrusive.
> >
> > A) No formatter for terminal output devices ("nroff mode", which is
> >    tested by "if n") performs kerning. So that's a no-op.
> >
> > B) The amount of intersentence spacing, for man pages, is a matter of
> >    the _reader's_ taste and should be left to them. mandoc(1)
> >    ignores this request and I'm glad it does. So that, too, is a
> >    no-op with that formatter.
>
> Is his intent here to force French spacing instead of English spacing?

Yes, if you understand "French spacing" to mean "the space between
sentences is the same as the space between words". Frustratingly,
"French spacing" has multiple incompatible meanings.[1]

> How does groff deal with input where the number of spaces after a
> period varies?

roff(7) and the groff Texinfo manual cover this--clearly, I hope. If
not, blame me because the language is mine, and I'll try to improve it.

(groff 1.23.0; UTF-8 follows)

    A roff formatter attempts to detect boundaries between sentences,
    and supplies additional inter‐sentence space between them. It flags
    certain characters (normally “!”, “?”, and “.”) as potentially
    ending a sentence. When the formatter encounters one of these
    end‐of‐sentence characters at the end of an input line, or one of
    them is followed by two (unescaped) spaces on the same input line,
    it appends an inter‐word space followed by an inter‐sentence space
    in the output.
    The dummy character escape sequence \& can be used after an
    end‐of‐sentence character to defeat end‐of‐sentence detection on a
    per‐instance basis. Normally, the occurrence of a visible
    non‐end‐of‐sentence character (as opposed to a space or tab)
    immediately after an end‐of‐sentence character cancels detection of
    the end of a sentence. However, several characters are treated
    transparently after the occurrence of an end‐of‐sentence character.
    That is, a roff does not cancel end‐of‐sentence detection when it
    processes them. This is because such characters are often used as
    footnote markers or to close quotations and parentheticals. The
    default set is ", ', ), ], *, \[dg], \[dd], \[rq], and \[cq]. The
    last four are examples of special characters, escape sequences
    whose purpose is to obtain glyphs that are not easily typed at the
    keyboard, or which have special meaning to the formatter (like \).

That reads a bit better with font style changes, so "man 7 roff" might
be preferable.

> My personal writing style has changed from two spaces to one over a
> number of years, and the man page reflects that.

For _input_, it's a good idea to either break lines at the ends of
sentences, or put two spaces after them. This is so that the formatter
knows where the ends of the sentences are. Like TeX, *roff is not smart
enough to know where the sentence boundary/ies are in "C. A. R. Hoare
next came to the U.S. Linux kernel developers have yet to absorb his
lessons."

For output, the amount of inter-sentence space is configurable; that is
what the `ss` request does.[2] For man pages, I strongly urge all
authors to leave the issue alone so as to respect readers' preferences.
Since authors' preferences will differ, this is the only way to achieve
consistency.[3]

People can get pretty passionate about this, and complain of their
eyeballs being violated when the "wrong" amount of inter-sentence space
is employed in a document they're reading.
Some people bring this passion even to man page _source_ documents, and
the only recourse in that event is to break input lines at the ends of
sentences. This has also been Brian Kernighan's advice to troff users
since the 1970s.[4] Linux man-pages maintainer Alejandro Colomar calls
this practice "semantic newlines". My opinion is that it is a Solomonic
solution, satisfying neither partisan camp, but also has a benefit of
reducing the amount of churn in diffs. Incremental changes to
documentation often find boundaries at sentences.

> > This change is pointless because no ligatures are defined for any of
> > the letter pairs in the text in any known formatter (the ligature
> > for "ct", like that for "st" [not seen here] is archaic in English
> > typography and seldom seen in digital fonts).
>
> I assume he was interested in what formatters do with the `fi'. I
> couldn't see any discernable difference myself.

Right. It will make no difference (1) when formatting for terminals;
(2) when formatting for a typesetter that doesn't support ligatures, or
when a font lacking them is used (Courier is a good example); or (3)
when copy-and-pasting from PDF to a shell prompt. PDF has a
feature--which groff's gropdf(1) exercises--called "CMap" that
decomposes ligatures to their constituent letters when copied to the
system clipboard or other selection buffer. (I assume the feature
exists for exactly this reason.)

Thus, in my opinion, that change was a lot of rigmarole for nothing.

> > Authorities differ on whether space should surround em dashes; from
> > what I have seen, a majority favor omitting them, and that is what I
> > do in the groff man pages, but I cannot say it is more than a matter
> > of taste.
>
> I think it's cleaner with spaces, but it's clearly personal taste.

Sure.
Closing up spaces around em-dashes isn't quite _my_ preference, either,
in part because it can make *roff input a little uglier in some edge
cases (usually involving font alternation macros), prompting use of the
much-feared and mysterious `\c` escape sequence (or simple resignation
to subpar formatting, with the usual follow-up threats to switch to
Docbook or Markdown or whatever). It has now been years since `\c`
surprised me. I _think_ I have it documented adequately in groff
1.23.0. I trust that someone will tell me if I don't.

When I last surveyed the issue, the balance of authorities seemed to
disfavor spacing around em-dashes, I wanted consistency in the groff
man pages, and there were too many other issues where I sought revision
or reform and had more appetite for argument. I'm an incorrigible
windmill-tilter, but I only have so many lances, you see...

Regards,
Branden

[0] I do perceive some gaps, like the absence of macros for "keeps" and
    for quotation; the latter is unreasonably hard to achieve
    attractively and portably at the same time. Maybe some of these
    gaps will be filled in groff 1.24. In compensation, the `SB`
    macro, a Sun extension that many people seem to believe came from
    Bell Labs, is now documented as deprecated in groff Git; I believe
    I solved the mystery of its origin and motivation. It is
    unnecessary in modern implementations.

[1] The following may provoke laughter, a headache, or both.

    https://en.wikipedia.org/wiki/History_of_sentence_spacing#French_and_English_spacing

[2] The "Files" section of groff_man_style(7) illustrates how to tune
    this and other subjective parameters that man page authors
    misguidedly attempt to impose on others via their documents. Note
    also the "Options" section.

[3] Not exactly true. mandoc(1) takes a Henry Ford approach; you get
    the adjustment and hyphenation modes, inter-sentence spacing, and
    so forth that its maintainer thinks you should see, ignoring
    requests that would alter them.
    If you don't like those defaults, tough![5] (This decision is
    understandable in context.)

[4] https://rhodesmill.org/brandon/2012/one-sentence-per-line/

[5] https://www.dourish.com/goodies/see-figure-1.html
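
Addendum: for anyone who finds rules easier to read as code, here is a
rough Python sketch of the end-of-sentence detection described in the
roff(7) excerpt above. It is an illustration only, not groff's
implementation. The names `sentence_ends`, `PATTERN`,
`END_OF_SENTENCE`, and `TRANSPARENT` are made up for this sketch, and
it assumes the default character sets listed in the excerpt; it ignores
any reconfiguration of those sets and every other formatter subtlety.

```python
import re

# Characters flagged as potentially ending a sentence (the defaults
# named in the roff(7) excerpt).
END_OF_SENTENCE = r"[.!?]"

# Characters and special character escape sequences treated
# transparently after an end-of-sentence character (the default set
# from the excerpt): " ' ) ] * \[dg] \[dd] \[rq] \[cq]
TRANSPARENT = r"""(?:["')\]*]|\\\[(?:dg|dd|rq|cq)\])"""

# An end-of-sentence character, then any run of transparent characters,
# then either the end of the input line or two spaces.  Anything else
# after the end-of-sentence character (including the dummy character
# escape sequence \&) cancels detection.
PATTERN = re.compile(END_OF_SENTENCE + TRANSPARENT + r"*(?=$|  )")


def sentence_ends(line: str) -> list[int]:
    """Return the offsets in one input line where this sketch would
    detect the end of a sentence (hypothetical helper, not a groff API).
    """
    return [m.end() for m in PATTERN.finditer(line.rstrip("\n"))]


if __name__ == "__main__":
    hoare = ("C. A. R. Hoare next came to the U.S. Linux kernel "
             "developers have yet to absorb his lessons.")
    # Typed with single spaces, only the end of the line is detected.
    print(sentence_ends(hoare) == [len(hoare)])
```

With single spaces throughout, the Hoare example yields just one
detected sentence end, at the end of the input line, which is why the
initials come out fine and why the boundary after "U.S." is missed.
Putting two spaces (or a line break) after "U.S." makes the sketch
report that boundary as well, and writing `etc.\&` before two spaces
suppresses a false one.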