Hi Alex,

At 2022-12-05T13:35:42+0100, Alejandro Colomar wrote:
> On 12/5/22 09:15, G. Branden Robinson wrote:
> > [[The fix]] would be something like this:
> >
> > -3: # 3 C S c s 3: ! + 5 ? I S ] g q {\n"
> > +3: # 3 C S c s 3: !\& + 5 ?\& I S ] g q {\n"
> > -6: & 6 F V f v 6: $ . 8 B L V \\` j t \\(ti\n"
> > +6: & 6 F V f v 6: $ .\& 8 B L V \\` j t \\(ti\n"
>
> Thanks!

You're welcome, but I think we might have talked past each other below.

> Sure, I try to do it consistently. If I Cc you is a "just read it if
> you want, not forced, maybe you're busy and someone else on groff@
> picks it up". :)

Works for me. :)

> > what's going on here [the problem that Helge reported]
> > is actually a GNU tbl(1) bug.
> >
> > https://savannah.gnu.org/bugs/?61909
>
> I think I'll keep this as a WONTFIX.
>
> The man-pages don't have stable releases (i.e., what you get at the
> time your distro releases is what you'll get forever), so stable users
> will have this bug unfixed forever until they dist-upgrade, even if I
> fixed it.
>
> Soon (we hope), groff 1.23.0 will be released, so next OS releases
> (e.g., Bookworm) won't have this bug (and many others that you fixed).
>
> So, the only problem is for those who use stable distros, but somehow
> install the fresh man-pages.

No, that is not the case.

Because there _aren't_ dummy characters \& after the sentence ending
punctuators [!?.] that are followed by multiple space characters in the
ascii(7) page today, _and_ every known released version of GNU tbl
incorrectly applies the configured inter-sentence space to the second
space character after such punctuators, people are getting incorrect
output _now_ from this table, and any others that regex-match "[.!?]  "
in ordinary text blocks if their configured inter-sentence space amount
is not the default.

That last condition is in fact common for non-Anglophone users of groff.

Let me show you a simple exhibit and then I'll drown you with more
background.

---snip---
$ cat EXPERIMENTS/iss.man
.TH foo 1 2022-12-05 "groff test suite"
.SH Name
foo \- frobnicate a bar
.SH Description
.TS
L.
Foo.  Bar.
.TE
.ss 12 0
.TS
L.
Baz.  Qux.
.TE
.TS
L.
Hep.\&  Sid.
.TE
$ nroff -t -man EXPERIMENTS/iss.man # groff 1.22.4 (Debian)
foo(1)                  General Commands Manual                 foo(1)

Name
       foo - frobnicate a bar

Description
       Foo.  Bar.

       Baz. Qux.

       Hep.  Sid.

groff test suite               2022‐12‐05                       foo(1)
$ ./build/test-groff -t -man -Tascii EXPERIMENTS/iss.man # groff Git
foo(1)                  General Commands Manual                 foo(1)

Name
       foo - frobnicate a bar

Description
       Foo.  Bar.

       Baz.  Qux.

       Hep.  Sid.

groff test suite               2022-12-05                       foo(1)
---snip---

So, a table entry _lacking_ these dummy character escape sequences \& is
exposed to the old groff bug, which still exists in the wild on every
system until last week, I suppose.

(This bug is not man(7)-specific. It will affect any groff document
regardless of macro package.)

Lengthy background
==================

It can be seen that the difference in output was prompted by this line.

.ss 12 0

The formatter's default is equivalent to this.

.ss 12 12

The function of the number "12" is not obvious here; it arises from
traditions of mechanical typography. But what it _means_ is, "put one
word space between each word and put one (additional) word space between
sentences on the same output line".

Yeah, but nobody should be manipulating the inter-sentence spacing in a
man page, right?

Right. But, localization files...

$ git grep 'ss 12 0' tmac
tmac/cs.tmac:.ss 12 0
tmac/de.tmac:.ss 12 0
tmac/fr.tmac:.ss 12 0
tmac/groff_man.7.man.in:\&.ss 12 0 \e" See groff(@MAN7EXT@).
tmac/it.tmac:.ss 12 0
tmac/sv.tmac:.ss 12 0

Not to mention the fact that this request could appear in a troffrc or
man.local file.

In short, this is a user-configurable parameter and a portable man page
should not assume the inter-sentence spacing amount.

\& works to hide the bug even on old (well, current :-/ ) GNU tbl
because it suppresses the detection of sentence endings altogether.

\& does have other semantics in tbl(1) tables; it is used to align the
units place in columns using a numeric format (classifier "N" rather
than "L" or "C", for instance), but I've never in my life seen that
format used in a man page. (It is also hard to grep for without gagging
on false positives.)
But, in principle, telling people just to work around the bug by adding
\& in _all_ circumstances is a bad idea for this reason.[1]

There's a lot of bloody history around inter-sentence spacing, enough
that we have to cover the subject in the groff Texinfo manual,[2] and it
is compounded by luminaries like the general editor of the Chicago
Manual of Style lying to the public about that history. groff maintains
compatibility with AT&T troff in this area.

In Europe, supplemental inter-sentence space is _not_ common, and I
gather there is some kind of official European Union style guide that
militates against it. It is binding only upon official EU publications,
but many organizations have adopted it nonetheless--it saves the expense
of maintaining a style guide of one's own, and plenty of people in the
U.K. who voted for and celebrate BrExit nevertheless slavishly follow EU
prescriptions in this area.

> That can be random people that install random packages from source, or
> contributors to the pages. For both of them, I specify the
> dependencies in the INSTALL file, so I hope they don't blame me too
> much; they should ask their distributor about backporting groff 1.23.0
> for installing the pages from source, or install groff from source, or
> be happy with small glitches like this :)

I understand if you don't want to mess with a belt-and-suspenders
approach, but I want to make sure you're making an informed decision. :)

> However, things like .MR concern me more.

Me too. I'm trying to contain my expectations because history is
replete with nice new features that suffered deaths of neglect.

(warning: inside baseball^W^Wgroff internals)

Right now even email and web URLs in man pages aren't hyperlinked in
PDF, and that's silly. So I'm trying to orthogonalize man(7) hyperlink
support so I can couple it to gropdf(1)'s "pdfmark" support. Or I would
be working on it, if the under-documented "pdfhref" macro weren't
structured to make it a pain in the ass.
I guess whoever designed that didn't expect someone to format link text
in a diversion. Also I discovered an exciting new (old) bug when
formatting HTML. :(

Anyway, once that is done, I can integrate Deri James's cool trick for
converting "local" man page cross references into PDF bookmarks, so you
do something like, hypothetically,[3] produce a 380-page compilation of
60 man(7) and mdoc(7) documents that have hyperlinked cross-references
to each other, and present "man:blah(1)" hyperlinks for pages outside
that collection.

I might fail at orthogonalizing, but I'll do my damnedest to at least
get this _working_. ("groff 1.24: the same but with elegance"... :-| )

> I'd be happy doing some radical changes and requiring 1.23.0 as a bare
> minimum, and use MR right after the Bookworm release.

[insert Kang and Kodos clip]

> Hopefully that triggers backporting of groff; maybe you can do that as
> a future maintainer of the Debian package? :P

Maybe, if groff 1.23 proves not to have many surprising regressions,
that would be feasible, but I would prefer to delegate that sort of
task. Build a team wherever you can.

A backport is more likely to happen if groff 1.23 proves not to have
many regressions from 1.22.4. I've gone to considerable lengths to
avoid that: I have automated test #152 in my working copy now. (groff
1.22.4 had three.)

> > [1] (groff insider stuff)
>
> The parentheses in here help a lot with long messages :)

I fear "tl;dr" was coined around 1999 by people exposed to my emails.

Regards,
Branden

[1] tbl uses the _leftmost_ `\&` in a numerically formatted entry as the
alignment position. For instance, imagine a business that produced
formatted reports by accepting text input from a terminal^Wweb form.
Also assume that the report generator wasn't too fastidious about
tidying up that input.

.\" nroff -t | cat -s
.TS
tab(@);
C S
C S
L N.
Amy's Kennels
Boarded Animals, Week of 2022-12-05
Size@Name and check-in weight (kg)
Large@Max 25.6
\^@Sassy. 44.8
Small@Henrietta 6.24
\^@T. J. Peepers.\& (chinchilla) 3.03
.TE

This is not a _well_-designed table, but it is a _plausible_ one. Well,
almost.[4] But adding another \& later at the "real" position where the
decimal point should be aligned will not help, because the leftmost one
controls.

[2] https://git.savannah.gnu.org/cgit/groff.git/tree/doc/groff.texi?id=aa20f5961cb0788e888180c57add5a452ce9d8d6#n4976

[3] https://git.savannah.gnu.org/cgit/groff.git/tree/doc/doc.am?id=aa20f5961cb0788e888180c57add5a452ce9d8d6#n257

[4] I'd like to meet the web-form-using kennel service staffer who knew
to sneak *roff escape sequences into the input. But we all know that
failure to validate input is as common as street litter.
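
A rough aid for the regex check mentioned in the main text (sentence-ending punctuation followed by multiple spaces inside a table): the Python sketch below lists such lines in a man page source. It is only an illustration. The function name `find_risky_table_lines` and the pattern `RISKY` are made up for this example and are not part of groff or the man-pages tooling. It also assumes that every line between `.TS` and `.TE` is worth checking; it does not parse tbl format specifications or treat text blocks specially.

```python
import re

# Hypothetical pattern: '.', '!' or '?' followed by two or more spaces.
# A dummy character \& placed right after the punctuation prevents a match.
RISKY = re.compile(r"[.!?] {2,}")


def find_risky_table_lines(source):
    """Return (line_number, text) pairs for lines inside .TS/.TE regions
    that match RISKY.  Line numbers are 1-based."""
    hits = []
    in_table = False
    for number, line in enumerate(source.splitlines(), start=1):
        if line.startswith(".TS"):
            in_table = True
        elif line.startswith(".TE"):
            in_table = False
        elif in_table and RISKY.search(line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        ".TS", "L.", "Foo.  Bar.", ".TE",
        ".TS", "L.", "Hep.\\&  Sid.", ".TE",
    ])
    for number, text in find_risky_table_lines(sample):
        print(f"{number}: {text}")
```

Run on the two-table sample, this reports only the `Foo.  Bar.` entry; the `Hep.\&  Sid.` entry is skipped because the dummy character sits between the period and the spaces. Any hit still needs a human look before adding \&, given the numeric-alignment caveat in [1].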