Hi Branden,

On 12/5/22 15:06, G. Branden Robinson wrote: [...]
You're welcome, but I think we might have talked past each other below.

Sure, I try to do it consistently. If I Cc you, it's a "just read it if you want, not forced, maybe you're busy and someone else on groff@ picks it up". :)

Works for me. :)

what's going on here [the problem that Helge reported] is actually a GNU tbl(1) bug. https://savannah.gnu.org/bugs/?61909

I think I'll keep this as a WONTFIX. The man-pages don't have stable releases (i.e., what you get at the time your distro releases is what you'll get forever), so stable users will have this bug unfixed forever until they dist-upgrade, even if I fixed it. Soon (we hope), groff 1.23.0 will be released, so next OS releases (e.g., Bookworm) won't have this bug (and many others that you fixed). So, the only problem is for those who use stable distros, but somehow install the fresh man-pages.

No, that is not the case. Because there _aren't_ dummy characters \& after the sentence-ending punctuators [!?.] that are followed by multiple space characters in the ascii(7) page today, _and_ every known released version of GNU tbl incorrectly applies the configured inter-sentence space to the second space character after such punctuators, people are getting incorrect output _now_ from this table, and any others that regex-match "[.!?] " in ordinary text blocks if their configured inter-sentence space amount is not the default.
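The pattern quoted above can be turned into a rough check. What follows is only a minimal sketch, not something shipped with groff or the man-pages: the function name find_tbl_sentence_gaps is made up for illustration, it assumes the pattern means "sentence-ending punctuation followed by two or more spaces", and it only looks at lines between .TS and .TE.

```shell
#!/bin/sh
# find_tbl_sentence_gaps: hypothetical helper (an assumption, not part of
# groff or the man-pages). Prints "file:line:text" for every line between
# .TS and .TE that has sentence-ending punctuation followed by two or
# more spaces. Lines where \& already follows the punctuation do not
# match, because the character before the spaces is then "&".
find_tbl_sentence_gaps() {
    awk '
        /^\.TS/ { in_table = 1; next }
        /^\.TE/ { in_table = 0; next }
        in_table && /[.!?]  / { printf "%s:%d:%s\n", FILENAME, FNR, $0 }
    ' "$@"
}
```

It would be run as, say, `find_tbl_sentence_gaps path/to/ascii.7`. It knows nothing about column formats, so hits still need a human look.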
My point was that:

- groff 1.23.0 will (hopefully) be available starting with Bookworm, a few months from now.
- man-pages 6.02 will reach end users at about the same time (okay, in other distros that might differ by a few months, but I expect it to be more or less simultaneous).
Since there will only be a small window between my release and yours, working around it in the man-pages will effectively be useful to 0 Debian users, and very few users of other distros. I can live with it.
In fact, I could just wait to release 6.02 until after groff, since it would still be in time for Bookworm, and that would allow for more changes in that release. However:
- I'd like to give some extra time for translators to work before the freeze, and
- The solstice is a nice day for a release. I prefer it over some random day :)
That last condition is in fact common for non-Anglophone users of groff. Let me show you a simple exhibit and then I'll drown you with more background.

---snip---
$ cat EXPERIMENTS/iss.man
.TH foo 1 2022-12-05 "groff test suite"
.SH Name
foo \- frobnicate a bar
.SH Description
.TS
L.
Foo. Bar.
.TE
.ss 12 0
.TS
L.
Baz. Qux.
.TE
.TS
L.
Hep.\& Sid.
.TE
$ nroff -t -man EXPERIMENTS/iss.man # groff 1.22.4 (Debian)
foo(1) General Commands Manual foo(1)

Name
foo - frobnicate a bar

Description
Foo. Bar.
Baz. Qux.
Hep. Sid.
Yep, I think I can live with that bug for half a year.
groff test suite 2022‐12‐05 foo(1)
$ ./build/test-groff -t -man -Tascii EXPERIMENTS/iss.man # groff Git
foo(1) General Commands Manual foo(1)

Name
foo - frobnicate a bar

Description
Foo. Bar.
Baz. Qux.
Hep. Sid.

groff test suite 2022-12-05 foo(1)
---snip---

So, a table entry _lacking_ these dummy character escape sequences \& is exposed to the old groff bug, which still exists in the wild on every system until last week, I suppose. (This bug is not man(7)-specific. It will affect any groff document regardless of macro package.)

Lengthy background
==================

It can be seen that the difference in output was prompted by this line.

.ss 12 0

The formatter's default is equivalent to this.

.ss 12 12

The function of the number "12" is not obvious here; it arises from traditions of mechanical typography. But what it _means_ is, "put one word space between each word and put one (additional) word space between sentences on the same output line".

Yeah, but nobody should be manipulating the inter-sentence spacing in a man page, right?

Right. But, localization files...

$ git grep 'ss 12 0' tmac
tmac/cs.tmac:.ss 12 0
tmac/de.tmac:.ss 12 0
tmac/fr.tmac:.ss 12 0
tmac/groff_man.7.man.in:\&.ss 12 0 \e" See groff(@MAN7EXT@).
tmac/it.tmac:.ss 12 0
tmac/sv.tmac:.ss 12 0

Not to mention the fact that this request could appear in a troffrc or man.local file.

In short, this is a user-configurable parameter and a portable man page should not assume the inter-sentence spacing amount.

\& works to hide the bug even on old (well, current :-/ ) GNU tbl because it suppresses the detection of sentence endings altogether.

\& does have other semantics in tbl(1) tables; it is used to align the units place in columns using a numeric format (classifier "N" rather than "L" or "C", for instance), but I've never in my life seen that format used in a man page. (It is also hard to grep for without gagging on false positives.)
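For concreteness, the \& workaround just described could be applied mechanically along these lines. This is a rough sketch under stated assumptions, not a recommendation and not a groff tool: the function name add_dummy_after_sentence_end is made up, it treats "two spaces after [.!?]" as the trigger, and it pays no attention to column formats, so any entry in an N column would have to be excluded by hand because \& marks the alignment position there.

```shell
#!/bin/sh
# add_dummy_after_sentence_end: hypothetical helper (an assumption, not
# part of groff). Reads roff source on stdin and, only between .TS and
# .TE, inserts the dummy character \& between sentence-ending punctuation
# and a following run of two spaces. Entries that already carry \& are
# left alone, since "&" rather than punctuation precedes the spaces.
add_dummy_after_sentence_end() {
    sed '/^\.TS/,/^\.TE/ s/\([.!?]\)\(  \)/\1\\\&\2/g'
}
```

Usage would be something like `add_dummy_after_sentence_end < page.7 > page.7.new`, followed by a diff to review each change.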
But, in principle, telling people just to work around the bug by adding \& in _all_ circumstances is a bad idea for this reason.[1] There's a lot of bloody history around inter-sentence spacing, enough that we have to cover the subject in the groff Texinfo manual,[2] and it
Hmm, I think I'll refer to that link more than once.
is compounded by luminaries like the general editor of the Chicago Manual of Style lying to the public about that history. groff maintains compatibility with AT&T troff in this area.

In Europe, supplemental inter-sentence space is _not_ common, and I gather there is some kind of official European Union style guide that militates against it. It is binding only upon official EU publications, but many organizations have adopted it nonetheless--it saves the expense of maintaining a style guide of one's own, and plenty of people in the U.K. who voted for and celebrate BrExit nevertheless slavishly follow EU prescriptions in this area.

That can be random people that install random packages from source, or contributors to the pages. For both of them, I specify the dependencies in the INSTALL file, so I hope they don't blame me too much; they should ask their distributor about backporting groff 1.23.0 for installing the pages from source, or install groff from source, or be happy with small glitches like this :)

I understand if you don't want to mess with a belt-and-suspenders approach, but I want to make sure you're making an informed decision. :)

However, things like .MR concern me more.

Me too. I'm trying to contain my expectations because history is replete with nice new features that suffered deaths of neglect.
You already have a future user here. I don't think it will die ;)
(warning: inside baseball^W^Wgroff internals)

Right now even email and web URLs in man pages aren't hyperlinked in PDF, and that's silly. So I'm trying to orthogonalize man(7) hyperlink support so I can couple it to gropdf(1)'s "pdfmark" support. Or I would be working on it, if the under-documented "pdfhref" macro weren't structured to make it a pain in the ass. I guess whoever designed that didn't expect someone to format link text in a diversion. Also I discovered an exciting new (old) bug when formatting HTML. :(

Anyway, once that is done, I can integrate Deri James's cool trick for converting "local" man page cross references into PDF bookmarks, so you can do something like, hypothetically,[3] produce a 380-page compilation of 60 man(7) and mdoc(7) documents that have hyperlinked cross-references to each other, and present "man:blah(1)" hyperlinks for pages outside that collection.
Hmm, this reminds me I also want to do that single PDF for the Linux man-pages. I'll ask you again about it when I have less stuff in queue. Right now I'm busy rewriting documentation about strings, killing some of them, and documenting when/where others should be used. I'm writing a new string(7), which should be a nice guide on which string function to use for your case.
I might fail at orthogonalizing, but I'll do my damnedest to at least get this _working_. ("groff 1.24: the same but with elegance"... :-| )
:)
I'd be happy doing some radical changes and requiring 1.23.0 as a bare minimum, and use MR right after the Bookworm release.

[insert Kang and Kodos clip]
Cheers,

Alex
Hopefully that triggers backporting of groff; maybe you can do that as a future maintainer of the Debian package? :P

Maybe, if groff 1.23 proves not to have many surprising regressions, that would be feasible, but I would prefer to delegate that sort of task. Build a team wherever you can.

A backport is more likely to happen if groff 1.23 proves not to have many regressions from 1.22.4. I've gone to considerable lengths to avoid that: I have automated test #152 in my working copy now. (groff 1.22.4 had three.)

[1] (groff insider stuff)

The parentheses in here help a lot with long messages :)

I fear "tl;dr" was coined around 1999 by people exposed to my emails.

Regards,
Branden

[1] tbl uses the _leftmost_ `\&` in a numerically formatted entry as the alignment position. For instance, imagine a business that produced formatted reports by accepting text input from a terminal^Wweb form. Also assume that the report generator wasn't too fastidious about tidying up that input.

.\" nroff -t | cat -s
.TS
tab(@);
C S
C S
L N.
Amy's Kennels
Boarded Animals, Week of 2022-12-05
Size@Name and check-in weight (kg)
Large@Max 25.6
\^@Sassy. 44.8
Small@Henrietta 6.24
\^@T. J. Peepers.\& (chinchilla) 3.03
.TE

This is not a _well_-designed table, but it is a _plausible_ one. Well, almost.[4] But adding another \& later at the "real" position where the decimal point should be aligned will not help, because the leftmost one controls.

[2] https://git.savannah.gnu.org/cgit/groff.git/tree/doc/groff.texi?id=aa20f5961cb0788e888180c57add5a452ce9d8d6#n4976

[3] https://git.savannah.gnu.org/cgit/groff.git/tree/doc/doc.am?id=aa20f5961cb0788e888180c57add5a452ce9d8d6#n257

[4] I'd like to meet the web-form-using kennel service staffer who knew to sneak *roff escape sequences into the input. But we all know that failure to validate input is as common as street litter.
-- <http://www.alejandro-colomar.es/>