Hi Ian,

At 2023-07-17T14:32:40+0100, Ian Jackson wrote:
> Hi. Thanks for your comprehensive, detailed and helpful email.

You're most welcome.

> G. Branden Robinson writes ("Bug#1041317: dgit: table too wide in man
> page, trashes autopkgtests"):
> > I noticed that groff 1.23.0-2 (and -1 before it) will be blocked
> > from ever migrating to testing due to an autopkgtest failure on
> > EVERY architecture.
>
> I'm sorry that this test is causing you an inconvenience. Thanks for
> filing the bug. I think the right severity for a bug which is
> preventing another package from migrating is serious, so I am raising
> the severity of your report.

Okay.

> > Here are the juicy bits.
> ...
> > 941s RE (?^:^(?:(?^:^(?=a)b))$)|(?^:^(?:ERROR.*)$)|(?^:^(?:.* # table wider
> > than line width)$)
> ...
> > That regex is sufficiently complex that I can't tell if it's trying
> > to filter the diagnostic message "table wider than line width" or
> > not. If it is, the fact that I have recast the language of the
> > diagnostic message in groff 1.23.0 has fooled it.
>
> Yes, that is what it is doing. Unfortunately as far as I'm aware,
> groff doesn't offer a formal warning suppression or classification
> mechanism.

It does--see troff(1), section "Warnings"--but not one that would help
us here.

You may know that tbl(1) is a troff(1) preprocessor; it reads a piece of
an input document describing a table and turns that into roff(7)
language input that lays out the table. One of the things it also does
is diagnose problems with tables, like invalid characters in table
format descriptions (like "lb l.").

Some table validity problems, however, are not known at the time the
preprocessor runs, but only at formatting time--that is, when troff
itself runs. It is only then that the dimensions of the output medium
are known. tbl therefore embeds roff control structures (like the `if`
request) and requests to write to the standard error stream (like `tm`;
mnemonic: "terminal message") when certain conditions obtain.

There is no organizational scheme for such messages.
To impose one would require changes to the roff language, or the
creation of a diagnostics-oriented macro package upon which tbl
preprocessor output would depend. Even then, tbl can't know at the time
it runs whether the macro package will be loaded when the formatter
runs. I therefore don't have high hopes for resolving this.

It would be better, probably, just to introduce opaque tokens (like
"TBLW1" or similar) to enable people to apply the regex-matching
approach with greater reliability. GNU tbl has featured only four such
messages in its history, to the best of my knowledge.

However, if I improve GNU tbl to the point that it doesn't ever throw
diagnostics that are false positives, people won't be as tempted to
scrape them away with regexes...

> So instead there is this regexp, which (as might be expected of such a
> thing) is fragile and keeps growing cases as messages change, becoming
> ever more baroque and inscrutable.

Yeah. :(

> > So let's fix the problem in the dgit man page.
> >
> > Man pages are formatted for a width of 78n if the terminal width is
> > 80 columns. The origin of this practice is not well documented but
> > experience with groff upstream leads me to believe that it is a
> > workaround for bugs in GNU tbl(1). (In AT&T Unix Version 7, they
> > were formatted for a line length of 65n, with a page offset of one
> > tenth of an inch. On Western Electric Teletype Model 37 printing
> > terminals.)
> >
> > How wide is dgit's man page?
> ...
> > 79
> ...
> > Hence the warning.
> >
> > groff 1.22.4 didn't used to throw this warning in this circumstance.
>
> I'm pretty sure at least one version of groff I encountered produced
> this warning,

That would not surprise me; while I have some command of GNU tbl's
history, I haven't mastered it.

> > The diagnostic is wholly legitimate.
> >
> > Let's have a look:
> >
> > $ MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii dgit | sed -n '/DESCRIPTION/,/OPERATIONS/p'
> > <standard input>:46: warning: table wider than line length minus indentation
> > DESCRIPTION
> ...
> > dgit-maint-gbp(7) for maintainers already using
> > git-buildpackage
> ...
> > Yup, if we look carefully, we can see that the word "git-buildpackage"
> > encroaches into the right margin.
>
> This is true in a strictly technical sense: the generated formatting
> does violate groff's intended behaviour. However, the output is,
> in fact, perfectly fine, when looked at from a user's point of view:
> the output line is 79 characters long, which as you note doesn't in
> fact currently cause trouble (other than this warning).

Yes, but until I'm convinced that GNU tbl's test suite (new to groff
1.23.0) is comprehensive in this respect, I can't be sure that a table
rendered in nroff mode will only ever overrun by 1 character cell. I
_think_ that's where we are with 1.23, but I'm not confident enough to
start telling people (or man(1) maintainers) to throw over the
informal-but-widely-respected 78n limit in favor of a 79n one (or, more
generally, the practice of telling troff to format for 2 fewer character
cells than the terminal's actual width). I'd rather fix the devilish
bug mentioned previously[1] and tell them they can use the full width.

There were several problems with GNU tbl in nroff mode which made
attempts to lay out any but the most featureless tables by counting
character cells unreliable and frustrating to the user. Except for El
Diablo, I believe I have fixed these in groff 1.23.

> So I felt justified in ignoring the warning. But, I didn't want to
> ignore all warnings since it is so easy to introduce syntax errors
> etc. Hence the fragile regexp.
> > I think that the best fix would be this:
>
> > I would like in the future to improve GNU tbl to the point where
> > tables can spread their wings to the full 80 column span of the
> > widely accepted minimum terminal width, but a deeply ingrained
> > feature of GNU tbl makes that tough.
> >
> > Maybe it will happen for groff 1.24.
>
> I quite believe you that it's difficult.

I have ideas. That I anticipate some synergy with modern xterm's
support for DEC VT340 ReGIS graphics holds promise (you're now the first
to hear of this wicked plan), but I aim to solve the problem for very
dumb character-cell terminals, too. At the cost of double-line/double-box
support in such terminals, which at present is implemented only poorly.

https://savannah.gnu.org/bugs/?43636
https://savannah.gnu.org/bugs/?43637

> > My recommendation is a simple tweak to the table format, stealing
> > one en of column separation to make the table fit.
>
> This sounds like a very reasonable workaround to me. I'll take a look
> at the generated output and (assuming I'm happy with it) I'll adopt
> your suggestion.
>
> > $ diff -u ./dgit.1.orig ./dgit.1
> ...
> > -lb l.
> > +lb2 l.

I forgot to illustrate how I validated my fix.

$ MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii ./dgit.1.orig 2>&1 >/dev/null | grep . || echo no diagnostics
<standard input>:46: warning: table wider than line length minus indentation
$ MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii ./dgit.1 2>&1 >/dev/null | grep . || echo no diagnostics
no diagnostics

> I expect to make an upload within a few days, maybe a week. I hope
> that's satisfactory.

Should be fine. That's a pretty small fraction of the interval between
the groff 1.22.4 and 1.23.0 releases. ;-)

Thanks!

Regards,
Branden

> If I emailed you from @fyvzl.net or @evade.org.uk, that is a private
> address which bypasses my fierce spamfilter.

_Now_ you tell me... :P

[1] https://savannah.gnu.org/bugs/?62471
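
P.S. The validation commands above could be wrapped in a small shell
helper for reuse. This is only a sketch: the function names
`report_diagnostics` and `check_man_page` are made up for illustration,
and it assumes a man(1) that accepts the same options and environment
variables used in the commands above.

```shell
# report_diagnostics: read formatter diagnostics on standard input and
# print the non-empty lines, or "no diagnostics" if there are none.
# (Hypothetical helper name.)
report_diagnostics() {
    grep . || echo "no diagnostics"
}

# check_man_page FILE: format FILE for an 80-column terminal, discard the
# rendered page, and report only what was written to standard error.
# (Hypothetical helper name; options are those used earlier in this mail.)
check_man_page() {
    # "2>&1 >/dev/null" sends stderr down the pipe and stdout to /dev/null.
    MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii "$1" 2>&1 >/dev/null |
        report_diagnostics
}
```

Running `check_man_page ./dgit.1.orig` and `check_man_page ./dgit.1` in
turn would then repeat the before-and-after comparison shown above.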