Hi Ian,

At 2023-07-17T14:32:40+0100, Ian Jackson wrote:
> Hi.  Thanks for your comprehensive, detailed and helpful email.

You're most welcome.

> G. Branden Robinson writes ("Bug#1041317: dgit: table too wide in man
> page, trashes autopkgtests"):
> > I noticed that groff 1.23.0-2 (and -1 before it) will be blocked
> > from ever migrating to testing due to an autopkgtest failure on
> > EVERY architecture.
> 
> I'm sorry that this test is causing you an inconvenience.  Thanks for
> filing the bug.  I think the right severity for a bug which is
> preventing another package from migrating is serious, so I am raising
> the severity of your report.

Okay.

> > Here are the juicy bits.
> ...
> > 941s RE (?^:^(?:(?^:^(?=a)b))$)|(?^:^(?:ERROR.*)$)|(?^:^(?:.* # table wider than line width)$)
> ...
> > That regex is sufficiently complex that I can't tell if it's trying
> > to filter the diagnostic message "table wider than line width" or
> > not.  If it is, the fact that I have recast the language of the
> > diagnostic message in groff 1.23.0 has fooled it.
> 
> Yes, that is what it is doing.  Unfortunately as far as I'm aware,
> groff doesn't offer a formal warning suppression or classification
> mechanism.

It does--see troff(1), section "Warnings"--but not one that would help
us here.

You may know that tbl(1) is a troff(1) preprocessor; it reads a piece of
an input document describing a table and turns that into roff(7)
language input that lays out the table.

One of the things it also does is diagnose problems with tables, like
invalid characters in table format descriptions (dgit.1's "lb l." is one
such description).  Some table validity problems, however, are not known
at the time the preprocessor runs, but only at formatting time--that is,
when troff itself runs.  It is only then that the dimensions of the
output medium are known.  tbl therefore embeds roff control structures
(like the `if` request) and requests to write to the standard error
stream (like `tm`; mnemonic: "terminal message") when certain conditions
obtain.

There is no organizational scheme for such messages.  To impose one
would require changes to the roff language, or the creation of a
diagnostics-oriented macro package upon which tbl preprocessor output
would depend.  Even then, tbl can't know at the time it runs whether the
macro package will be loaded when the formatter runs.

I therefore don't have high hopes for resolving this.  It would be
better, probably, just to introduce opaque tokens (like "TBLW1" or
similar) to enable people to apply the regex-matching approach with
greater reliability.  GNU tbl has featured only four such messages in
its history, to the best of my knowledge.
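
A minimal sketch of the difference, assuming a token named "TBLW1" (which
is hypothetical; no groff release emits it) and using the two wordings of
the table-width message quoted in this thread.  The function names are
invented for illustration.

```shell
# Both functions read diagnostics on stdin and print only the lines that
# are NOT recognized as a table-width warning.

# Wording-based: has to list every phrasing the message has ever had.
filter_by_wording() {
    grep -v -E 'table wider than line (width|length minus indentation)$' || true
}

# Token-based: "TBLW1" is HYPOTHETICAL and would survive rewording.
filter_by_token() {
    grep -v -F 'TBLW1' || true
}
```

The `|| true` only keeps the function from failing when every line is
filtered out, since `grep -v` exits nonzero when it selects nothing.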

However, if I improve GNU tbl to the point that it doesn't ever throw
diagnostics that are false positives, people won't be as tempted to
scrape them away with regexes...

> So instead there is this regexp, which (as might be expected of such a
> thing) is fragile and keeps growing cases as messages change, becoming
> ever more baroque and inscrutable.

Yeah.  :(

> > So let's fix the problem in the dgit man page.
> > 
> > Man pages are formatted for a width of 78n if the terminal width is
> > 80 columns.  The origin of this practice is not well documented but
> > experience with groff upstream leads me to believe that it is a
> > workaround for bugs in GNU tbl(1).  (In AT&T Unix Version 7, they
> > were formatted for a line length of 65n, with a page offset of one
> > tenth of an inch.  On Western Electric Teletype Model 37 printing
> > terminals.)
> > 
> > How wide is dgit's man page?
> ...
> > 79
> ...
> > Hence the warning.
> > 
> > groff 1.22.4 didn't use to throw this warning in this circumstance.
> 
> I'm pretty sure at least one version of groff I encountered produced
> this warning,

That would not surprise me; while I have some command of GNU tbl's
history, I haven't mastered it.

> > The diagnostic is wholly legitimate.
> > 
> > Let's have a look:
> > 
> > $ MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii dgit \
> >     | sed -n '/DESCRIPTION/,/OPERATIONS/p'
> > <standard input>:46: warning: table wider than line length minus indentation
> > DESCRIPTION
> ...
> >        dgit-maint-gbp(7)         for maintainers already using git-buildpackage
> ...
> > Yup, if we look carefully, we can see that the word "git-buildpackage"
> > encroaches into the right margin.
> 
> This is true in a strictly technical sense: the generated formatting
> does violate groff's intended behaviour.  However, the output is,
> in fact, perfectly fine, when looked at from a user's point of view:
> the output line is 79 characters long, which as you note doesn't in
> fact currently cause trouble (other than this warning).

Yes, but until I'm convinced that GNU tbl's test suite (new to groff
1.23.0) is comprehensive in this respect, I can't be sure that a table
rendered in nroff mode will only ever overrun by 1 character cell.  I
_think_ that's where we are with 1.23, but I'm not confident enough to
start telling people (or man(1) maintainers) to throw over the
informal-but-widely-respected 78n limit in favor of a 79n one (or, more
generally, the practice of telling troff to format for 2 fewer character
cells than the terminal's actual width).  I'd rather fix the devilish
bug mentioned previously[1] and tell them they can use the full width.

There were several problems with GNU tbl in nroff mode which made
attempts to lay out any but the most featureless tables by counting
character cells unreliable and frustrating to the user.  Except for El
Diablo, I believe I have fixed these in groff 1.23.
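
An overrun of this kind can also be measured directly.  A minimal sketch,
assuming the formatted page arrives as plain text on standard input (any
overstriking or escape sequences already stripped, for example with
`col -b`); the function names are hypothetical.

```shell
# Print the length of the longest line read from stdin (0 if no input).
# Assumes plain text; overstrikes or escape sequences inflate the count.
longest_line() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# Succeed only if no input line is longer than the limit given as $1.
fits_within() {
    [ "$(longest_line)" -le "$1" ]
}
```

With an 80-column terminal and pages formatted for 78n, a page containing
a 79-character line would fail `fits_within 78`.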

> So I felt justified in ignoring the warning.  But, I didn't want to
> ignore all warnings since it is so easy to introduce syntax errors
> etc.  Hence the fragile regexp.
> 
> I think that the best fix would be this:
> 
> > I would like in the future to improve GNU tbl to the point where
> > tables can spread their wings to the full 80 column span of the
> > widely accepted minimum terminal width, but a deeply ingrained
> > feature of GNU tbl makes that tough.
> > 
> > Maybe it will happen for groff 1.24.
> 
> I quite believe you that it's difficult.

I have ideas.  I anticipate some synergy with modern xterm's support for
DEC VT340 ReGIS graphics, which holds promise (you're now the first to
hear of this wicked plan), but I aim to solve the problem for very dumb
character-cell terminals, too, at the cost of double-line/double-box
support in such terminals, which at present is implemented only poorly.

https://savannah.gnu.org/bugs/?43636
https://savannah.gnu.org/bugs/?43637

> > My recommendation is a simple tweak to the table format, stealing
> > one en of column separation to make the table fit.
> 
> This sounds like a very reasonable workaround to me.  I'll take a look
> at the generated output and (assuming I'm happy with it) I'll adopt
> your suggestion.
> 
> > $ diff -u ./dgit.1.orig ./dgit.1
> ...
> > -lb l.
> > +lb2 l.

I forgot to illustrate how I validated my fix.

$ MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii ./dgit.1.orig \
    2>&1 >/dev/null | grep . || echo no diagnostics
<standard input>:46: warning: table wider than line length minus indentation
$ MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii ./dgit.1 \
    2>&1 >/dev/null | grep . || echo no diagnostics
no diagnostics
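
The check relies on `grep .` succeeding only when at least one non-empty
diagnostic line arrives.  A minimal sketch that wraps the same idea in a
function whose exit status could gate a test; the name
report_diagnostics is hypothetical.

```shell
# Read captured diagnostics on stdin.  If there are any, print them and
# fail; otherwise print "no diagnostics" and succeed.
report_diagnostics() {
    if grep .; then
        return 1
    fi
    echo 'no diagnostics'
}

# Intended use, mirroring the commands above (not run here):
#   MANPAGER=cat MANWIDTH=80 command man --warnings -Tascii ./dgit.1 \
#       2>&1 >/dev/null | report_diagnostics
```

The redirection order matters: `2>&1 >/dev/null` sends the diagnostics
down the pipe and discards the formatted page.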

> I expect to make an upload within a few days, maybe a week.  I hope
> that's satisfactory.

Should be fine.  That's a pretty small fraction of the interval between
the groff 1.22.4 and 1.23.0 releases.  ;-)

Thanks!

Regards,
Branden

> If I emailed you from @fyvzl.net or @evade.org.uk, that is a private
> address which bypasses my fierce spamfilter.

_Now_ you tell me... :P

[1] https://savannah.gnu.org/bugs/?62471
