Hi Ingo,

At 2025-10-29T18:28:48+0100, Ingo Schwarze wrote:
> G. Branden Robinson wrote on Wed, Oct 29, 2025 at 06:16:04AM -0500:
>
> > I've root-caused the problem; expect a fix in my next push.
>
> Thanks for investigating and fixing.
>
> [...]
> > The problem is not that a `.T` string
> > exists, but that a `.T` _register_ _also_ exists.
> >
> > commit 897acc5fcd06ee91127185619d33bd90a22dee6e
> [,,,]
> >     [mdoc]: Fix Savannah #67646 (`.T` as macro arg).
> [...]
> >     mandoc(1) does not exhibit this problem, likely because it
> >     lacks a *roff formatter that defines both a `.T` register and
> >     a `.T` string; rather it simulates both of these.
>
> This last sentence is misleading in two very minor ways:
>
>  1. In mandoc, the roff *formatters* (roff_term.c, roff_html.c)
>     do not "simulate" registers or strings in any way; instead,
>     the roff formatters are (by design) completely unrelated to
>     both registers and strings, completely unaware that such
>     concepts even exist.

That's what I meant by "simulates".

>  2. In mandoc, the roff *parsers* (roff.c, roff_escape.c)
>     predefine both the .T string and the .T register (rather than
>     somehow "simulating" them) because both are user-visible features
>     of the roff(7) language.

Fair. I still don't think of mandoc as implementing a *roff "formatter"
because so many features of a CSTR #54 *roff are missing. And when I
say "formatter", I refer to the *roff _program_ being used to translate
the *roff language into output, so I include lexical analysis, parsing
into a tree representation (or in *roff's case, a forest of trees), and
"code generation": trout/grout generation or, historically, terminal
output "directly".

> Both are even documented in the
> roff(7) manual distributed with mandoc:
>
>   ESCAPE SEQUENCE REFERENCE
>   [...]
>   \*[name]
>      Interpolate the string with the name. For short names,
>      there are variants \*c and \*(cc.
>      One string is predefined on the roff language level: \*(.T
>      expands to the name of the output device, for example ascii,
>      utf8, ps, pdf, html, or markdown.
>   [...]
>   NUMBER REGISTER REFERENCE
>   [...]
>   .T   Whether an output device has been selected; mandoc(1) always
>        returns 1, meaning yes.

Yes. In mdoc the macro package's bespoke type system, '1' means
"callable macro", which is why we saw the behavior of the `.T` string
that we did in groff and 4.4BSD-Lite2 mdocs.

> The real point why mandoc does not suffer from this former
> mdoc-macroset bug is that to distinguish callable macro names
> from plain-text arguments, it does not use roff(7) registers (such
> use being an internal implementation detail of mdoc macro sets
> that is apparently prone to clashes),

Yup.

> but instead uses a separate hash table of macro names that cannot
> clash with user-visible roff(7) language features.

This is possible but harder to do in the *roff language itself (related:
the performance misadventure of Savannah #67602).
You have C at your disposal so the problem is more easily solved.

> Not a big deal because this is only a slightly misleading commit
> message, which by definition doesn't hurt users.

Please suggest a recast and I can update the "ChangeLog" file entry.

> I'm not testing your fix right now, but i'm doing some code review -
> not very deep, just mentioning what *immediately* springs to the eye:
>
> > diff --git a/tmac/doc.tmac b/tmac/doc.tmac
> > index 47064f9c6..6f61f45d8 100644
> > --- a/tmac/doc.tmac
> > +++ b/tmac/doc.tmac
> > @@ -2644,10 +2644,17 @@ .de doc-get-arg-type*
>
> Does the .doc-get-arg-type internal macro right above suffer from a
> similar issue,

I too noticed that it was similar. doc-get-arg-type (without the
asterisk) seems to be called only by the internals of one other macro,
`doc-do-Bl-args`. I didn't check whether one can construct a
`.Bl -whatever .T -foo` call that misbehaves.

And I don't know why two different macros are needed here. It seems
tempting to unify the logic here but that means going down the mdoc
rabbit hole and writing more unit tests, both of which will distract me
from other work I need to get done to push groff 1.24.0.rc1.

https://savannah.gnu.org/bugs/?65099 (open, non-Documentation items)

> or is that macro only used in contexts where treating
> the string ".T" as type 1 causes no harm? On first sight, i'm
> not sure what is going on, but the code of both macros looks
> suspiciously similar.

It does, yes.

> >  .      if r doc-punct\*[doc-arg\$1] \
> >  .        nr doc-arg-type \n[doc-punct\*[doc-arg\$1]]
> >  .    \}
> > -.    el \
> > -.      if r \*[doc-arg\$1] \
> > -.        if d \*[doc-arg\$1] \
> > -.          nr doc-arg-type 1
> > +.    el \{\
> > +.      \" Handle the *roff internal string '.T' specially; it is defined
> > +.      \" in the formatter, which _also_ defines a '.T' _register_,
> > +.      \" colliding perfectly with mdoc's argument type system.
> > +.      ie !'\?\*[doc-arg\$1]\?'\?.T\?' \
> > +.        if r \*[doc-arg\$1] \
> > +.          if d \*[doc-arg\$1] \
> > +.            nr doc-arg-type 1
> > +.      el \
> > +.        nr doc-arg-type 2
>
> This else clause looks pointless to me.
> The same .nr request is also present as the default,
> right above just after the introductory .de request.

Ah, you're right. I can optimize the final `el`se out (and change its
`ie` counterpart to `if`). Thanks!

Regards,
Branden
