Hi Larry,

At 2024-02-22T21:37:12-0500, Larry Kollar wrote:
> I’m a little late to the party, but I’ve read Alex’s original post
> over several times, and I have to wonder if everyone is over-thinking
> this.
Yes and no.

> On Feb 16, 2024, at 10:21 AM, Alejandro Colomar <a...@kernel.org> wrote:
> > I've been thinking about a suggestion I've done in the past. I
> > wanted a program that reads man(7) source and produces roff(7)
> > source, so that it can later be passed to troff(1), thus splitting
> > the groff(1) pipeline a bit more. The idea is similar to how eqn(1)
> > and other pre-troff filters do their job.
>
> There has to be a phase during which (g)troff interprets the macros
> and produces roff(7) to feed to the main processor.

Not really. *roff macro interpolation is not like running the C
preprocessor; if it were, what you suggest would work fine. But the C
preprocessor's macro language is a pretty feeble demonstration of a
macro language.

For one thing, there are 3 sorts of things that undergo interpolation:
macros, registers, and strings. Furthermore the identifiers of these
can be constructed using interpolations of others--and this is actually
done in groff macro packages.

Beyond that, macros can define other macros (groff ms(7) does this, for
instance). Here's an example from s.tmac.

.\" par*define-font-macro macro font apply-italic-corrections
.de par*define-font-macro
.de \\$1
.ds par*lic \" empty
.ds par*ic \" empty
.if \\n[.$]>2 \{\
. as par*lic \,\"
. as par*ic \/\"
.\}
.if \En[.$]>3 .@warning excess arguments to .\\$1 ignored
.ie \En[.$] \{\
. nr par*prev-font \En[.f]
\&\E$3\E*[par*lic]\f[\\$2]\E$1\f[\En[par*prev-font]]\E*[par*ic]\E$2
.\}
.el .ft \\$2
\\..
..
.par*define-font-macro R R
.par*define-font-macro B B
.par*define-font-macro I I yes
.par*define-font-macro BI BI yes
.ie n .par*define-font-macro CW R
.el .par*define-font-macro CW CR

Also, macros are allowed to call themselves recursively and in fact this
is the traditional means of implementing a loop in AT&T troff. (Do many
man pages need loops? No. But see below.)
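
To make the recursion point concrete, here is a minimal sketch of a
loop written as a self-calling macro. It is not taken from any real
macro package; the macro name "countdown" and the register "i" are
made up for illustration.

```roff
.\" Hypothetical example: "countdown" and register "i" are invented.
.de countdown
.\" \\ni becomes \ni when the definition is read, so the register is
.\" interpolated each time the macro runs, not when it is defined.
\\ni
.\" A signed argument to .nr adjusts the register; this subtracts 1.
.nr i -1
.\" A numeric condition is true when positive, so recurse until i is 0.
.if \\ni .countdown
..
.nr i 3
.countdown
```

Fed to nroff or groff, this should set the text "3 2 1". How many
times the macro calls itself depends on the value of the register at
formatting time, so expanding countdown "one level" and stopping
leaves you with a body that still contains a call to countdown.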
> Would it be possible to add a new command line option (like --roff)
> that simply dumps the input with macros applied, then stops?

No. Not "simply".

It is true that man(7) documents tend to use *roff only in a pretty
basic way. This is due to a combination of factors.

1. Unfamiliarity with the formatter on the part of man page authors
   (the sorts of people who hate writing documentation _really_ hate
   _reading_ it);
2. Authors of non-roff man page interpreters failing to support all
   *roff features a page might use (one can hardly blame them);
3. groff's own documentation recommending only a limited, portable
   subset of man(7) and formatter features to avoid frustration on the
   part of page authors and readers.

The foregoing is something of a self-reinforcing cycle; the smaller the
language we prescribe as suitable for man(7) composition, the easier it
seems to be to do what you and Alex are asking for.

This is the road tools like doclifter(1) and mandoc(1) started down
years ago.

The problem, as ever, is an 80/20 rule, or 90/10, 95/5 one. With only a
little bit of effort you can knock together something that seems
startlingly capable. With enough effort, you can handle a large
majority of a given corpus...but the remaining outliers prove more and
more difficult and demanding.[1] You won't get to 100% without
implementing a fully armed and operational *roff, and a GNU
troff-compatible one at that.

GNU troff itself just isn't written this way, to do only one "level" of
interpolation and stop. I'm reluctant to even look into seeing if it
can be stuck in, because my instinct is that too much would break.

Even in our man(7) package, we have macros calling other macros.
There's no flag bit on any of them indicating that they're "top-level"
macros.

Finally, unless your not-roff interpreter simulates the vertical drawing
position and updates it--which it would probably have to do via
something pretty close to operating the way a formatter does, with
simulated line-filling and breaking--you won't be able to reliably
troubleshoot page headers and footers with this technique.

Regards,
Branden

[1] A notable example is going to be any page that uses tbl(1). If
    you've ever looked at what the tbl preprocessor emits, you know that
    you're going to need a lot of machinery to handle it. This might
    even be the first hurdle at which initially promising ad hoc man(7)
    interpreters fall. You could of course interpret tbl(1) input for
    yourself...again, you will find yourself measuring text and
    formatting it. Or you could just skip anything that's in a table,
    which is fine until that's the part of the page that needs
    illumination...