Hi onf,

At 2025-01-20T01:48:19+0100, onf wrote:
> Actually, BSD mandoc does implement this, it's just documented at
> a poorly visible place in the docs. BSD mandoc's man(1):
>   MANPAGER
>     Any non-empty value of the environment variable MANPAGER is
>     used instead of the standard pagination program, less(1). If
>     less(1) is used, the interactive :t command can be used to go
>     to the definitions of various terms, for example command line
>     options, command modifiers, internal commands, environment
>     variables, function names, preprocessor macros, errno(2)
>     values, and some other emphasized words. Some terms may have
>     defining text at more than one place. In that case, the
>     less(1) interactive commands t and T can be used to move to the
>     next and to the previous place providing information about the
>     term last searched for with :t. The -O tag[=term] option
>     documented in the mandoc(1) manual opens a manual page at the
>     definition of a specific term rather than at the beginning.
>
> And it works quite nicely, actually. The definitions are generated
> automatically, so all manpages written in mdoc benefit from it.
> I assume groff mdoc + man-db doesn't implement this?

I'm working on it.

[requoting]
> The definitions are generated automatically

That's the rub. We need a design for automatic construction of
tag/anchor names from the user-specified names of the items to be
tagged.

In man(7) documents, those taggable items are probably going to be:

1.  the identifier of the page itself, with "section" number;
2.  section heading text;
3.  subsection heading text; and
4.  the tag text of tagged paragraphs (`TP`).

Item #1 has already been done for several months and works fine; it can
be observed in any "groff-man-pages.pdf" document built from Git.
Cross-references between man(7) and mdoc(7) are supported.

There are a few remaining problems to be solved.

A.  Generation of _unique_ hyperlink tags from #2-#4 above. There will
    be collisions galore under item #2 when multiple man pages are
    rendered. A page can conceivably collide with itself with respect
    to items #3 and #4. So we probably want a hierarchical tag
    representation:

        page-name/section/subsection/tag-item

    where this structure is truncatable at any point after the first
    slash but is otherwise invariant.

B.  We need a predictable means of generating hyperlink tag identifiers
    that is also flexible enough to accommodate non-English languages
    and weird characters that people might populate their (sub)section
    titles or paragraph tags with.

    This requirement exacerbates a painful limitation in groff 1.23 and
    earlier. It just wasn't going to happen without a change to the
    GNU troff output language specification that permitted non-ASCII
    code points _in parameters to device extension commands_ to be
    expressed.

    The good news is, that's sorted out now, and comes with a "NEWS"
    item.[1] Deri was really helpful in sorting out the issues here.
    (As you're aware, there are knock-on issues not yet resolved to his
    satisfaction.[2])

    For those who feel their burn scars sizzling afresh, this is the
    root cause of the problem behind groff's most notorious
    diagnostics, because it applies just as much to
    output-format-specific document metadata.

        error("can't transparently output node at top level");

        error("can't translate %1 to special character '%2'"
              " in transparent throughput",
              input_char_description(cc),
              ci->nm.contents());

    The first happens when you get up to tomfoolery like this:

        .ds AUTHOR Frank \uand\d Estelle Costanza\"
        .pdfinfo /Author \*[AUTHOR]

    And the second when you commit the outrage of having a non-Basic
    Latin character in your name.

        .ds AUTHOR Luis Buñuel\"
        .pdfinfo /Author \*[AUTHOR]

    The exact same problems apply to document tags/anchors, and for
    exactly the same reason. We didn't have a specification for
    encoding such things in device extension commands, also known as
    "x X" in "grout". See groff_out(5).

    Okay, I have to go off on a rant here.[3]

C.  We then need a way to make references to these anchors/tags. For
    man(7) the `MR` macro new to groff 1.23 was an obvious site to add
    the appropriate machinery for document-level links. mdoc(7)'s `Xr`
    is closely analogous and has existed for many years. In the
    forthcoming groff 1.24 (and in Git right now), they automatically
    supply hyperlink information for output devices that support such.
    (Just PDF and terminals.)

    But there remain two gaps.

    i.  No way to hyperlink in a more fine-grained way, that is, to
        (sub)section headings or, conceivably, to paragraph tags. This
        is a tougher problem because if these are not unique within a
        page, the location making the link has to know about the
        structure of the document. Possibly, we'll just punt on the
        issue of "deep" cross-document links.
        mdoc(7) doesn't bother to support that; its `Sx` macro doesn't
        contemplate pointing into another document.[4] I notice that
        it, too, doesn't address the problem of duplicate heading names
        and therefore ambiguous references. Because mdoc(7) culture is
        rigidly prescriptive, its section headings are tightly
        controlled, and I expect that this problem only threatens when
        subsections are used (and referenced).

    ii. Hyperlinking macros need to be added to ms(7), me(7), and
        mm(7). Here, at least with mm, the problems of within-document
        linking may be solvable with less disruption (meaning: no new
        macros), because the package already supports an internal
        referencing system. Also, there is likely much less demand for
        deep links across documents using these packages.

If someone's wondering, I'm not a fan of groff_www(7) and don't
anticipate using it.

As I understand mandoc(1)'s less(1)-integrated tagging feature, none of
the problems above are mitigated by feeding the pager an auxiliary tags
file (less(1)'s `-T` option). They have to be solved regardless.

Steffen Nurpmeso has campaigned repeatedly for extension of OSC 8
hyperlink syntax (or maybe just its semantics) to support anchor
placement in addition to linking. I'm dubious of that suggestion.
OSC 8 wasn't developed with that in mind and had enough of a hill to
climb.

Let's see, is that everything? When I'm brain-dumping, sometimes it's
hard to tell whether I'm finished. An affliction of age, maybe...

Regards,
Branden

[1] NEWS:

    * GNU troff now performs some limited processing/transformation of
      the argument to the `\X` escape sequence and its counterpart
      `device` request, to address the requirement that some documents
      have to pass metadata that must encode non-ASCII characters in
      device extension commands. (For example, a document author may
      desire a document's section headings containing non-ASCII code
      points to appear correctly in PDF bookmarks.
      Further, GNU troff encodes its output page description language
      only in ASCII.) This change is expected to be of significance
      mainly to developers of output drivers for groff; groff_diff(7)
      describes the transformations. If you have been using `\X` or
      `.device` to pass ASCII data to the output driver as a device
      extension command and require that it remain precisely as-is, use
      the `\!` escape sequence or `output` request, and prefix your
      data with "x X ", the device-independent troff means of
      expressing a device extension command (see groff_out(5)).

[2] https://lists.gnu.org/archive/html/groff/2024-12/msg00168.html

[3] "Transparent" and "special" are my least favorite words in *roff
    nomenclature, and practically all of the blame can be laid at the
    Bell Labs CSRC in the 1970s. The Thompson-style naming convention
    of never using more than two letters to name anything except when
    compelled at gunpoint had the advantage that no one expected such
    identifiers to mean anything at all. It's Unix, man. You are not
    expected to understand it.

    (And when staring at the muzzle of a firearm, you can comply. Just
    add the four letters "flag". All done!)

[4] groff_mdoc(7):

   (Sub)section cross references
     Use the ‘.Sx’ macro to cite a (sub)section heading within the
     given document.

           Usage: .Sx ⟨section‐reference⟩ ...

              .Sx Files    →  “Files”

     The default width is 16n.
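
A postscript on problems A and B above, since a concrete shape may make
the hierarchy easier to argue about. This is only a rough sketch in
Python, not anything groff implements. Everything named here is an
assumption made for illustration: the functions `encode_component` and
`make_tag`, the `TagRegistry` class, page identifiers of the form
"groff.1", percent-encoding of UTF-8 as the ASCII-safe form, and the
"#N" suffix for repeated paths. The transformations GNU troff actually
applies are the ones described in groff_diff(7) and are not reproduced
here.

```python
from urllib.parse import quote


def encode_component(text):
    """Percent-encode one level of a tag so the result is plain ASCII.

    safe="" means a "/" inside a heading is encoded as well, which keeps
    a literal "/" reserved as the separator between levels.
    """
    return quote(text.strip(), safe="")


def make_tag(page_id, *levels):
    """Join a page identifier and zero or more deeper levels with "/".

    The levels are, in order: section heading, subsection heading, and
    paragraph tag. Passing fewer of them truncates the tag.
    """
    return "/".join(encode_component(part) for part in (page_id, *levels))


class TagRegistry:
    """Hand out unique tags, numbering repeats of an identical path."""

    def __init__(self):
        self._seen = {}

    def register(self, page_id, *levels):
        tag = make_tag(page_id, *levels)
        count = self._seen.get(tag, 0) + 1
        self._seen[tag] = count
        # "#" is always percent-encoded inside a component, so a raw "#"
        # suffix cannot be confused with heading text.
        return tag if count == 1 else f"{tag}#{count}"


if __name__ == "__main__":
    registry = TagRegistry()
    # "cine.7" is a made-up page used only for this example.
    print(registry.register("groff.1", "Options"))
    print(registry.register("groff.1", "Options", "Output", "-T dev"))
    print(registry.register("cine.7", "Autores", "Luis Buñuel"))
    print(registry.register("cine.7", "Autores", "Luis Buñuel"))
```

Truncating the argument list yields the shorter tags, so a link to a
whole page, to a section, or to a single tagged paragraph all share one
prefix. The numbered suffix only papers over exact duplicates within a
page; it does nothing for the harder question under C.i of how a
referring document learns the target's structure.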