"G. Branden Robinson" <g.branden.robin...@gmail.com> writes: > At 2022-12-23T12:49:15-0800, Russ Allbery wrote:
>> I've been curious: how much use do you see of groff outside of man
>> pages?

> Others have answered this but I would also point you to Ralph Corderoy's
> page on the subject.

> https://www.troff.org/pubs.html

> It hasn't been updated since about 2006, I think, which means it has
> missed a few publications since then, like _The Go Programming Language_
> and Kernighan's _UNIX: A History and Memoir_.

Thanks! Happy to see the continuing usage! I probably should have
assumed. One of the things that I've noticed over and over about free
software is that nothing new ever truly replaces something old in a
comprehensive sense. I can think of very few programs that truly no one
is using any more, because once the source code is available to keep them
alive, someone will keep them alive. It makes for a rather interesting
diversity of software (and other things; for instance, I still use
Usenet).

> The groff_man(7) page has long attempted to prescribe a reasonably
> portable, reduced subset of the roff language for use in man pages.
> mandoc maintainer Ingo Schwarze and I spent some time prior to groff
> 1.22.4's release hammering that out in further detail.

Oh, so I was going to mention: currently, Pod::Man rolls its own macros
for verbatim text:

    .de Vb \" Begin verbatim text
    .ft CW
    .nf
    .ne \\$1
    ..
    .de Ve \" End verbatim text
    .ft R
    .fi
    ..

This looks basically equivalent to .EX/.EE, so I thought about using
those macros (and defining my own if they're not available, at least
until no one is using older implementations that don't have them). But
the main thing that .EX doesn't support that the long-standing Pod::Man
behavior does is the .ne invocation, which is used like this:

    # Get a count of the number of lines before the first blank line, which
    # we'll pass to .Vb as its parameter. This tells *roff to keep that many
    # lines together. We don't want to tell *roff to keep huge blocks
    # together.
    my @lines = split (m{ \n }xms, $text);
    my $unbroken = 0;
    for my $line (@lines) {
        last if $line =~ m{ \A \s* \z }xms;
        $unbroken++;
    }
    if ($unbroken > 12) {
        $unbroken = 10;
    }

This logic is very long-standing and was designed for troff printing of a
manual page (and older nroff setups that still did pagination) to avoid
unnecessary page breaks in the middle of a verbatim block. I'm not sure
how much this matters given how people use man pages these days, but I
hate to break it for no reason. So I think I'd need to add an .ne line
after (before?) the .EE macro if I switched to it?

> It's called Pod::_Man_: why would people use it for anything that isn't
> a manual page?

Okay, fair. :) Although historically people sometimes did, and of course
once upon a time people would sometimes typeset the full manual for
something with troff. That output probably isn't as nice as it used to
be, since I have subsequently dropped a lot of the attempted magic that
only applied to troff output (replacing paired " quotes with `` '',
adding small caps to long strings of all capital letters, and things like
that) because they were all using scary regexes and occasionally broke
things and mangled things in weird ways, causing lots of maintenance
issues.

> Yes. But there are two problems to solve: (1) acceptance of Unicode
> (probably just UTF-8) input

I was pleasantly surprised at how well this just worked with the man-db
setup on a Debian system, although I think that may involve a fair amount
of preprocessing.

> It has been possible for many years (since well before groff 1.22.3) to
> specify any Unicode code point for output.

Just to provide additional detail for the record (and this is almost
certainly the sort of thing you mean by "acceptance of Unicode input")
here's the simple document I was using for some testing.
https://raw.githubusercontent.com/rra/podlators/main/t/data/man/encoding.utf8

    % groff -man -Tpdf -k encoding.utf8 > encoding.pdf
    troff: encoding.utf8:72: warning: can't find special character 'u0308'
    troff: encoding.utf8:74: warning: can't find special character 'u1F600'

u1F600 is presumably a problem with the output font, but u0308 is a
combining accent mark that groff does definitely support, just not as a
separate character. (Without preconv, one instead gets mojibake, as I
expected.)

My theory was that combining accent marks pose a bit of an interesting
issue for groff because groff probably shouldn't think of them as a
separate output character that can be mapped in an output font, but
instead needs to essentially transform them into something like
\[u0069_0308] during the input processing. (This may therefore
essentially be a preconv bug as opposed to a troff bug, and maybe nroff
gets away with it because it can just copy combining accent marks to the
output device and let xterm take care of rendering.) It all makes sense
when viewed through the lens of the *roff language, but of course in the
Unicode world one expects to be able to just produce a stream of code
points and have everything cope.

> Heirloom Doctools is a descendant of AT&T troff; among other things, it
> provides its own man(7) implementation, a lineal descendant of Doug
> McIlroy's 1979 original. It _can_ and _does_ render man pages. Whether
> any *nix distribution ("platform"?) ships Heirloom as its sole or
> preferred *roff, I don't know. I wouldn't be surprised if at least one
> BSD does, for the usual reasons of GPL antipathy[2]. About 15 years ago
> it undertook a major effort to clone groff features, and it is
> reasonably groff compatible when configured to be (`-mg` flag, `xflag`
> request, and whatnot).

Thanks for the background!

> [1] https://www.gnu.org/software/groff/groff-mission-statement.html

This is great.
I am sad that currently Pod::Man is one of the impediments to good
rendering of manual pages in other formats, since I make use of more of
the *roff language (mostly to work around bugs) than those tools often
understand. So I have an incentive to want to simplify the output as
much as I can, consistent with remaining portable.

> [2] The CDDL is way _more_ free than the GNU GPL, you see, because it is
> a copyleft _and_ has a choice-of-law clause, and someday the BSDs
> will have an island microstate nullifying all copyleft licenses.

Don't look at me, I release everything under an MIT license. :)

-- 
Russ Allbery (ea...@eyrie.org) <https://www.eyrie.org/~eagle/>
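
To make the combining-accent theory from earlier in this message more
concrete, here is a rough sketch of the kind of input transformation it
describes: decompose the text (NFD), then fold each base character and
the combining marks that follow it into a single \[uXXXX_YYYY] escape.
It is written in Python purely for illustration. The function name
groff_escape is hypothetical, this is not how preconv or Pod::Man is
implemented, and it makes no attempt to escape roff-special ASCII such
as backslashes.

```python
import unicodedata


def groff_escape(text: str) -> str:
    """Sketch only: turn non-ASCII text into groff-style u-escapes,
    attaching combining marks to the base character before them."""
    out = []
    cluster = []  # code points of the current base char + its marks

    def flush():
        if not cluster:
            return
        if len(cluster) == 1 and cluster[0] < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(chr(cluster[0]))
        else:
            # One escape per cluster, components joined by underscores.
            parts = "_".join(f"{cp:04X}" for cp in cluster)
            out.append("\\[u" + parts + "]")
        cluster.clear()

    # NFD splits precomposed characters into base + combining marks.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and cluster:
            cluster.append(ord(ch))  # mark joins the current base
        else:
            flush()                  # a new base starts a new cluster
            cluster.append(ord(ch))
    flush()
    return "".join(out)
```

With this sketch, both "i" followed by U+0308 and the precomposed U+00EF
come out as \[u0069_0308], so the accent never reaches the formatter as
a separate character. A code point such as U+1F600 still becomes
\[u1F600], so the missing-glyph warning for the output font would remain.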