Hi Colin,

At 2025-03-26T12:46:18+0000, Colin Watson wrote:
> I'd welcome something more robust based on groff, as long as people
> remember to consider both sides of the problem (extraction of msgids,
> and reassembly of pages using msgstrs).

Yes, I expect that implementing member functions in "src/roff/troff/
node.cpp" to produce output upon running "groff -A pod" would be just
the first of two implementation phases. The second would be ensuring
that the output is arranged well for interpretation by po4a.
I'm attaching (knock wood) "groff -a" output of the ncurses beep(3) man
page (because it's short but has enough content to illustrate practical
properties of interest) and a hand-made mock-up of envisioned "groff -A
pod" output.
Notes on the mock-up:
1. I think we can know enough in "node.cpp" to break the output line at
sentence boundaries if the additional inter-sentence spacing amount
is not zero--and the default is not zero. (Evidence: see recent
demonstrations of the `pline` request, which illustrate that this
information is carried into the `word_space_size` node type.) No man
page I know of attempts to override that amount; the ability to do
so is a GNU troff extension.
2. I expect that explicit line breaks will be honored (reflected in the
output). I don't expect that to be a problem for msgid boundary
inference.
3. Representation of the page offset may be erratic and/or inaccurate.
I expect msgid extractors to discard leading and trailing whitespace
anyway.
4. I don't know how POD quotes/escapes < and > characters; I'll need to
learn.
5. At this point in the formatting process, the formatter's notion of a
font is an integer referring to a mounting position. We don't know
what the font "is". The current font is also a property of the
environment, not of nodes per se. But: (a) we know when the font
selection _changes_, and (b) for man page formatting I'll bet we can
assume that fonts are mounted in traditional order: 1, 2, 3, 4 -> R,
I, B, BI.[1]
6. Text in a man page that uses special characters (trout/grout: the
"C" command) probably doesn't need to be translated.
One exception: as usual we'd likely special-case what "groff -a"
renders as `<->` and `<hy>`, emitting good old `-` for each, and punt
(warn on and ignore) any other special character.
This approach would probably _not_ be satisfactory for a man page
whose "base" language was not English, but as far as I know, no
project both does that and supports gettextization of the page. But
Helge Kreutzmann would know better than I would.
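
To make notes 4 through 6 concrete, here is a minimal sketch in Python
(chosen only for brevity; the real work would live in C++ in
"node.cpp"). Every name in it is hypothetical. It assumes the R, I, B,
BI mounting order from note 5, assumes that special characters arrive
as the names "groff -a" shows between angle brackets, and escapes < and
> with POD's E<lt> and E<gt> codes, which is one conservative option
rather than a settled choice.

```python
import sys
from itertools import groupby

# Assumption (note 5): mounting positions 1, 2, 3, 4 -> R, I, B, BI.
# Each value lists the POD formatting codes to wrap around the text.
POD_CODES = {1: "", 2: "I", 3: "B", 4: "BI"}

# Assumption (note 6): only these special characters are kept, as "-".
HYPHEN_LIKE = {"-", "hy"}


def escape_pod(text):
    """Escape < and > in one pass so the E<...> codes are not re-escaped."""
    return text.translate({ord("<"): "E<lt>", ord(">"): "E<gt>"})


def special_char(name):
    """Map a special character name to output text; warn on and ignore the rest."""
    if name in HYPHEN_LIKE:
        return "-"
    print(f"warning: ignoring special character '{name}'", file=sys.stderr)
    return ""


def wrap(position, text):
    """Wrap escaped text in the POD codes for a font mounting position."""
    out = escape_pod(text)
    # Unknown positions fall back to unstyled text.
    for code in reversed(POD_CODES.get(position, "")):
        out = f"{code}<{out}>"
    return out


def runs_to_pod(runs):
    """Convert (position, text) runs to POD, merging runs until the font changes."""
    return "".join(
        wrap(position, "".join(text for _, text in group))
        for position, group in groupby(runs, key=lambda run: run[0])
    )
```

Under those assumptions, the runs (3, "beep"), (1, " and "), (3,
"flash") come out as "B<beep> and B<flash>", as in the mock-up.
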
Thoughts?
Regards,
Branden
[1] That isn't _quite_ traditional: "R, I, B, S" is, because in Ossanna
troff there was no such thing as a bold-italic typeface--at least
not that the formatter knew of as such. But I'm betting we can get
away with the slight modification, and bold italics are seldom used
in man pages anyway because there's no interface for selecting that
style in the macro package. The "BI" font remains a part of the
long tail we can capture with this approach nonetheless.
<beginning of page>
beep(3NCURSES) Library calls beep(3NCURSES)

NAME
beep, flash <-> ring the (visual) bell of the terminal with curses

SYNOPSIS
#include <ncursesw/curses.h>
int beep(void);
int flash(void);

DESCRIPTION
beep and flash alert the terminal user: the former by sounding the termi<hy>
nal's audible alarm, and the latter by visibly attracting attention. Com<hy>
monly, a terminal implements a visual bell by momentarily reversing the
character foreground and background colors on the entire display; even a
monochrome device can do this. These functions each attempt the other
alert type if the one requested is unavailable. If neither is available,
curses performs no action. Nearly all terminals have an audible alert
mechanism such as a bell or piezoelectric buzzer, but only some can flash
the screen.

RETURN VALUE
These functions return OK on success and ERR on failure. In ncurses,
beep and flash return OK if the terminal type supports the cor<hy>
responding capability: bell (bel) for beep and flash_screen (flash) for
flash. Otherwise they return ERR.

EXTENSIONS
In ncurses, these functions can return ERR.

PORTABILITY
X/Open Curses Issue 4 describes these functions. It specifies no error
conditions for them. On SVr4 curses, they always return OK, and X/Open
Curses specifies them as doing so.

HISTORY
SVr2 (1984) introduced beep and flash.

SEE ALSO
ncurses(3NCURSES), terminfo(5)

ncurses 6.5 2025-02-01 beep(3NCURSES)

I<beep>(3NCURSES) Library calls I<beep>(3NCURSES)

B<NAME>
B<beep>, B<flash> - ring the (visual) bell of the terminal with I<curses>

B<SYNOPSIS>
B<#include <ncursesw/curses.h>>
B<int beep(void);>
B<int flash(void);>

B<DESCRIPTION>
B<beep> and B<flash> alert the terminal user: the former by sounding the terminal's audible alarm, and the latter by visibly attracting attention.
Commonly, a terminal implements a visual bell by momentarily reversing the character foreground and background colors on the entire display; even a monochrome device can do this.
These functions each attempt the other alert type if the one requested is unavailable.
If neither is available, I<curses> performs no action.
Nearly all terminals have an audible alert mechanism such as a bell or piezoelectric buzzer, but only some can flash the screen.

RETURN VALUE
These functions return B<OK> on success and B<ERR> on failure.
In I<ncurses>, B<beep> and B<flash> return B<OK> if the terminal type supports the corresponding capability: B<bell> (B<bel>) for B<beep> and B<flash_screen> (B<flash>) for B<flash>.
Otherwise they return B<ERR>.

EXTENSIONS
In I<ncurses>, these functions can return B<ERR>.

PORTABILITY
X/Open Curses Issue 4 describes these functions.
It specifies no error conditions for them.
On SVr4 I<curses>, they always return I<OK>, and X/Open Curses specifies them as doing so.

HISTORY
SVr2 (1984) introduced B<beep> and B<flash>.

SEE ALSO
B<ncurses>(3NCURSES), B<terminfo>(5)

ncurses 6.5 2025-02-01 I<beep>(3NCURSES)
