Re: [Groff] Nesting font macros in man pages

Ingo Schwarze Sat, 29 Apr 2017 09:44:35 -0700

Hi Ralph,

Ralph Corderoy wrote on Sat, Apr 29, 2017 at 03:47:53PM +0100:
> Ingo Schwarze wrote:


>> .Fl S Ar var Ns Op Pf = Ar value

> This has reminded me of one reason I didn't get on with mdoc.
> Only the `Fl' is obviously mdoc's, due to the `.' invocation.
> The rest are a mix of command and data, but without any sigil
> one's left guessing unless the command set has been memoised.
> It's not helped by the data typically being small abbreviations
> for variables and sample parameters rather than the prose words
> you might find for running English text, thus they look like
> commands.

As a matter of fact, it *is* helped by *exactly* that.

For novice manual page authors, one of the simple rules i routinely
recommend is this:  "On a macro line, if a word consists of two
letters, the first one capital and the second one not, assume it's
a macro.  If you don't intend it as a macro, escape it with \&."

It turns out almost nothing gets uselessly escaped when following
that rule because capitalized two-letter words are quite rare on
macro lines in practice.

Even in plain English text, capitalized two-letter words are somewhat
rare.  The most common case is short words like "No" and "It", but
only when they occur at the beginning of a sentence.  But plain
English text is rarely seen on macro lines, and in particular the
beginning of a plain English sentence almost never is.
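
As a rough illustration of that rule, here is a minimal Python sketch
that lists the words on a macro line which the rule would treat as
macros.  The function name macro_like_words is made up for this
example, and the check is purely the "capital letter, then lowercase
letter" heuristic; it does not consult the real mdoc(7) macro table.

```python
import re

# Heuristic from the rule above: exactly one uppercase letter followed
# by exactly one lowercase letter looks like an mdoc macro name.
MACRO_LIKE = re.compile(r"[A-Z][a-z]")

def macro_like_words(line):
    """Return the words on a macro line that the rule would treat as macros.

    Hypothetical helper for illustration only; it does not know which
    two-letter names are actually defined by mdoc(7).
    """
    if not line.startswith("."):
        return []  # text line: the rule applies to macro lines only
    words = line[1:].split()
    return [w for w in words if MACRO_LIKE.fullmatch(w)]

print(macro_like_words(".Fl S Ar var Ns Op Pf = Ar value"))
print(macro_like_words(".Op Fl v No"))      # "No" is assumed to be a macro
print(macro_like_words(".Op Fl v \\&No"))   # escaped with \&, so it is not
```

Under this heuristic, "S", "var", "=" and "value" are left alone, and
a word escaped with \& no longer matches because it is no longer two
characters long.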

> If, in some made up syntax, it was
> 
>     .Fl S .Ar var .Ns .Op .Pf = .Ar value
> 
> that might be more grokable,

Remembering the above very simple rule about Xx-style words, it is
about the same; maybe the shorter version is even easier to read.

> though still long-winded for `-S var[=value]'.

It is not long-winded at all.  Even in man(7), which has no semantic
annotation, it is

.B -S
.IR var [= value ]

That's 33 bytes (long-winded!?) in mdoc(7) vs. 25 bytes in man(7),
so it's 8 (eight) additional bytes for expressing four semantic
annotations of three different kinds: that "-S" is a flag, that
"[]" means "optional", and that "var" and "value" are placeholders.
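
The byte counts can be reproduced with a few lines of Python.  This
sketch assumes that the figures include the newline ending each input
line, which is the reading under which they come out as quoted.

```python
# The two sources quoted above.  Assumption: each input line is counted
# together with its terminating newline.
mdoc_src = ".Fl S Ar var Ns Op Pf = Ar value\n"
man_src = ".B -S\n.IR var [= value ]\n"

mdoc_bytes = len(mdoc_src.encode("ascii"))
man_bytes = len(man_src.encode("ascii"))

# Prints the mdoc(7) size, the man(7) size, and the difference.
print(mdoc_bytes, man_bytes, mdoc_bytes - man_bytes)  # 33 25 8
```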

Given that in a different program, "var[= value]" or "var[ =value]"
or "var[ = value]" might be the required syntax, and given that
in most syntax, separating tokens by whitespace is the usual case
and joining tokens without whitespace in between is more unusual,
even the design of the no-whitespace macros .Ns and .Pf is fully
optimized for conciseness and simplicity.

This example provided by Steffen in mockery is actually an
excellent example to demonstrate that mdoc(7) syntax meets the
theoretical maximum of conciseness and clarity.  You want to specify:

 * 1+3+5 = 9 letters (S, var, value)    -> min. 9 bytes, 3 tokens
 * 1 punctuation mark (=)               -> min. 1 byte, 1 token
 * 4 semantic annotations (Fl, 2Ar, Op) -> min. 8 bytes, 4 tokens
 * 2 spacing exceptions                 -> min. 4 bytes, 2 tokens
 * token separators for 10 tokens       -> min. 10 bytes of whitespace

That assumes nothing except that expressing a semantic annotation
requires more than one byte (which anybody knowing XML is likely
to readily admit) and that we want to separate tokens with whitespace
(which both human readers and parser programs will probably like).
Note that i did not even count those output characters against the
theoretical minimum that are implicit in the semantic annotations
(the "-" in the .Fl and the "[]" in the .Op).

So the theoretical optimum is 32 bytes.  The mdoc(7) code is 33
bytes.  You call that "long-winded"?  Oh, that additional byte is
the leading roff(7) macro dot, which is actually yet another semantic
annotation ("not plain text") this time really expressed in a single
byte...
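
The tally above can be checked mechanically.  The following Python
sketch copies the per-category figures from the list and adds them up;
the dictionary keys are only labels, naming Ns and Pf as the two
spacing exceptions is inferred from the surrounding text, and counting
the final newline as the tenth separator is an assumption.

```python
# (bytes, tokens) per category, copied from the list above.
content = {
    "letters (S, var, value)": (9, 3),
    "punctuation (=)": (1, 1),
    "semantic annotations (Fl, Ar, Ar, Op)": (8, 4),
    "spacing exceptions (Ns, Pf)": (4, 2),
}

tokens = sum(t for _, t in content.values())
payload = sum(b for b, _ in content.values())

# One whitespace byte per token; the last one is assumed to be the
# newline that ends the input line.
separators = tokens
lower_bound = payload + separators

actual = len(".Fl S Ar var Ns Op Pf = Ar value\n")

# Prints token count, lower bound, actual size, and the difference,
# which is the single byte spent on the leading macro dot.
print(tokens, lower_bound, actual, actual - lower_bound)  # 10 32 33 1
```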

The opposite criticism to "long-winded" was "cryptic,
we should use .als".  I don't buy that either.  Here is the
complete list of macros recommended for use in modern mdoc(7):

  .Dd     Document Date   \" preamble macros
  .Dt     Document Title
  .Os     Operating System
  .Nm     NaMe
  .Nd     Name section Description

  .Sh     Section Header   \" structural macros
  .Ss     SubSection header
  .Sx     Section Xref
  .Xr     XRef
  .Pp     Paragraph
  .Bl/El  Block/End List
  .Bd/Ed  Block/End Display
  .D1     Display 1-line
  .Dl     Display Literal
  .Ql     Quote Literal
  .It     ITem in list
  .Ta     TAble cell separator
  .Rs/Re  Reference Start/End

  .Pf     PreFix   \" Spacing control macros
  .Ns     NoSpace

  .Fl     FLag   \" Command-line semantic macros
  .Cm     Command modifier
  .Ar     ARgument
  .Op     OPtional (to end of input line)
  .Oo/Oc  OPtional Open/Close
  .Ic     Internal/Interactive Command
  .Ev     Environment Variable
  .Pa     PAth

  .In     INclude file   \" Function library semantic macros
  .Fd     Function #Define
  .Ft     Function return Type
  .Fn     Function Name
  .Fo/Fc  Function Open/Close
  .Fa     Function Argument
  .Vt     Variable Type
  .Va     VAriable name
  .Dv     #Defined Variable
  .Er     ERrno constant

  .An     Author Name   \" misc semantic macros
  .Lk     hyperLinK
  .Mt     MailTo
  .Cd     kernel Configuration Declaration
  .Em     stress EMphasis
  .Sy     SYmbolic (misnomer, what is ~ HTML5 <strong>)
  .Li     LIteral

  .Dq/Do/Dc  Double Quote/Open/Close    \" Physical enclosures
  .Qq/Qo/Qc  ASCII-" Quote/Open/Close
  .Sq/So/Sc  Single Quote/Open/Close
  .Pq/Po/Pc  Parenthesis Quote/Open/Close
  .Bq/Bo/Bc  square Bracket Quote/Open/Close
  .Eo/Ec     generic Enclosure Open/Close

  .Ex -std   standard command EXit values
  .Rv -std   standard function Return Values
  .St        STandard reference

That's just one page of cheat sheet, available in almost exactly
this form in the mdoc(7) manual, and *all* macro names are strongly
mnemonic, with many additional regularities like the Xq/Xo/Xc
pattern.  Even for an mdoc(7) beginner, it is not really difficult
to pick good macros looking at such a list, at least for people who
have ever coded in any programming or markup language, for example
in C or HTML.  Besides, most beginners start by copying and editing
existing manuals, which makes getting up to speed even easier.

Note that longer macro names would make matters much worse, not only
for writing, but also for reading:

  .Fl S Ar var Ns Op Pf = Ar value

  .CommandOption S .CommandArgument var .NoSpace\
    .OptionalToEol = .NoSpace .CommandArgument value

Which of these is more readable?  Also compare to the output
of mandoc -Thtml (title attribute deleted for brevity):

  <b class="Fl">-S</b> <var class="Ar">var</var>[<span class="Op">=<var
  class="Ar">value</var></span>]

Or even:

  <b class="CommandOption">-S</b>
  <var class="CommandArgument">var</var>[<span class="Optional">=<var
  class="CommandArgument">value</var></span>]

The basic concept of mdoc(7) is extremely close to the theoretical
optimum.  I don't deny there are a few minor quirks in the design
and naming of a few of the individual macros that could have been
designed better, but regarding all the common criticism, if you
actually try to design improvements based on it, you find that it
would only make matters worse.

Yours,
  Ingo
