[Groff] Applications of \c in man pages in the wild [LONG] (was: Nesting font macros in man pages)

G. Branden Robinson Wed, 26 Apr 2017 05:44:11 -0700

At 2017-04-25T19:00:26-0400, Doug McIlroy wrote:
> > .TP
> > .B \-scale \c
> > .IR xfac [, yfac ]
> 
> Very clever. I wish I'd thought of it when I was editing
> the v7 manual. Then it would have become a standard idiom.


Thanks, Doug!

> Has nobody tried this during the nearly 40 years since?

Well, that exact example doesn't work with the stock GNU troff man
macros, because as noted earlier, TP sets an input trap for the next
input line (.it 1), and a line interrupted with \c still counts toward
that trap.

However, the trick of using \c to "get out of" a man font macro without
introducing unwanted whitespace has some precedent (see below).  A few
souls appear to have figured it out before.
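
As a rough illustration of the effect, here is a toy model in Python.
It is only a sketch and not how troff is implemented: it assumes the
macro calls have already been stripped, leaving text fragments, and the
function name fill is made up for this example.  A fragment ending in \c
is joined to the next one with no space; any other fragment is followed
by a single space.

```python
def fill(fragments):
    """Join text fragments roughly the way filled output joins input lines.

    A fragment ending in the two characters backslash-c is joined to the
    next fragment with nothing in between; otherwise one space is used.
    This is a toy model of the visible result only.
    """
    out = []
    glue = ""  # nothing precedes the first fragment
    for fragment in fragments:
        interrupted = fragment.endswith("\\c")
        text = fragment[:-2] if interrupted else fragment
        out.append(glue + text)
        # An interrupted fragment suppresses the space before the next one.
        glue = "" if interrupted else " "
    return "".join(out)


# Text of the extractbb(1) fragment quoted later in this message,
# with the .SM macro names removed.
print(fill(["For each", "JPEG\\c", ",", "PNG\\c", ", or", "PDF",
            "file given on the command line,"]))
# prints: For each JPEG, PNG, or PDF file given on the command line,
```

Without the two \c escapes the same input would come out as
"JPEG , PNG , or PDF", which is the unwanted whitespace in question.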

So in hopes of settling the issue of the meaning of \c and its usage in
man pages, I undertook a survey of several thousand man pages on my
Debian stretch system.  I documented my method below[1].

I found uses of \c in 210 out of 6,939 non-symlink man pages.

In the following survey I'll skip over uses of \c inside macro
definitions, as that sort of use is what Ingo and I seem to agree is
"expert mode".

\~\c at the ends of lines seems to be a reasonably common technique for
preventing a line break when using font-changing macros in phrases like

    For each \~\c
    .I x
    such that
    .I x\~\c
    <\~0,

Examples of this usage can be found in chem(1), eqn(1), gropdf(1),
and grops(1).

Man pages apparently generated by docbook-to-man, like those in git and
patchutils, use \c inside expressions like this:

    .sp
    .RS 4
    .ie n \{\
    \h'-04' 1.\h'+01'\c
    .\}
    .el \{\
    .sp -1

tracker-extract(1) follows the above pattern too (without the
conditional on nroff), but the man page has some basic style problems,
looks hand-written, and I think the usage was cargo-culted from
elsewhere.

chmod(1) uses \c simply to break a really long line.  I think they meant
"\<newline>" instead.  Notably, the page was generated by help2man.

dvipdfmx(1) seems to have a legitimate use case:

    .SH SYNOPSIS
    .B dvipdfmx
    or
    .B dvipdfm
    .RI [ options ]
    .I file\c
    .RB [ .dvi ]

dvipdft(1) is similar.

extractbb(1) uses \c to get out of a font-change macro without
introducing a word break:

    .PP
    For each
    .SM JPEG\c
    ,
    .SM PNG\c
    , or
    .SM PDF
    file given on the command line,

...as does gawk(1):

    .TP
    .B ?:
    The C conditional expression.  This has the form
    .IB expr1 " ? " expr2 " : " expr3\c
    \&.

...and grep(1):

    .SS "Back References and Subexpressions"
    The back-reference
    .BI \e n\c
    \&, where

...and groff(1):

    Include macro file
    .IB name .tmac
    (or
    .BI tmac. name\c
    ); see also
    .BR \%groff_tmac (5).

...and groffer(1):

    A locale name is typically of the form
    .nh
    .IR language [\c
    .B _\c
    .IR territory [\c
    .B .\c
    .IR codeset [\c
    .B @\c
    .IR modifier ]]],
    .hy

...and grohtml(1):

    .TP
    .B \-h
    Generate section and number headings by using
    .BR <B> .\|.\|. </B>
    and increasing the font size, rather than using the
    .BI <H n >\c
    \&.\|.\|.\c
    .BI </H n >
    tags.

...and imgtool(1):

    .br
    .B imgtool get
    .I format image file
    .RI [ newname ]
    .RB [ \-\-filter=\c
    .IR filter ]
    .RB [ \-\-fork=\c
    .IR fork ]
(similar examples follow in the same page)

...and (as noted in the forked-from thread) ksh93(1):

    .SS Field Splitting.
    After parameter expansion and command substitution,
    the results of substitutions are scanned for the field separator
    characters (those found in
    .SM
    .B IFS\^\c
    )
    and split into distinct fields where such characters are found.

...and man(1), which uses this trick extensively:

    .B man
    .RB [\| \-C
    .IR file \|]
    .RB [\| \-d \|]
    .RB [\| \-D \|]
    .RB [\| \-\-warnings \|\c
    .RI [\|= warnings \|]\|]
    .RB [\| \-R
[...]
    .RB [\| \-t \|]
    .RB [\| \-T \|\c
    .RI [\| device \|]\|]
    .RB [\| \-H \|\c
    .RI [\| browser \|]\|]
    .RB [\| \-X \|\c
    .RI [\| dpi \|]\|]
(14 more instances in the page, not including:)
    .\"
    .\" Need a \c to make sure we don't get a space where we don't want one
    .\"

Groff's own gdiffmk appears to have some useless uses of \c:

    .OP \-d \%deletemark
    [\ \c
    .B \-D
    .OP \-B
    .OP \-M "mark1 mark2"
    ]
    .OP \-x \%diffcmd
    .OP \-\-
    .OP \-\-help
    .OP \%\-\-version
    .I \%file1
    .I \%file2
    [\ \c
    .IR \%output \ \c
    ]
    .br

I rewrote the above as follows with no visible effect on the utf8 and ps
devices:

--- gdiffmk.stock.1     2017-04-26 08:13:55.757049173 -0400
+++ gdiffmk.hacked.1    2017-04-26 08:08:56.875417280 -0400
@@ -53,7 +53,7 @@
 .OP \-a \%addmark
 .OP \-c \%changemark
 .OP \-d \%deletemark
-[\ \c
+[
 .B \-D
 .OP \-B
 .OP \-M "mark1 mark2"
@@ -64,8 +64,8 @@
 .OP \%\-\-version
 .I \%file1
 .I \%file2
-[\ \c
-.IR \%output \ \c
+[
+.I \%output
 ]
 .br
 .ad \na

grodvi(1) uses the following construction, whose purpose I cannot guess:

    The fonts are grouped into families
    .B T
    and
    .B H\ \c
    having members in each of these styles:

In context, both escapes look like no-ops to me.  Maybe a bold space is
slightly wider than a roman one on some output devices?  Since it's an
adjustable space anyway I wonder at the significance.  Perhaps there was
a third font family once, and this .B used to be a .BR which needed more
care.

That gets me to page 110 of the 210 I'm looking at.

I'll see what kind of commentary my survey provokes before returning to
the remainder, but I expect it'll be mostly more of the same.

Contra Ingo, I find little in these uses of \c to shock me.  I was more
shocked by some of the stuff I saw inserted by transformation tools that
generate man documents as output.

Regards,
Branden

[1] Method:

$ find /usr/share/man/man* -type f -and -not -type l \
        | xargs zgrep -l '[^\]\\c' | sort >| /tmp/backslash-c.txt
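
Roughly the same filter can be sketched in Python for systems without
zgrep.  This is an approximation under stated assumptions, not the
command run above: the names page_uses_backslash_c and survey are
invented for illustration, pages are assumed to be either plain or
gzip-compressed (judged by a .gz suffix), and the whole tree under the
given root is walked.  Like the grep pattern, it does not match a \c
that begins a line.

```python
import gzip
import os
import re

# Same pattern as the zgrep call: one character other than a backslash,
# followed by a literal backslash and the letter "c".
BACKSLASH_C = re.compile(rb"[^\\]\\c")


def page_uses_backslash_c(path):
    """Return True if any line of the (possibly gzipped) page matches."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as page:
        return any(BACKSLASH_C.search(line) for line in page)


def survey(root):
    """Return (sorted matching pages, total pages) under root.

    Symbolic links and anything that is not a regular file are skipped.
    """
    matches, total = [], 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            total += 1
            if page_uses_backslash_c(path):
                matches.append(path)
    return sorted(matches), total
```

Calling survey("/usr/share/man") returns the list of matching pages and
the number of pages examined; the counts may differ from the ones quoted
above, since the set of installed pages varies from system to system.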

