Bjarni,

At 2024-10-07T23:00:59+0000, Bjarni Ingi Gislason wrote:
> Package: groff
> Version: upstream, GIT HEAD
> Severity: minor
> Tags: patch

If this were a Savannah ticket I would close it as "Invalid"
immediately.

"Unreproducible" also applies.

>    * What led up to the situation?
> 
>      Checking for defects with
> 
> [test-][g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -ww -b -z < "man page"

For those on the groff@ list who don't also subscribe to bug-groff,
allow me to introduce "bjarnigroff", Bjarni's personal fork of groff.

https://savannah.gnu.org/bugs/index.php?go_report=Apply&group=groff&func=browse&set=custom&msort=0&report_id=225&advsrch=0&bug_id=&submitted_by=0&category_id=0&severity=0&bug_group_id=0&resolution_id=0&assigned_to=0&status_id=0&plan_release_id=0&summary=bjarnigroff&history_search=0&history_field=0&history_event=modified&history_date_dayfd=8&history_date_monthfd=10&history_date_yearfd=2024&chunksz=50&spamscore=5&boxoptionwanted=1#options

>   [test-groff is a script in the repository for "groff"] (local copy and
> "troff" slightly changed by me).

It would be more illuminating to note that "test-nroff" is a script of
your creation that does not, and has not ever, existed in GNU groff.

>    * What was the outcome of this action?
> 
> troff: backtrace: file '<stdin>':4162
> troff:<stdin>:4162: warning: trailing space in the line

GNU troff does not emit this diagnostic.

Trailing spaces have (fairly) well-defined semantics in *roff.

roff(7):
     Trailing spaces on text lines ...
     ... are discarded.  The formatter flushes any pending output
     line upon encountering the end of input.

     After the formatter performs an automatic break, it may then adjust
     the line, widening inter‐word spaces until the text reaches the
     right margin.  Extra spaces between words are preserved.  Leading
     and trailing spaces are handled as noted above.  ...

On AT&T troff/nroff, trailing spaces also cancel end-of-sentence
detection.

It might be worth having this as a style warning if/when GNU troff gets
a "style" warning category.

https://savannah.gnu.org/bugs/?62776
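
Until such a category exists, a check of this kind can live outside the
formatter.  What follows is only a minimal sketch, assuming a POSIX
shell and grep(1); the function name is invented for illustration and is
not part of groff.

```shell
# Hypothetical helper, not part of groff: list input lines that end in a
# space or tab.  It does not distinguish text lines from control lines.
trailing_space_lines() {
    grep -n '[[:blank:]]$' "$@"
}

# Example: only the first of these two lines ends in a space.
printf '%s\n' 'foo ' 'bar' | trailing_space_lines
```

The example reports line 1 only.  Given file names as arguments, the
function reads those files instead of the standard input.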

> troff: backtrace: file '<stdin>':11
> troff:<stdin>:11: warning: macro 'I' not defined

This is alarming.  The `I` macro has been in the macro language since
1979[1] and there is no prospect of it going away.  This is a very bad
bug in your fork.  I urge you to fix it.

> troff: backtrace: file '<stdin>':18
> troff:<stdin>:18: warning: macro 'MR' not defined

`MR` is a Plan 9 from User Space troff and groff man(7) macro.

groff_man(7):
   Hyperlink macros
     Man page cross references are best presented with .MR.  Text may be
     hyperlinked to email addresses with .MT/.ME or other sorts of URI
     with .UR/.UE.  ...

     .MT, .ME, .UR, and .UE are GNU extensions supported by Heirloom
     Doctools and mandoc (.UR/.UE since 1.12.3; .MT/.ME since 1.14.2)
     but not by Documenter’s Workbench, Plan 9, or Solaris troffs.
     Plan 9 from User Space’s troff implements .MR.  See subsection “Use
     of extensions” in groff_man_style(7).

[I expect mandoc(1) to support `MR` in its next release.[2]]

[...]
     Prepare arguments to .MR, .MT, and .UR for typesetting; they can
     appear in the output.  Use special character escape sequences to
     encode Unicode basic Latin characters where necessary, particularly
     the hyphen‐minus.

     .MR topic [manual‐section [trailing‐text]]
            (since groff 1.23) Set a man page cross reference as
            “topic(manual‐section)”.  If manual‐section is absent, the
            package omits the surrounding parentheses.  If trailing‐text
            (typically punctuation) is specified, it follows the closing
            parenthesis without intervening space.  Hyphenation is
            disabled while the cross reference is set.  topic is set in
            the font specified by the MF string.  If manual‐section is
            present, the cross reference hyperlinks to a URI of the form
            “man:topic(manual‐section)”.
[...]
     Except for .EX/.EE, James Clark implemented the foregoing features
     in early versions of groff.  Later, groff 1.20 (2009) resurrected
     .EX/.EE and originated .SY/.YS, .TQ, .MT/.ME, and .UR/.UE.  Plan 9
     from User Space’s troff introduced .MR in 2020.

> Output from "test-nroff  -mandoc -t -K utf8 -rF0 -rHY=0 -ww -b -z ":

It is not wise to run the formatter on groff_man.7.man.in, because it is
not a *roff document.  It is input to m4.

If I run groff (from Git HEAD) on groff_man.7 with the same flags,
here's what I get:

$ groff -mandoc -t -K utf8 -rF0 -rHY=0 -ww -b -z ./build/tmac/groff_man.7 && echo DONE
DONE

Consequently, all of your tool's output is spurious.

I very much hope you are not filing bug reports against other projects
with a tool of such poor quality.

>    * What outcome did you expect instead?
> 
>      No output (no warnings).

In that case you need to examine the properties of your own fork.

>   General remarks and further material, if a diff-file exist, are in the
> attachments.

Hmm, I see.  Much advice, some good and much bad, follows.

>   Any program (person), that produces man pages, should check the output
> for defects by using both groff and nroff.
> 
> [test][g|n]roff -mandoc -t -ww -b -z -K utf8  <man page>

First of all, "test-groff" exists only in the build tree of a groff
built from source.  Practically no one is going to have such a tool
installed.

And as noted above, you may be the only person in the world who has a
script named "test-nroff".

Your notation is pretty confusing if you mean to suggest input to a
shell prompt.  Moreover, you've forgotten about the `-` after `test`.

The above advice therefore will mystify your readers, or draw their
derision if they are already clued in about man page composition and
maintenance.  Possibly, if they've received similar reports from you
before, then they will have learned to ignore you through operant
conditioning.

>   Common defects:
> 
>   Input text line longer than 80 bytes.

You should learn to distinguish style matters from correctness matters.

GNU troff has no problem with input lines far in excess of 80 bytes.

As far as I've seen, mandoc(1) has no such problem either.

>   Not removing trailing spaces (in in- and output).

You overreach here in two respects: first, the semantics of trailing
spaces are well defined, as noted above; second, it's none of your
concern or business whether the _input_ to a generator of man(7) applies
semantic value to trailing spaces.  None of man(1), groff(1), or
mandoc(1) cares, either.

>   The reason for these trailing spaces should be found and eliminated.

I sometimes feel the same way about your bug reports.

>   Not beginning each input sentence on a new line.
> Lines should thus be shorter.
> 
>   See man-pages(7), item 'semantic newline'.

That's a sound style guide (saith I, who contributed to it).

I hope that people's impressions of it are not tainted by the frequently
misleading content with which you couple your references to it.

[...]
> and for groff, using
> 
> "printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

This is going to throw "grout" in the reader's face, which few people
have expertise reading.  It is emphatically _not_ a skill that man page
authors or maintainers need to acquire.  If they experience, with their
man pages, trouble so resistant to comprehension that study of GNU troff
(as opposed to output driver) output is necessary, then they should
locate an expert and ask for help.  This list has several.

(That said, I'd like to make "grout" more readable.  But I have to
negotiate with Deri first. :) )

The foregoing is also revealing of a low level of sophistication with
printf(1).  That utility reuses the given format string as often as
necessary to consume all of its arguments, so a single '%s\n' would do.
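
As a small illustration, assuming a POSIX printf(1): the format is
reused until the arguments run out, so one conversion handles both
lines.

```shell
# One '%s\n' is enough: printf(1) cycles through the format again for
# each remaining argument.
printf '%s\n' '.kern 0' '.ss 12 0'
```

This writes the two requests on separate lines, the same output as the
'%s\n%s\n' form quoted above.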

On the bright side, that usage may advertise to the reader that your
prescription originates with someone who is advising beyond their level
of expertise, so by all means retain it.

> Output from "mandoc -T lint groff_man.7.man.in": (possibly shortened list)
> 
> mandoc: groff_man.7.man.in:1232:2: WARNING: skipping paragraph macro: PP empty
> mandoc: groff_man.7.man.in:1242:2: WARNING: skipping paragraph macro: PP empty

I don't get these warnings with mandoc 1.14.6--not on the *.in source
document (which one should NOT be giving to mandoc(1) as input in the
first place as noted above), nor the generated man(7) document.  Have
you forked mandoc(1) too?

Here's the output I _do_ get:

$ mandoc -T lint ./build/tmac/groff_man.7
mandoc: ./build/tmac/groff_man.7:32:2: UNSUPP: unsupported roff request: do
mandoc: ./build/tmac/groff_man.7:2260:2: UNSUPP: unsupported roff request: do

These are fine.  mandoc(1) is complaining about our AT&T compatibility
mode management requests.  Go back to this list's archives for 2017, I
think, for the most recent discussion of them.

mandoc: ./build/tmac/groff_man.7:3:5: STYLE: lower case character in document title: TH groff_man

We expect a future release of mandoc(1) to withdraw this complaint.
Again, search the list archives for a statement from Ingo to this
effect.

mandoc: ./build/tmac/groff_man.7:1209:2: WARNING: skipping paragraph macro: br at the end of SS

This warning is spurious.  Here's the context.

  .\" ====================================================================
  .SS Registers
  .\" ====================================================================
  .
  Registers are described in section \(lqOptions\(rq below.
  .
  They can be set not only on the command line but in the site
  .I man.local
  file as well;
  see section \(lqFiles\(rq below.
  .
  .
  .br
  .ne 7v
  .\" ====================================================================
  .SS Strings
  .\" ====================================================================
  .
  The following strings are defined for use in man pages.

What I'm doing in the foregoing is avoiding "widows"/"orphans"/stranded
paragraph lines.  And resorting to *roff requests to do it, since the
macro package offers no support for this.  (It's a hard problem without
using diversions or, as Doug once noted, "self-renewing input traps", a
gauntlet I'd like to pick up some day.)

Anyway, there's no validity problem here.  A formatter can completely
ignore the `br` and `ne` requests with no harm done to the correctness
of the output.

I'm not sure I'd even ask Ingo to make mandoc(1) detect cases like this
and suppress the warning.  Occasionally, man(7) authors resort to
"expert mode".  It's fine if some tool reminds them that, and where,
they have done so.

> Lines containing '\c' (' \c' does not make sense):

Is that a generalization?

> 25:After processing by m4, both child pages in the above case will carry \c

This is why you shouldn't run man(7) validation tools on things that
aren't man(7) documents.

The foregoing line goes to m4(1)'s "black hole diversion".
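
For readers who have not met the term, here is a minimal sketch of what
the black hole diversion does, assuming m4(1) is installed.  The
function name and the sample text are made up for illustration; they are
not taken from the groff sources.

```shell
# Hypothetical demonstration, not taken from the groff sources.
# Text sent to diversion -1 is thrown away; divert(0) resumes normal
# output, and dnl swallows the newline after the macro call.
black_hole_demo() {
    printf '%s\n' \
        'divert(-1)' \
        'This line is discarded.' \
        'divert(0)dnl' \
        'This line is kept.' |
    m4
}
```

Calling black_hole_demo prints only "This line is kept."; everything
between the two divert calls never reaches the output.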

> 610:.RB "].\|.\|.\& \e\- "\c

This is a style grievance, not a correctness problem.

Here's the context.

  For example,
  a section called \(lqName\(rq or \(lqNAME\(rq must exist,
  must be the first section after the
  .B .TH
  call,
  and must contain only text of the form
  .RS \" Invisibly move left margin to current .IP indentation.
  .RS \" Now indent further, visibly.
  .IR topic [\c
  .BI , " another-topic"\c
  .RB "].\|.\|.\& \e\- "\c
  .I summary-description
  .RE \" Move left margin back to .IP indentation.
  for tools like
  .MR makewhatis 8
  or
  .MR mandb 8
  to index them.
  .RE \" Move left margin back to standard position.

That's some pretty thick business.  Here's how it formats.

            material within sections.  For example, a section called
            “Name” or “NAME” must exist, must be the first section after
            the .TH call, and must contain only text of the form
                   topic[, another‐topic]... \- summary‐description
            for tools like makewhatis(8) or mandb(8) to index them.

I'm accustomed to writing sequences of macro calls like that as
paragraph tags (see the `TP` macro), for example in a man page's
"Options" section.  You're right that it isn't necessary here.

So, thanks.  That means your message was not completely without value.

> Separate an ellipsis from the preceding string with a space
> character, if it does not mean a continuation of it.
> 
> See a manual of style about the difference between "abc..." and
> "abc ...".
> 
> 4162:To get a \(lqliteral\(rq.\|.\|.  .\|.\|.should be input.
> 4212:Instead of.\|.\|.        .\|.\|.should be considered.
> 4401:Instead of.\|.\|.        .\|.\|.should be considered.
> 4460:Instead of.\|.\|.        .\|.\|.do this.

No, wrong--the existing usage is idiomatic English.  I turn your advice
around and suggest that _you_ consult a style manual.

> Change a HYPHEN-MINUS (code 0x2D) to a minus(-dash) (\-),
> if it
> is in front of a name for an option,
> is a symbol for standard input,
> is a single character used to indicate an option,
> or is in the NAME section (man-pages(7)).
> N.B. - (0x2D), processed as a UTF-8 file, is changed to a hyphen
> (0x2010, groff \[u2010] or \[hy]) in the output.
> 
> 3619:.TP 9.25n \" "-rHY=0" + 2n + hand-tuned for PDF

That's in a comment, Bjarni.  It depicts the width _as formatted_.

> Three full stops (periods) are used for an ellipsis
> 
> 3376:.\" ..and which Clark included in groff man(7) from 1.01 or earlier...

All right, then.  I'll fix this comment.

> Add a zero (0) in front of a decimal fraction that begins with a period
> (.)
> 
> 3540:.\" .5v after, as well as...

No.  Not only is this standard/accepted usage, the numeric expression
`.5v` is valid *roff, too.

> Split a punctuation from a single argument, if a two-font macro is meant
> 
> 228:.I roff;
> 252:.I break.
> 301:.I arguments,
> 490:.I section,
> 500:.I header-middle;
> 569:.I heading-text.
> 635:.I subheading-text.
> 750:.I inset-amount,
> 876:.I indentation,
> 1577:.I trailing-text.
> 1632:.I trailing-text.
> 3205:.I system.
> 3421:.I version.
> 3566:.I groff.
> 3622:.I adjustment-mode,
> 3715:.I footer-distance;
> 3849:.I subsection-indentation.
> 4350:.I level.

Nope.  I said what I meant.  The output looks better this way when
typeset.  However, I tend to set trailing punctuation in roman after an
italicized word if I expect what is italicized to be copy-and-pasted;
that makes it easier to aim one's pointing device for selection.

> Output from "test-groff  -mandoc -t -K utf8 -rF0 -rHY=0 -ww -b -z ":
> 
> troff: backtrace: file '<stdin>':4162
> troff:<stdin>:4162: warning: trailing space in the line

Not present in groff Git.  Maybe you damaged the document in your fork.

> Additionally (general):
> 
> Abbreviations get a '\&' added after their final full stop (.) to mark them as
> such and not as an end of sentence.
> 
> There is no need to add a '\&' before a full stop (.) if it has a character
> before it!

Why are you prescribing this with respect to groff_man(7)?  You show
exhibits of other things--why not this one?

> -After processing by m4, both child pages in the above case will carry \c
> +After processing by m4, both child pages in the above case will carry
>  escape sequences followed by text lines starting with punctuation one
>  normally does not find in that position (and in the case of the period,
>  which has to be protected from interpretation as a control line).

Once again, this part of the file is not a man(7) document.  See above
regarding m4 and the black hole diversion.  If these terms are
unfamiliar to you, then learn about m4.  There's a nice introduction in
Volume 2 of the Seventh Edition Unix Programmer's Manual.

Furthermore, if you would read the sentence for comprehension, you would
recognize that it is discussing the `\c` escape sequence _specifically_.

> -.I roff;
> +.IR roff ;

No; see above.

> -.I break.
> +.IR break .

No; see above.

>  Some macros interpret
> -.I arguments,
> +.IR arguments ,

No; see above.

> -.I section,
> +.IR section ,

No; see above.

> -.I header-middle;
> +.IR header-middle ;

No; see above.

> -.I heading-text.
> +.IR heading-text .

No; see above.

> -.RB "].\|.\|.\& \e\- "\c
> +.RB "].\|.\|.\&" " \e\- "\c

Acknowledged.  A macro-recast will be in my next push.

> -.I subheading-text.
> +.IR subheading-text .

No; see above.

> -.\" Also see subsection "History" below...
> +.\" Also see subsection "History" below ...

No; see above.

> -.I inset-amount,
> +.IR inset-amount ,

No; see above.

> -.I indentation,
> +.IR indentation ,

No; see above.

> @@ -1229,7 +1229,6 @@ produces the following output.
>  .YS
>  .
>  .
> -.P
>  .SY groff
>  .B \-h
>  .YS

You didn't prepare your reader for this proposed change in any way.

Moreover, it is dead wrong for the forthcoming groff 1.24.

I guess you overlooked or have forgotten the discussions on this list in
about April of this year.

NEWS:
*  The behavior of the an (man) package's `SY` and `YS` macros has been
   expanded to enable greater user control over vertical spacing and to
   make them convenient for synopsizing C language functions, not just
   commands.  `SY` no longer puts vertical space on the output, and
   initially breaks the output line _only_ if it is encountered
   repeatedly without a preceding `YS` call.  The computed indentation
   of synopsis lines after the first now also includes the width of
   anything already on the output line, so that you can precede the `SY`
   call with, for instance, the C language data type used for the return
   value in a function prototype.  The `SY` macro now accepts an
   optional second argument.  This second argument is typeset in bold,
   replaces the fixed-width space that is appended to the synopsis
   keyword in `SY`'s single-argument form, and is used in computation of
   the indentation of non-initial synopsis lines.  However, this
   computed indentation can now also be overridden with that of the
   previous synopsis item.  To do this, give any argument to the `YS`
   macro call "closing" the synopsis whose indentation you want to
   reuse.  When you're done with such a grouped synopsis, leave the
   argument off the final `YS` call.

   In a "Synopsis" section of a man page, existing synopses consisting
   of a single item require no migration.  This is the most common case.

   For others, where before you would write...

   .SY mv
   .I source
   .I destination
   .YS
   .
   .SY mv
   .I source
   \&.\|.\|.
   .I destination-directory
   .YS

   ...you would now write the following.

   .SY mv
   .I source
   .I destination
   .YS
   .
   .
   .P
   .SY mv
   .I source
   \&.\|.\|.
   .I destination-directory
   .YS

   (That is, simply add a paragraphing macro.)

    And where before you would write...

   .SY mv
   .B \-h
   .
   .SY mv
   .B \-\-help
   .YS

    ...you would now write the following.

   .SY mv
   .B \-h
   .YS
   .
   .SY mv
   .B \-\-help
   .YS

   (That is, simply add `YS` after the first synopsis item.)

   Likely the biggest benefit of these changes is that it is now much
   easier to format C function prototypes with these macros.  Here's how
   we would synopsize a somewhat complex standard C library function.

   .B "#include <stdio.h>"
   .P
   .B void *\c
   .SY bsearch (
   .BI const\~void\~*\~ key ,
   .BI const\~void\~*\~ base ,
   .BI size_t\~ nmemb ,
   .BI int\~(* compar )\c
   .B (const\~void\~*, const\~void\~*));
   .YS

> @@ -1239,7 +1238,6 @@ produces the following output.
>  .YS
>  .
>  .
> -.P
>  .SY groff
>  .B \-v
>  .RI [ option\~ .\|.\|.\&]

No; see above.

> -.\" ...because it is followed by characters that are transparent to
> -.\" end-of-sentence detection, and a newline...
> +.\" ... because it is followed by characters that are transparent to
> +.\" end-of-sentence detection, and a newline ...

No; see above.

> -.I trailing-text.
> +.IR trailing-text .

No; see above.

> -.I trailing-text.
> +.IR trailing-text .

No; see above.

> @@ -2318,7 +2316,6 @@ file as well;
>  see section \(lqFiles\(rq below.
>  .
>  .
> -.br
>  .ne 7v
>  .\" ====================================================================
>  .SS Strings
> @@ -2780,7 +2777,7 @@ End a text line without inserting space

No; see above.  In fact, it is _wrong_ to delete only the `br` here,
because then the `ne`eded space will be calculated from the baseline of
a _pending_ output line if one exists, which throws off the computation.
I had to learn this the hard way a few years ago.

> -.\" end-of-sentence detection is performed, and...
> +.\" end-of-sentence detection is performed, and ...

No; see above.

> -.I system.
> +.IR system .

No; see above.

> -.\" ..and which Clark included in groff man(7) from 1.01 or earlier...
> +.\" ... and which Clark included in groff man(7) from 1.01 or earlier ...
>  is deprecated.

Acknowledged.  The dot count will be corrected in my next push.  The
extra space will not appear, as it is incorrect.

> -.I version.
> +.IR version .

No; see above.

> -.\" ...and de-documented .LP...
> +.\" ... and de-documented .LP ...

No; see above.

> -.\" ...as well as \n[PD], which we implement but don't expose.
> +.\" ... as well as \n[PD], which we implement but don't expose.

No; see above.

> -.\" rules (\[br]) as margin characters, as well as...
> +.\" rules (\[br]) as margin characters, as well as ...

No; see above.

> -.\" .5v after, as well as...
> +.\" 0.5v after, as well as ...

No; see above.

> -.\" <https://lists.gnu.org/archive/html/groff/2019-07/msg00038.html>...
> +.\" <https://lists.gnu.org/archive/html/groff/2019-07/msg00038.html> ...

No; see above.

> -.I groff.
> +.IR groff .

No; see above.

> -.\" ...along with implementations of OP, EX, and EE.
> +.\" ... along with implementations of OP, EX, and EE.

No; see above.

> -.TP 9.25n \" "-rHY=0" + 2n + hand-tuned for PDF
> +.TP 9.25n \" "\-rHY=0" + 2n + hand-tuned for PDF

No; see above.

> -.I adjustment-mode,
> +.IR adjustment-mode ,

No; see above.

> -.I footer-distance;
> +.IR footer-distance ;

No; see above.

> -.I subsection-indentation.
> +.IR subsection-indentation .

No; see above.

> @@ -4159,7 +4156,7 @@ this translation is sometimes not desira
>  .TS
>  Lb   Lb
>  RfCR LfCR.
> -To get a \(lqliteral\(rq .\|.\|.\&   .\|.\|.\& should be input.
> +To get a \(lqliteral\(rq .\|.\|.     .\|.\|.should be input.
>  _
>  \(aq \(rs(aq
>  \-   \(rs\-

This case merits comment.  While the dummy character escape sequences
are in fact unnecessary here, because they are embedded in ordinary
(non-text-block) table entries...

tbl(1):
     Ordinarily, a table entry is typeset rigidly.  It is not filled,
     broken, hyphenated, adjusted, or populated with additional inter‐
     sentence space.

...I chose to retain them here for pedagogical reasons.  I would rather
that inexpert man(7) authors _always_ follow their ellipses with `\&`
_unless_ they are deliberately ending a sentence with one.
groff_man_style(7) covers this subject.

For example, if they decide to convert the table to ordinary prose, the
"unnecessary" dummy characters may stop being unnecessary.

> -Instead of.\|.\|.    .\|.\|.should be considered.
> +Instead of .\|.\|.   .\|.\|. should be considered.

No; see above.  (But maybe I should _add_ dummy characters here!)

> @@ -4226,6 +4223,7 @@ _
>  \fR.\|.\|.   .RE
>  _
>  \&.B one two \(dq\(dq three  .B one two three
> +_
>  .TE

You didn't prepare your reader for this proposed change in any way.

I think the table looks okay without a rule at the bottom.  This is a
highly discretionary matter and you should not be mixing such things in
with purported correctness advice without distinguishing it.

> -.I level.
> +.IR level .

No; see above.

> -Instead of.\|.\|.    .\|.\|.should be considered.
> +Instead of .\|.\|.   .\|.\|. should be considered.

No; see above.  (But maybe I should _add_ dummy characters here!)

> @@ -4455,9 +4453,9 @@ when not ending a sentence.
>  .if t .ne 5v
>  .if n .ne 7v \" account for horizontal rules
>  .TS
> -Cb   Cb
> +Lb   Lb

You didn't prepare your reader for this proposed change in any way.

The other "Instead of..."/"...do this." table centers its headings.
Your change would introduce an inconsistency.

Your level of attention to detail is erratic.

>  LfCR LfCR.
> -Instead of.\|.\|.    .\|.\|.do this.
> +Instead of .\|.\|.   .\|.\|. do this.

No; see above.  (But maybe I should _add_ dummy characters here!)

Bjarni, I counsel you to try harder to improve the quality of your
recommendations.  There was far, far more chaff than wheat here.

And don't forget: you're not helping the man page maintenance community
by making unsound style recommendations predicated on the output of a
tool they cannot obtain.

Your attempted elimination of the `I` macro from the man(7) language is
a radioactively bad idea and should not be considered even for a second.

If you continue in this vein, I expect I'll be making many references to
this email in the future when advising man page maintainers.

Irritably,
Branden

[1] https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/man/man7/man.7
[2] https://cvsweb.bsd.lv/mandoc/roff.c?rev=1.400&content-type=text/x-cvsweb-markup&sortby=date
