Hi Branden,

On Fri, May 02, 2025 at 09:29:07AM -0500, G. Branden Robinson wrote:
> At 2025-05-02T14:42:12+0200, Alejandro Colomar wrote:
> > By default, I prefer keeping adjustment.  Often, I want to see changes
> > in adjustment too as part of the diff.  Maybe I should add an option to
> > disable adjustment optionally, which could be useful in those cases
> > where the diff is a bit hard to understand.
> 
> For myself, I found that editorial changes to recast wording or
> otherwise add and remove material led to cascading reports of
> differences _only_ to spaces in adjusted lines, which usually aren't of
> interest to me.

I've changed my mind.  I think it's better to disable it by default in
diffman-git(1), and I can enable it easily anyway.  I've applied the
following patch:

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=637b0aa571b61d98c717e7ab7490df8a3d9e4841>

commit 637b0aa571b61d98c717e7ab7490df8a3d9e4841
Author: Alejandro Colomar <a...@kernel.org>
Date:   Fri May 2 17:08:20 2025 +0200

    src/bin/diffman-git: Disable adjustment by default
    
    One can still enable it by setting an empty MANROFFOPT.
    
    Suggested-by: "G. Branden Robinson" <bran...@debian.org>
    Signed-off-by: Alejandro Colomar <a...@kernel.org>

diff --git a/src/bin/diffman-git b/src/bin/diffman-git
index ede506c91..25c0a98b6 100755
--- a/src/bin/diffman-git
+++ b/src/bin/diffman-git
@@ -31,6 +31,7 @@ git rev-parse --show-toplevel | read -r dir;
 cd "$dir";
 
 test -v MAN_KEEP_FORMATTING || export MAN_KEEP_FORMATTING=1;
+test -v MANROFFOPT          || export MANROFFOPT='-d AD=l';
 
 # shellcheck disable=SC2206  # We want only non-empty variables in the array.
 opts=($s $w $u);
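
For illustration, here is a minimal sketch of how that default behaves.
The function name effective_manroffopt is made up for this example and
is not part of diffman-git(1); it also assumes bash 4.2 or later, since
it relies on 'test -v'.

```bash
#!/bin/bash
# Sketch of the defaulting logic from the patch above.
# effective_manroffopt is a hypothetical helper, not part of diffman-git(1).
effective_manroffopt()
{
	# test -v is true for any set variable, even an empty one,
	# so an empty MANROFFOPT overrides the default.
	test -v MANROFFOPT || MANROFFOPT='-d AD=l';
	printf '%s\n' "$MANROFFOPT";
}
```

With MANROFFOPT unset, this prints '-d AD=l'.  With MANROFFOPT set but
empty, it prints an empty line, so nothing is passed and adjustment
stays enabled.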

> > > for P in *.[157]
> > > do
> > >     if [ "$P" = groff_mmse.7 ]
> > >     then
> > >       LOCALE=-msv
> > >     else
> > >       LOCALE=
> > >     fi
> > 
> > What's -msv?
> 
> groff_tmac(5):
> 
>    Localization packages
>      For Western languages, an auxiliary package for localization sets
>      the hyphenation mode and loads hyphenation patterns and exceptions.
>      Localization files can also adjust the date format and provide
>      translations of strings used by some of the full‐service macro
>      packages; alter the input encoding (see the next section); and
>      change the amount of additional inter‐sentence space.  For Eastern
>      languages, the localization file defines character classes and sets
>      flags on them.  By default, troffrc loads the localization file for
>      English.
> ...
>      sv     Swedish; localizes man, me, mm, mom, and ms.  Sets the input
>             encoding to Latin‐1 by loading latin1.tmac.  Some of the
>             localization of the mm package is handled separately; see
>             groff_mmse(7).

Hmmm.

> > >     echo $0: $P >&2
> > >     echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
> > >     groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
> > > ...
> > > done
> > 
> > Would you mind sharing the entire script?  I might get ideas for
> > improving diffman-git(1).
> 
> Sure; it's crude and dumb (like its author?)--I don't generally spend a
> lot of software engineering effort on stuff I produce only for my own
> consumption.  I've attached it.  The script name is revealing of some of
> my music listening habits.
> 
> > (And maybe you can drop your script if
> > diffman-git(1) would be good-enough for you.)
> 
> If it stops working for the limited purpose I require it, I may look
> into alternatives.  :)

I suggest you try it.  It has some nice features, like specifying the
number of context lines, or ignoring white space changes (which is
useful to confirm that some change only affects spacing but nothing
else).  It also allows you to diff arbitrary commits, without having to
store a copy of the formatted output.
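
The white space check can be sketched with plain diff(1) on two files
of already formatted output.  This is only an approximation of the
idea, not how diffman-git(1) does it, and only_spacing_differs is a
name invented for the example.  It assumes a diff(1) that supports -w
(GNU and BSD do).

```bash
#!/bin/bash
# only_spacing_differs is a hypothetical helper for illustration.
# It succeeds if the two files differ, but not once white space is ignored.
only_spacing_differs()
{
	! diff -q "$1" "$2" >/dev/null \
	&& diff -qw "$1" "$2" >/dev/null;
}
```

It fails both when the files are identical and when they differ in
more than spacing.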

> > The RE movement is intended to indent the "Since Linux 6.7," para.
> 
> I'd need to look at more context, and haven't, but `IP` already does
> that.

That para was a continuation of a TP, and now is changed to be a
continuation of a nested TP (thus the RS).  See the diff with some more
context, which might clarify:

$ MANWIDTH=72 diffman-git -U20 HEAD^
--- HEAD^^:man/man2const/TIOCLINUX.2const
+++ HEAD^:man/man2const/TIOCLINUX.2const
@@ -24,75 +24,84 @@
             Get task information.  Disappeared in Linux 1.1.92.
 
      subcode=TIOCL_SETSEL
             Set selection.  argp points to a
 
                 struct {
                     char  subcode;
                     short xs, ys, xe, ye;
                     short sel_mode;
                 };
 
             xs and ys are the starting column and row.  xe and ye are
             the ending column and row.  (Upper left corner is row=col‐
             umn=1.)  sel_mode may be one of the following operations:
 
             TIOCL_SELCHAR
                    Select character‐by‐character.  The indicated screen
                    characters are highlighted and saved in a kernel
                    buffer.
 
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
+
             TIOCL_SELWORD
                    Select word‐by‐word, expanding the selection out‐
                    wards to align with word boundaries.  The indicated
                    screen characters are highlighted and saved in a
                    kernel buffer.
 
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
+
             TIOCL_SELLINE
                    Select line‐by‐line, expanding the selection out‐
                    wards to select full lines.  The indicated screen
                    characters are highlighted and saved in a kernel
                    buffer.
 
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
+
             TIOCL_SELPOINTER
                    Show the pointer at position (xs, ys) or (xe, ye),
                    whichever is later in text flow order.
 
             TIOCL_SELCLEAR
                    Remove the current selection highlight, if any, from
                    the console holding the selection.
 
                    This does not affect the stored selected text.
 
             TIOCL_SELMOUSEREPORT
                    Make the terminal report (xs, ys) as the current
                    mouse location using the xterm(1) mouse tracking
                    protocol (see console_codes(4)).  The lower 4 bits
                    of sel_mode (TIOCL_SELBUTTONMASK) indicate the de‐
                    sired button press and modifier key information for
                    the mouse event.
 
                    If mouse reporting is not enabled for the terminal,
                    this operation yields an EINVAL error.
 
-            Since Linux 6.7, using this subcode requires the
-            CAP_SYS_ADMIN capability.
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
 
      subcode=TIOCL_PASTESEL
             Paste selection.  The characters in the selection buffer
             are written to fd.
 
             Since Linux 6.7, using this subcode requires the
             CAP_SYS_ADMIN capability.
 
      subcode=TIOCL_UNBLANKSCREEN
             Unblank the screen.
 
      subcode=TIOCL_SELLOADLUT
             Sets contents of a 256‐bit look up table defining charac‐
             ters in a "word", for word‐by‐word selection.  (Since Linux
             1.1.32.)
 
             Since Linux 6.7, using this subcode requires the
             CAP_SYS_ADMIN capability.
 
      subcode=TIOCL_GETSHIFTSTATE


> #!/bin/bash
> 
> set -e
> 
> if [ $# -ne 1 ]
> then
>     echo "need a directory argument (e.g., \"old\", \"new\")" >&2
>     exit 1
> fi

Being diffman-git(1), it uses the git(1) repository to find the old
pages and the new ones.  No need to specify paths.

> 
> if ! [ -x ./build/test-groff ]
> then
>     echo "./build/test-groff does not exist or is not executable" >&2
>     exit 2
> fi
> 
> groff () {
>     ../build/test-groff "$@"
> }

I use man(1), so it would be a matter of passing an appropriate PATH to
run your development groff version.
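
Something like the following would do, as a rough sketch.  The helper
name with_path_prefix is made up, and which directory actually holds a
groff executable in your build tree is an assumption you'd have to fill
in.

```bash
#!/bin/bash
# with_path_prefix is a hypothetical helper for illustration.
# It runs a command with a directory prepended to PATH, in a subshell,
# so the caller's PATH is left untouched.
with_path_prefix()
{
	local dir="$1";
	shift;
	( PATH="$dir:$PATH"; export PATH; "$@"; )
}
```

For example, 'with_path_prefix /path/to/dev/groff/bin diffman-git HEAD^'
(with a placeholder path) would make man(1) find that groff first.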

> 
> BFLAG=
> #BFLAG=-b
> DIR=$1
> 
> MANS=(
> ./src/utils/lkbib/lkbib.1.man
> ./src/utils/tfmtodit/tfmtodit.1.man
> ./src/utils/hpftodit/hpftodit.1.man
> ./src/utils/pfbtops/pfbtops.1.man
> ./src/utils/afmtodit/afmtodit.1.man
> ./src/utils/lookbib/lookbib.1.man
> ./src/utils/addftinfo/addftinfo.1.man
> ./src/utils/xtotroff/xtotroff.1.man
> ./src/utils/indxbib/indxbib.1.man
> ./src/roff/nroff/nroff.1.man
> ./src/roff/troff/troff.1.man
> ./src/roff/groff/groff.1.man
> ./src/utils/grog/grog.1.man
> ./src/devices/grodvi/grodvi.1.man
> ./src/devices/grolbp/grolbp.1.man
> ./src/devices/grops/grops.1.man
> ./src/devices/grohtml/grohtml.1.man
> ./src/devices/grolj4/grolj4.1.man
> ./src/devices/grotty/grotty.1.man
> ./src/devices/gropdf/gropdf.1.man
> ./src/devices/gropdf/pdfmom.1.man
> ./src/devices/xditview/gxditview.1.man
> ./src/preproc/preconv/preconv.1.man
> ./src/preproc/tbl/tbl.1.man
> ./src/preproc/soelim/soelim.1.man
> ./src/preproc/eqn/eqn.1.man
> ./src/preproc/eqn/neqn.1.man
> ./src/preproc/pic/pic.1.man
> ./src/preproc/refer/refer.1.man
> ./src/preproc/grn/grn.1.man
> ./contrib/pic2graph/pic2graph.1.man
> ./contrib/hdtbl/groff_hdtbl.7.man
> ./contrib/mm/groff_mm.7.man
> ./contrib/mm/mmroff.1.man
> ./contrib/grap2graph/grap2graph.1.man
> ./contrib/rfc1345/groff_rfc1345.7.man
> ./contrib/eqn2graph/eqn2graph.1.man
> ./contrib/gpinyin/gpinyin.1.man
> ./contrib/mom/groff_mom.7.man
> ./contrib/gdiffmk/gdiffmk.1.man
> ./contrib/glilypond/glilypond.1.man
> ./contrib/chem/chem.1.man
> ./contrib/gperl/gperl.1.man
> ./man/groff_tmac.5.man
> ./man/groff_out.5.man
> ./man/groff_diff.7.man
> ./man/groff_char.7.man
> ./man/groff.7.man
> ./man/roff.7.man
> ./man/groff_font.5.man
> ./tmac/groff_trace.7.man
> ./tmac/groff_me.7.man
> ./tmac/groff_ms.7.man
> ./tmac/groff_man.7.man
> ./tmac/groff_man_style.7.man
> ./tmac/groff_mdoc.7.man
> ./tmac/groff_www.7.man
> )

I calculate the MANS dynamically with a regex:

        case $# in
        0)  git diff --name-only;               ;;
        1)  git diff --name-only "$1^..$1";     ;;
        *)  git diff --name-only "$1..$2";      ;;
        esac \
        | grep -E '(\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>|\.man)+(\.man|\.in)*$' \
        | sortman \

> 
> MANS_SV=(
> ./contrib/mm/groff_mmse.7.man
> )
> 
> mkdir "$DIR"
> pushd "$DIR" >/dev/null
> 
> # the change logs, so we know approximately where we are
> cp ../ChangeLog .
> 
> for d in chem gdiffmk glilypond gperl gpinyin hdtbl mm mom rfc1345 sboxes
> do
>       cp ../contrib/$d/ChangeLog ./ChangeLog.$d
> done
> 
> # our Texinfo manual
> cp ../build/doc/groff.txt .
> 
> # our Texinfo manual via HTML
> cp ../build/doc/groff.html .
> lynx -dump groff.html > groff.html.txt
> 
> # our ms manuals
> groff $BFLAG -ww -Tutf8 -ept -ms ../doc/ms.ms > ms.txt
> 
> # our me manuals
> #groff $BFLAG -ww -Tutf8 -me ../doc/meintro.me > meintro.txt
> #groff $BFLAG -ww -Tutf8 -kt -me -mfr ../doc/meintro_fr.me > meintro_fr.txt
> #groff $BFLAG -ww -Tutf8 -me ../doc/meref.me > meref.txt
> me_pre=../ATTIC/my.me
> groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meintro.me > meintro.txt
> groff $BFLAG -ww -Tutf8 -kt -me -mfr $me_pre ../build/doc/meintro_fr.me \
>     > meintro_fr.txt
> groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meref.me > meref.txt
> 
> for F in ${MANS[*]} ${MANS_SV[*]}
> do
>     G=../build/${F%.man}
>     if [ -f "$G" ]
>     then
>         cp "$G" .
>     else
>         echo "warning: \"$G\" missing" >&2
>     fi
> done
> 
> : ${AD:=l}
> 
> ARGS="$BFLAG -ww -dAD=$AD -rCHECKSTYLE=3 -rU1 -Tutf8 -e -t -mandoc"
> NOCR=-rcR=0
> LOCALE=
> ARGS_HTML="$BFLAG -ww -rCHECKSTYLE=3 -Thtml -e -t -mandoc -P-C -P-G"
> 
> for P in *.[157]
> do
>     if [ "$P" = groff_mmse.7 ]
>     then
>       LOCALE=-msv
>     else
>       LOCALE=
>     fi
> 
>     echo $0: $P >&2
>     echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
>     groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
>     echo "groff $ARGS $LOCALE $NOCR $P" > "$P.no-cR.txt"
>     groff $ARGS $LOCALE $NOCR "$P" >> "$P.no-cR.txt"
>     echo "<!-- groff $ARGS_HTML $LOCALE -P-I$P $P -->" > "$P.html"
>     groff $ARGS_HTML $LOCALE -P-I$P $P >> "$P.html"
>     rm "$P"
> done

Hmmm, my script is dumber; it only calls man(1).  But I guess that's
enough.  You can always tell man(1) to pass stuff to groff(1).

        | sortman \
        | while read -r f; do \
                case $# in
                0)  old="HEAD:$f";  new="./$f";   ;;
                1)  old="$1^:$f";   new="$1:$f";  ;;
                *)  old="$1:$f";    new="$2:$f";  ;;
                esac;

                case $# in
                0)  cat "$new";       ;;
                *)  git show "$new";  ;;
                esac \
                | man /dev/stdin \
                | diff --label "$old" --label "$new" "${opts[@]}" \
                        <(git show "$old" | man /dev/stdin) \
                        /dev/stdin \
                || true;
        done;
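
The core of that loop can be tried without git(1) or man(1).  In this
sketch, format is a stand-in for 'man /dev/stdin' (here it just calls
fold(1)), and diff_formatted is an invented name; both are assumptions
for the example.  It needs bash for the process substitution and a
diff(1) with --label.

```bash
#!/bin/bash
# format is a hypothetical stand-in for "man /dev/stdin":
# any filter that reads the page on stdin would work here.
format()
{
	fold -w 72;
}

# diff_formatted is likewise a made-up name for illustration.
# It formats both files and diffs the results, labelling each side.
diff_formatted()
{
	format <"$2" \
	| diff -u --label "old:$1" --label "new:$2" \
		<(format <"$1") \
		/dev/stdin \
	|| true;  # diff(1) exits with 1 when the inputs differ.
}
```

The '|| true' matters in a loop over several pages: without it, a
difference in one page would look like a failure.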

> 
> popd >/dev/null

popd(1) at the end of a script is not useful.  Or do you source the
script?

> 
> # vim:set ai et sw=4 ts=4 tw=80:


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>
