Package: icu-devtools
Version: 72.1-6
Severity: minor
Tags: patch

   * What led up to the situation?

     Checking for defects with a new version

test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z < "man page"

  [Use "grep -e ' $' -e '\\~$' <file>" to find obvious trailing spaces.]

  ["test-groff" is a script in the repository for "groff"; it is not shipped]
(local copy and "troff" slightly changed by me).

  [The fate of "test-nroff" was decided in groff bug #55941.]

   * What was the outcome of this action?

Output from "test-groff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z ":

an.tmac:<stdin>:16: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:19: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:22: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:25: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
troff:<stdin>:39: warning: trailing space in the line
an.tmac:<stdin>:46: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:49: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:54: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:58: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:63: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
an.tmac:<stdin>:84: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
an.tmac:<stdin>:88: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
an.tmac:<stdin>:92: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
an.tmac:<stdin>:113: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.

   * What outcome did you expect instead?

     No output (no warnings).

-.-

  General remarks and further material, and a diff file if one exists,
are in the attachments.


-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.12-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages icu-devtools depends on:
ii  libc6       2.40-7
ii  libgcc-s1   14.2.0-17
ii  libicu72    72.1-6
ii  libstdc++6  14.2.0-17

icu-devtools recommends no packages.

icu-devtools suggests no packages.

-- no debconf information
Input file is genbrk.1

Output from "mandoc -T lint genbrk.1": (shortened list)

      1 input text line longer than 80 bytes: contains a byte orde...
      1 whitespace at end of input line

Remove trailing space with: sed -e 's/ *$//'

-.-.

Output from "test-nroff -mandoc -t -ww -z genbrk.1": (shortened list)

      9 Use macro '.B' for one argument or split argument.
      4 Use macro '.I' for one argument or split argument.
      9 .BR is for at least 2 arguments, got 1
      4 .IR is for at least 2 arguments, got 1
      1 trailing space in the line

Remove trailing space with: sed -e 's/ *$//'

-.-.

Remove space characters (whitespace) at the end of lines.
Use "git apply ... --whitespace=fix" to fix extra space issues, or use
global configuration "core.whitespace".

Number of lines affected is

1

-.-.

Use the correct macro for the font change of a single argument or
split the argument into two.

16:.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
19:.BR "\-V\fP, \fB\-\-version"
22:.BR "\-c\fP, \fB\-\-copyright"
25:.BR "\-v\fP, \fB\-\-verbose"
46:.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
49:.BR "\-V\fP, \fB\-\-version"
54:.BR "\-c\fP, \fB\-\-copyright"
58:.BR "\-v\fP, \fB\-\-verbose"

-.-.

Wrong distance (not two spaces) between sentences in the input file.

  Separate the sentences and subordinate clauses; each begins on a new
line.
See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").

  The best procedure is to always start a new sentence on a new line,
at least, if you are typing on a computer.

Remember coding: Only one command ("sentence") on each (logical) line.

E-mail: Easier to quote exactly the relevant lines.

Generally: Easier to edit the sentence.

Patches: Less unaffected text.

Search for two adjacent words is easier, when they belong to the same line,
and the same phrase.

  The amount of space between sentences in the output can then be
controlled with the ".ss" request.

Mark a final abbreviation point as such by suffixing it with "\&".

Some sentences (etc.) do not begin on a new line.

Split (sometimes) lines after a punctuation mark; before a conjunction.

  Lines with only one (or two) space(s) between sentences could be split,
so latter sentences begin on a new line.

Use

#!/usr/bin/sh

sed -e '/^\./n' \
-e 's/\([[:alpha:]]\)\.  */\1.\n/g' "$1"

to split lines after a sentence period
(a period followed by at least one space).
Check result with the difference between the formatted outputs.
See also the attachment "general.bugs"

39:and creates a break iteration data file. Normally this data file has the
89:is interpreted as Unicode. Without the BOM,
98:Specifies the directory containing ICU data. Defaults to
100:Some tools in ICU depend on the presence of the trailing slash. It is thus

-.-.

Split lines longer than 80 characters into two or more lines.
Appropriate break points are the end of a sentence and a subordinate
clause; after punctuation marks.
Add "\:" to split the string for the output, "\<newline>" in the source.

Line 85, length 93

contains a byte order mark (BOM) at the beginning of the file, which is the Unicode character

-.-.

Split a punctuation mark from a single argument, if a two-font macro is meant.

86:.B U+FEFF,

-.-.

One space only after a possible end of sentence
(after a punctuation mark that can end a sentence).

genbrk.1:39:and creates a break iteration data file. Normally this data file has the
genbrk.1:89:is interpreted as Unicode. Without the BOM,
genbrk.1:98:Specifies the directory containing ICU data. Defaults to
genbrk.1:100:Some tools in ICU depend on the presence of the trailing slash. It is thus

-.-.

Put a subordinate sentence (after a comma) on a new line.

genbrk.1:70:For example, the file
genbrk.1:85:contains a byte order mark (BOM) at the beginning of the file, which is the Unicode character
genbrk.1:93:was written, it is recommended that you write this file in UTF-8

-.-.
Generally:

Split (sometimes) lines after a punctuation mark; before a conjunction.
--- genbrk.1 2025-03-10 13:52:38.106194019 +0000
+++ genbrk.1.new 2025-03-10 14:19:30.348268015 +0000
@@ -13,92 +13,97 @@
 .SH SYNOPSIS
 .B genbrk
 [
-.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
+.BR \-h ", " \-? ", " \-\-help
 ]
 [
-.BR "\-V\fP, \fB\-\-version"
+.BR \-V ", " \-\-version
 ]
 [
-.BR "\-c\fP, \fB\-\-copyright"
+.BR \-c ", " \-\-copyright
 ]
 [
-.BR "\-v\fP, \fB\-\-verbose"
+.BR \-v ", " \-\-verbose
 ]
 [
-.BI "\-d\fP, \fB\-\-destdir" " destination"
+.BR \-d ", " \-\-destdir " \fIdestination\fP"
 ]
 [
-.BI "\-i\fP, \fB\-\-icudatadir" " directory"
+.BR \-i ", " \-\-icudatadir " \fIdirectory\fP"
 ]
-.BI "\-r\fP, \fB\-\-rules" " rule\-file"
-.BI "\-o\fP, \fB\-\-out" " output\-file"
+.BR \-r ", " \-\-rules " \fIrule\-file\fP"
+.BR \-o ", " \-\-out " \fIoutput\-file\fP"
 .SH DESCRIPTION
 .B genbrk
 reads the break (boundary) rule source code from
 .I rule-file
-and creates a break iteration data file. Normally this data file has the 
+and creates a break iteration data file.
+Normally this data file has the
 .B .brk
 extension.
 .PP
 The details of the rule syntax can be found in ICU's User Guide.
 .SH OPTIONS
 .TP
-.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
+.BR \-h ", " \-? ", " \-\-help
 Print help about usage and exit.
 .TP
-.BR "\-V\fP, \fB\-\-version"
+.BR \-V ", " \-\-version
 Print the version of
 .B genbrk
 and exit.
 .TP
-.BR "\-c\fP, \fB\-\-copyright"
+.BR \-c ", " \-\-copyright
 Embeds the standard ICU copyright into the
 .IR output-file .
 .TP
-.BR "\-v\fP, \fB\-\-verbose"
+.BR \-v ", " \-\-verbose
 Display extra informative messages during execution.
 .TP
-.BI "\-d\fP, \fB\-\-destdir" " destination"
+.BR \-d ", " \-\-destdir " \fIdestination\fP"
 Set the destination directory of the
-.IR output-file
+.I output-file
 to
 .IR destination .
 .TP
-.BI "\-i\fP, \fB\-\-icudatadir" " directory"
+.BR \-i ", " \-\-icudatadir " \fIdirectory\fP"
 Look for any necessary ICU data files in
 .IR directory .
-For example, the file
+For example,
+the file
 .B pnames.icu
 must be located when ICU's data is not built as a shared library.
 The default ICU data directory is specified by the environment variable
 .BR ICU_DATA .
 Most configurations of ICU do not require this argument.
 .TP
-.BI "\-r\fP, \fB\-\-rules" " rule\-file"
+.BR \-r ", " \-\-rules " \fIrule\-file\fP"
 The source file to read.
 .TP
-.BI "\-o\fP, \fB\-\-out" " output\-file"
+.BR \-o ", " \-\-out " \fIoutput\-file\fP"
 The output data file to write.
 .SH CAVEATS
 When the
-.IR rule-file
-contains a byte order mark (BOM) at the beginning of the file, which is the Unicode character
-.B U+FEFF,
+.I rule-file
+contains a byte order mark (BOM) at the beginning of the file,
+which is the Unicode character
+.BR U+FEFF ,
 then the
-.IR rule-file
-is interpreted as Unicode. Without the BOM,
+.I rule-file
+is interpreted as Unicode.
+Without the BOM,
 the file is interpreted in the current operating system default codepage.
 In order to eliminate any ambiguity of the encoding for how the
-.IR rule-file
-was written, it is recommended that you write this file in UTF-8
-with the BOM.
+.I rule-file
+was written,
+it is recommended that you write this file in UTF-8 with the BOM.
 .SH ENVIRONMENT
 .TP 10
 .B ICU_DATA
-Specifies the directory containing ICU data. Defaults to
+Specifies the directory containing ICU data.
+Defaults to
 .BR ${prefix}/share/icu/72.1/ .
-Some tools in ICU depend on the presence of the trailing slash. It is thus
-important to make sure that it is present if
+Some tools in ICU depend on the presence of the trailing slash.
+It is thus important to make sure that it is present if
 .B ICU_DATA
 is set.
 .SH AUTHORS
@@ -110,5 +115,4 @@ Andy Heninger
 .SH COPYRIGHT
 Copyright (C) 2005 International Business Machines Corporation and others
 .SH SEE ALSO
-.BR http://www.icu-project.org/userguide/boundaryAnalysis.html
-
+.B http://www.icu-project.org/userguide/boundaryAnalysis.html
  Any program (person), that produces man pages, should check the output
for defects by using (both groff and nroff)

  [gn]roff -mandoc -t -ww -b -z -K utf8 <man page>

  The same goes for man pages that are used as an input.

  For a style guide use

  mandoc -T lint

-.-

  Any "autogenerator" should check its products with the above mentioned
'groff', 'mandoc', and additionally with 'nroff ...'.

  It should also check its input files for too long (> 80) lines.

  This is just a simple quality control measure.

  The "autogenerator" may have to be corrected to get a better man page,
as may the source file and any additional file.

  Common defects:

  Not removing trailing spaces (in input and output).
  The reason for these trailing spaces should be found and eliminated.

  "git" has a "tool" to point out whitespace,
see for example "git-apply(1)" and "git-config(1)".

  Not beginning each input sentence on a new line.
Doing so reduces line length and patch size.

  The script "reportbug" uses 'quoted-printable' encoding when a line is
longer than 1024 characters in an 'ascii' file.

  See man-pages(7), item "semantic newline".

-.-

The difference between the formatted output of the original and patched file
can be seen with:

  nroff -mandoc <file1> > <out1>
  nroff -mandoc <file2> > <out2>
  diff -d -u <out1> <out2>

and for groff, using

"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

instead of 'nroff -mandoc'

  Add the option '-t', if the file contains a table.

  Read the output from 'diff -d -u ...' with 'less -R' or similar.

-.-.

  If 'man' (man-db) is used to check the manual for warnings,
the following must be set:

  The option "--warnings=w"

  The environment variable:

export MAN_KEEP_STDERR=yes (or any non-empty value)

  or

  (produce only warnings):

export MANROFFOPT="-ww -b -z"

export MAN_KEEP_STDERR=yes (or any non-empty value)

-.-
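
  The two source-level checks named above (trailing spaces, and lines
longer than 80) can be sketched as a small shell function.
This is only an illustration: the name "check_man_source" is made up
here and is not part of groff, mandoc or man-db, and the function does
not replace running those tools.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with groff, mandoc or man-db.
# Reports trailing whitespace and over-long lines in a man page source.
check_man_source() {
    # $1: path of the man page source file
    awk '
        # A space or tab directly before the end of the line.
        /[ \t]$/ { printf "%d: trailing whitespace\n", NR }
        # length() counts characters or bytes, depending on the awk
        # implementation and the locale.
        length($0) > 80 { printf "%d: longer than 80 characters (%d)\n", NR, length($0) }
    ' "$1"
}
```

  Called as "check_man_source genbrk.1", it prints one line per defect,
prefixed with the input line number, and prints nothing for a clean file.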