Package: lintian
Version: 2.3.4
Severity: normal

I think lintian is wrong in how it tests manpages for encoding issues.

I have installed hello-debhelper (2.5-1). Then I downloaded its binary package hello-debhelper_2.5-1_amd64.deb and extracted hello.1 from it into a working directory.

In short, instead of using a complicated test, lintian should use iconv for the encoding test.

Let me show the problem. First, the manpage installed by the hello-debhelper package displays fine under both LANG=C and LANG=en_US.UTF-8. I tested it with

$ LANG=C man hello
$ LANG=en_US.UTF-8 man hello

The only difference is the copyright line. LANG=C shows the copyright sign as (C), while UTF-8 uses the fancy ©. No problem.

But

$ lintian -i hello-debhelper_2.5-1_amd64.deb
W: hello-debhelper: manpage-has-errors-from-man usr/share/man/man1/hello.1.gz Invalid or incomplete multibyte or wide character
N:
N:   This man page provokes warnings or errors from man.
N:
N:   "cannot adjust" or "can't break" are trouble with paragraph filling,
N:   usually related to long lines. Adjustment can be helped by left
N:   justifying, breaks can be helped with hyphenation, see "Manipulating
N:   Filling and Adjusting" and "Manipulating Hyphenation" in the manual.
N:
N:   "can't find numbered character" usually means latin1 etc in the input,
N:   and this warning indicates characters will be missing from the output.
N:   You can change to escapes like \[:a] described on the groff_char man
N:   page.
N:
N:   Other warnings are often formatting typos, like missing quotes around a
N:   string argument to .IP. These are likely to result in lost or malformed
N:   output. See the groff_man (or groff_mdoc if using mdoc) man page for
N:   information on macros.
N:
N:   This test uses man's --warnings option to enable groff warnings that
N:   catch common mistakes, such as putting . or ' characters at the start of
N:   a line when they are intended as literal text rather than groff
N:   commands. This can be fixed either by reformatting the paragraph so that
N:   these characters are not at the start of a line, or by adding a
N:   zero-width space (\&) immediately before them.
N:
N:   At worst, warning messages can be disabled with the .warn directive, see
N:   "Debugging" in the groff manual.
N:
N:   To test this for yourself you can use the following command:
N:   LANG=C MANWIDTH=80 man --warnings -E UTF-8 -l manpage-file >/dev/null
N:
N:   Severity: normal, Certainty: certain
N:

$ LANG=C MANWIDTH=80 man --warnings -E UTF-8 -l hello.1 >hello.txt
col: Invalid or incomplete multibyte or wide character
$ iconv -f utf8 -t utf8 hello.1 >/dev/null && echo "UTF-8 compatible" || echo "non-UTF-8 found"
UTF-8 compatible
$ iconv -f ascii -t ascii hello.1 >/dev/null && echo "ascii compatible" || echo "non-ascii found"
ascii compatible

The first test is the one used by lintian. The second and third tests are mine, to check the encoding of the manpage source itself.

$ LANG=C MANWIDTH=80 man --warnings -l hello.1 >hello.c.txt
$ LANG=en_US.UTF-8 MANWIDTH=80 man --warnings -E UTF-8 -l hello.1 >hello.u.txt
$ LANG=C MANWIDTH=80 man --warnings -E UTF-8 -l hello.1 >hello.cu.txt
col: Invalid or incomplete multibyte or wide character
$ ls -l hello.*.txt
-rw-rw-r-- 1 osamu osamu 1417 Mar 28 09:21 hello.c.txt
-rw-rw-r-- 1 osamu osamu    0 Mar 28 09:53 hello.cu.txt
-rw-rw-r-- 1 osamu osamu 1418 Mar 28 09:21 hello.u.txt
$ diff -u hello.*.txt
--- hello.c.txt 2010-03-28 09:21:07.000000000 +0900
+++ hello.u.txt 2010-03-28 09:21:26.000000000 +0900
@@ -32,7 +32,7 @@
        General help using GNU software: <http://www.gnu.org/gethelp/>
 
 COPYRIGHT
-       Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU
+       Copyright © 2010 Free Software Foundation, Inc. License GPLv3+: GNU
        GPL version 3 or later <http://gnu.org/licenses/gpl.html>
        This is free software: you are free to change and redistribute it.
        There is NO WARRANTY, to the extent permitted by law.

The corresponding groff source has "\(co", as in

Copyright \(co 2010 Free Software Foundation, Inc.

This is an embedded nroff escape, which is handled OK in both locales.

So the situation is clear. There is no non-ASCII character in the source of the manpage. The manpage can be interpreted properly with the current tool set. But the test used by lintian breaks on the groff copyright mark escape.

I made hellox.1, in which "\(co" is replaced with the UTF-8 character "©". This is a real error :-)

$ iconv -f ascii -t ascii hellox.1 >/dev/null && echo "ascii compatible" || echo "non-ascii found"
iconv: illegal input sequence at position 828
non-ascii found
$ iconv -f utf8 -t utf8 hellox.1 >/dev/null && echo "UTF-8 compatible" || echo "non-UTF-8 found"
UTF-8 compatible
$ LANG=C MANWIDTH=80 man --warnings -E UTF-8 -l hellox.1 >hellox.cu.txt
col: Invalid or incomplete multibyte or wide character
$ LANG=en_US.UTF-8 MANWIDTH=80 man --warnings -E UTF-8 -l hellox.1 >hellox.u.txt
$ LANG=C MANWIDTH=80 man --warnings -l hellox.1 >hellox.c.txt
$ ls -l hellox.*.txt
-rw-rw-r-- 1 osamu osamu 1417 Mar 28 10:03 hellox.c.txt
-rw-rw-r-- 1 osamu osamu    0 Mar 28 10:02 hellox.cu.txt
-rw-rw-r-- 1 osamu osamu 1418 Mar 28 10:03 hellox.u.txt
$ diff -u hellox.c.txt hellox.u.txt
--- hellox.c.txt 2010-03-28 10:03:28.000000000 +0900
+++ hellox.u.txt 2010-03-28 10:03:12.000000000 +0900
@@ -32,7 +32,7 @@
        General help using GNU software: <http://www.gnu.org/gethelp/>
 
 COPYRIGHT
-       Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU
+       Copyright © 2010 Free Software Foundation, Inc. License GPLv3+: GNU
        GPL version 3 or later <http://gnu.org/licenses/gpl.html>
        This is free software: you are free to change and redistribute it.
        There is NO WARRANTY, to the extent permitted by law.

groff is smart enough to convert the UTF-8 "©" to "(C)" under LANG=C. A simple iconv test detects the error.

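The iconv checks used in this report can be wrapped into one small helper. This is only a sketch of the idea, not lintian code: the function name manpage_encoding and the two throwaway files are made up for illustration, and it assumes an iconv that exits non-zero on an illegal input sequence (as the glibc one does in the transcript above).

```shell
#!/bin/sh
# Hypothetical helper, not part of lintian: classify the encoding of a
# manpage source file by round-tripping it through iconv.
manpage_encoding() {
    # $1: path to an uncompressed manpage source file
    if iconv -f ASCII -t ASCII "$1" >/dev/null 2>&1; then
        echo "ascii"
    elif iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1; then
        echo "utf-8"
    else
        echo "other"
    fi
}

# Demonstration on two throwaway files that mimic the copyright line of
# hello.1 (groff escape) and hellox.1 (literal UTF-8 character).
tmpdir=$(mktemp -d)
printf 'Copyright \\(co 2010 Free Software Foundation, Inc.\n' >"$tmpdir/escape.1"
printf 'Copyright \302\251 2010 Free Software Foundation, Inc.\n' >"$tmpdir/literal.1"
manpage_encoding "$tmpdir/escape.1"
manpage_encoding "$tmpdir/literal.1"
rm -rf "$tmpdir"
```

The file with the \(co escape should be reported as plain ASCII and the file with the literal © as UTF-8 only, matching the iconv results shown for hello.1 and hellox.1. Unlike the man-based test, this looks only at the bytes of the source and does not depend on the locale it is run in.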
-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages lintian depends on:
ii  binutils               2.20.1-3          The GNU assembler, linker and bina
ii  diffstat               1.47-1            produces graph of changes introduc
ii  dpkg-dev               1.15.5.6          Debian package development tools
ii  file                   5.04-1            Determines file type using "magic"
ii  gettext                0.17-10           GNU Internationalization utilities
ii  intltool-debian        0.35.0+20060710.1 Help i18n of RFC822 compliant conf
ii  libapt-pkg-perl        0.1.24            Perl interface to libapt-pkg
ii  libclass-accessor-perl 0.34-1            Perl module that automatically gen
ii  libipc-run-perl        0.84-1            Perl module for running processes
ii  libparse-debianchangel 1.1.1-2           parse Debian changelogs and output
ii  libtimedate-perl       1.2000-1          collection of modules to manipulat
ii  liburi-perl            1.53-1            module to manipulate and access UR
ii  locales-all [locales]  2.10.2-6          Embedded GNU C Library: Precompile
ii  man-db                 2.5.7-2           on-line manual pager
ii  perl [libdigest-sha-pe 5.10.1-11         Larry Wall's Practical Extraction

lintian recommends no packages.

Versions of packages lintian suggests:
pn  binutils-multiarch     <none>            (no description available)
pn  libtext-template-perl  <none>            (no description available)
ii  man-db                 2.5.7-2           on-line manual pager

-- no debconf information