Package: html2text
Version: 1.3.2a-15
Severity: normal

Dear Maintainer,

The following simple html file causes an "input recoding" failure:

$ cat sample.htm
<html><body>
<table BORDER="1">
<tr><td> </td></tr>
</table>
</body></html>
$ html2text sample.htm
Input recoding failed due to invalid input sequence. Unconverted part of text follows.
�|
$

Removing or replacing the non-breakable space or setting the border to 0
allows html2text to process the file correctly.

Placing a character (or multiple characters) after the non-breakable space
also allows html2text to process the file correctly, although the first
character after the non-breakable space is not displayed.

I was able to replicate the failure on Squeeze (so it's not a new bug).

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages html2text depends on:
ii  libc6       2.13-35
ii  libgcc1     1:4.7.1-7
ii  libstdc++6  4.7.1-7

html2text recommends no packages.

Versions of packages html2text suggests:
ii  curl  7.26.0-1
ii  wget  1.13.4-3

-- no debconf information