Kenshi Muto wrote:
> Package: html2text
> Version: 1.3.2a-6
> Severity: normal
> Tags: experimental l10n
>
> Hi,
>
> As I replied at debian-devel, html2text 1.3.2a-6 couldn't handle
> (at least) Japanese UTF-8 web page.
>
> I attached the example tarball.
>
> * index.html: sample page, was taken from www.debian.org and
>   modified smaller, and converted the encoding to UTF-8.
> * h2t-1.png: browse index.html.
> * i1.txt: converted text with -utf8 option.
> * h2t-2.png: browse i1.txt.
> * i2.txt: converted text without any options.
> * h2t-3.png: browse i2.txt.
>
> In my quick view, there are problems around decorated strings, such
> as <h*> or <a>.
>
> Thanks,

Please, try the same with option '-nobs'. Does the problem remain?

-- 
Eugene V. Lyubimkin aka JackYF, Ukrainian C++ developer.