clone 292671 -1
retitle -1 libwww-perl: LWP produces confusing warning message when downloading iso-8859-1 encoded pages
reassign -1 libwww-perl
thanks

On Thu, Feb 24, 2005 at 06:58:14PM +0000, Julian Gilbey wrote:
> On Thu, Feb 24, 2005 at 04:58:46PM +0100, Loïc Minier wrote:
> >  Let's see with C locale:
> >     bee% rm -rf .devscripts_cache/bts/296747*
> >     bee% LC_ALL=C bts --cache-mode=full --mbox bug 296747
> >     Downloading http://bugs.debian.org/296747 ... Parsing of undecoded
> >     UTF-8 will give garbage when decoding entities at
> >     /usr/share/perl5/LWP/Protocol.pm line 114.
> >     Parsing of undecoded UTF-8 will give garbage when decoding entities
> >     at /usr/share/perl5/LWP/Protocol.pm line 114.
> >     Parsing of undecoded UTF-8 will give garbage when decoding entities
> >     at /usr/share/perl5/LWP/Protocol.pm line 114.
> >     ...

The code which does it is the following:

    my $ua = new LWP::UserAgent;
    my $request = HTTP::Request->new('GET', "http://bugs.debian.org/296747");
    my $response = $ua->request($request);

and requires a reasonably recent version of libhtml-parser-perl to be
installed (say 3.45-1 will suffice).

Note that this bug report is in charset=iso-8859-1 and has two adjacent
high bit characters in the maintainer's name.  Also, the page has some
entities which are decoded (<), which triggers this warning.

I don't have any obvious suggestions for how to fix this bug, though :(

   Julian