clone 292671 -1
retitle -1 libwww-perl: LWP produces confusing warning message when downloading iso-8859-1 encoded pages
reassign -1 libwww-perl
thanks

On Thu, Feb 24, 2005 at 06:58:14PM +0000, Julian Gilbey wrote:
> On Thu, Feb 24, 2005 at 04:58:46PM +0100, Loïc Minier wrote:
> >  Let's see with C locale:
> >     bee% rm -rf .devscripts_cache/bts/296747*
> >     bee% LC_ALL=C bts --cache-mode=full --mbox bug 296747
> >     Downloading http://bugs.debian.org/296747 ... Parsing of undecoded
> >     UTF-8 will give garbage when decoding entities at
> >     /usr/share/perl5/LWP/Protocol.pm line 114.
> >     Parsing of undecoded UTF-8 will give garbage when decoding entities
> >     at /usr/share/perl5/LWP/Protocol.pm line 114.
> >     Parsing of undecoded UTF-8 will give garbage when decoding entities
> >     at /usr/share/perl5/LWP/Protocol.pm line 114.
> >     ...

The code that triggers the warning is the following:

    use LWP::UserAgent;
    use HTTP::Request;
    my $ua = LWP::UserAgent->new;
    my $request = HTTP::Request->new('GET', "http://bugs.debian.org/296747");
    my $response = $ua->request($request);

and requires a reasonably recent version of libhtml-parser-perl to be
installed (3.45-1, for example, is recent enough).

Note that this bug report is in charset=iso-8859-1 and has two
adjacent high bit characters in the maintainer's name.  Also, the
page has some entities which are decoded (&lt;), which triggers this
warning.  I don't have any obvious suggestions for how to fix this
bug, though :(

   Julian

