On Nov 21, 2:33 am, [EMAIL PROTECTED] (Rob Dixon) wrote:
> Francois wrote:
> > I tried to get data from a site which uses cookies and redirects the
> > user. I spent a lot of time getting the same result, "connection timed
> > out", until I realised that all was fine if I didn't send the header...
>
> > Thanks for any explanations !!!
> > Francois
>
> > here is my code:
>
> > use strict;
> > use warnings;
>
> > use LWP;
> > use HTML::Parser;
> > use HTML::FormatText;
> > use HTML::Tree;
> > # use DateTime::Duration;
> > use HTTP::Headers;
> > use HTTP::Cookies;
> > use HTTP::Cookies::Netscape;
> > use CGI qw(header -no_debug);
>
> > my $h = HTTP::Headers->new(
> > Accept => "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
> > Host => "www.unifr.ch",
> > );
>
> > $h->server("Apache/2.0.46 (Red Hat)");
> > $h->user_agent("Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9");
>
> > my $reflink = "http://linkinghub.elsevier.com/retrieve/pii/S0020138307000095";
>
> > my $c = HTTP::Cookies::Netscape->new(file=>'cookies.txt',
> > autosave=>"1");
> > my $ua_short = LWP::UserAgent->new(cookie_jar => $c, timeout=>
> > 20);
> > $ua_short->agent("Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9");
> > # with this line the header is sent with my request and it does not work
> > # my $req = HTTP::Request->new(GET=>$reflink, $h);
>
> > #with this line it's ok ....
> > my $req = HTTP::Request->new(GET=>$reflink);
>
> > my $response =$ua_short->request($req);
> > print header;
> > print $response->status_line,"\n";
> > my $formatter = HTML::FormatText->new();
>
> > if ($response->is_success) {
> > my $tree =
> > HTML::TreeBuilder->new->parse($response->content);
> > my $ascii = $formatter->format($tree);
> > $tree->delete();
> > print $ascii;
> > }
>
> Hi Francois.
>
> As a general rule it's polite to reduce code as much as possible before
> posting it here to ask for help: there's a lot of junk in here that
> isn't relevant to the problem and just needs to be waded through before
> we can give you an answer.
>
> What's going wrong is that you have a Host header value of www.unifr.ch
> but you are sending the request to linkinghub.elsevier.com, which
> doesn't have a host of that name and so doesn't reply.
>
> But that's a huge amount of code just to fetch a web page! You may need
> some of that stuff but I can't see how you would want all of it. How
> about just
>
> my $ua = LWP::UserAgent->new;
> my $resp =
> $ua->get('http://linkinghub.elsevier.com/retrieve/pii/S0020138307000095');
>
> which seems to me to do the same thing.
>
> HTH,
>
> Rob
Hi Rob,
Many thanks for educating me and for the answer. I tried posting to the
libwww forum but have not had an answer yet. The wrong host in my
header also explains the troubles I had with cookies (which was the
topic of my post there).
Thanks again!
Francois
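
For anyone hitting the same timeout: the mismatch Rob describes can be checked before the request is sent. The following is only a minimal sketch, assuming HTTP::Request and HTTP::Headers from libwww-perl are installed; `host_mismatch` is a hypothetical helper written for this illustration, not an LWP function. It compares an explicit Host header (if any) against the host in the request URL and makes no network connection.

```perl
use strict;
use warnings;
use HTTP::Headers;
use HTTP::Request;

my $url = 'http://linkinghub.elsevier.com/retrieve/pii/S0020138307000095';

# host_mismatch is a hypothetical helper, not part of LWP.
# It returns 1 when an explicit Host header names a different host
# than the request URL, and 0 otherwise.
sub host_mismatch {
    my ($req) = @_;
    my $host = $req->header('Host');
    return 0 unless defined $host;        # no explicit Host: LWP derives it from the URL
    (my $name = lc $host) =~ s/:\d+\z//;  # ignore an optional :port suffix
    return $name ne lc $req->uri->host ? 1 : 0;
}

# Same URL twice: once with the stray Host header from the original
# script, once with only an Accept header.
my $bad  = HTTP::Request->new(GET => $url, HTTP::Headers->new(Host   => 'www.unifr.ch'));
my $good = HTTP::Request->new(GET => $url, HTTP::Headers->new(Accept => 'text/html'));

for my $req ($bad, $good) {
    print host_mismatch($req) ? "Host header does not match URL\n" : "ok\n";
}
```

In practice the simplest fix is the one already given: leave Host out of the headers entirely and let LWP fill it in from the URL.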