I want to use CURL to download news from Wall Street Journal Online. When you visit the WSJ home page, you're forwarded to an authentication page to enter your name and password, and then forwarded back to the home page. I want my CURL command to send the authentication cookie so when it's forwarded to the authentication page it forwards right back to the home page without having to enter the name and password.
I can get the following CURL command to run fine at the command prompt, but not in PHP:
THIS WORKS
curl --cookie "WSJIE_LOGIN=blahblahblah" -L -O "http://online.wsj.com/home/us"
THIS DOESN'T WORK
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://online.wsj.com/home/us");
curl_setopt($ch, CURLOPT_COOKIE, "WSJIE_LOGIN=blahblahblah");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
curl_close($ch);
I used a packet sniffer to see how this works. When I request the home page (above) and send the WSJIE_LOGIN cookie, the home page redirects to the authentication page. The authentication page uses the WSJIE_LOGIN cookie to generate more cookies. These 5-6 cookies are then sent back to the home page and give the user access to the content. The WSJIE_LOGIN cookie is my own personal authentication cookie; the other cookies change from time to time. But I noticed that PHP's CURL isn't carrying these other cookies along when it is redirected back to the home page, the way the command-line CURL does. Here are excerpts from the packet capture:
CLI CURL
...
192.168.001.100.63745-206.157.193.068.00080: GET /home/us HTTP/1.1
User-Agent: curl/7.10.2 (powerpc-apple-darwin7.0) libcurl/7.10.2 OpenSSL/0.9.7b zlib/1.1.4
Cookie: WSJIE_LOGIN=abc
Host: online.wsj.com
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Cookie: fastlogin=xyz; wsjproducts=xyz; user_type=xyz; REMOTE_USER=xyz; UBID=xyz
...
PHP CURL
...
192.168.001.100.63750-206.157.193.068.00080: GET /home/us HTTP/1.1
Cookie: WSJIE_LOGIN=abc
Host: online.wsj.com
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
...
PHP's curl doesn't forward the cookies it was given by the previous page, so, of course, I don't get my content. Any ideas why?
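
One thing that may be worth trying, sketched below, rests on an unconfirmed assumption: that the difference between the two captures is that libcurl's cookie engine is not switched on in the PHP case. CURLOPT_COOKIE only adds a fixed Cookie header, while CURLOPT_COOKIEFILE (documented to enable cookie handling even when given an empty string) makes libcurl remember Set-Cookie values and send them on later requests, including redirects. The helper names wsj_curl_options and fetch_with_cookies are hypothetical, made up for this illustration.

```php
<?php
// Hypothetical helper: builds the option set for the request.
// Assumption: enabling the cookie engine is what is missing in the PHP case.
function wsj_curl_options(string $url, string $loginCookie): array
{
    return [
        CURLOPT_URL            => $url,
        // Fixed cookie sent with the request, as in the original script.
        CURLOPT_COOKIE         => $loginCookie,
        // Empty string: load no cookie file, but turn on cookie handling so
        // cookies set by one response are sent with the following requests.
        CURLOPT_COOKIEFILE     => '',
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Hypothetical helper: performs the request and returns the body,
// or null if the transfer failed.
function fetch_with_cookies(string $url, string $loginCookie): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, wsj_curl_options($url, $loginCookie));
    $content = curl_exec($ch);
    if ($content === false) {
        // Report the libcurl error instead of failing silently.
        fwrite(STDERR, 'curl error: ' . curl_error($ch) . "\n");
        return null;
    }
    return $content;
}
```

It would be called as fetch_with_cookies("http://online.wsj.com/home/us", "WSJIE_LOGIN=blahblahblah"). If the cookies also need to survive between separate runs of the script, CURLOPT_COOKIEJAR can be pointed at a writable file; that is not needed just to follow the redirects within one request.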
Richard Miller
-- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php