Thanks for the fast response. I am not sure how to enter the proxy info in the call.
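
To make the question concrete, this is the kind of call I have in mind. It is only a sketch under assumptions: cookiefile = "" comes from the suggestion quoted below, proxy and proxyport are the standard libcurl option names, and the host, the port and the helper name fetch_tables are placeholders I made up. I do not know whether libcurl's proxy option is even the right mechanism for EZproxy.

```r
library(RCurl)
library(XML)

# Placeholder address from my first message, not the real site.
url <- "http://www.theurl.com"

# libcurl options collected in one list so every request uses the same settings.
# cookiefile = "" switches on in-memory cookie handling (see the quoted reply).
# proxy / proxyport are standard libcurl options; the values are made-up placeholders.
opts <- list(
  cookiefile     = "",
  followlocation = TRUE,
  proxy          = "proxy.example.edu",
  proxyport      = 2048L
)

# Hypothetical helper: fetch a page with one curl handle and parse its HTML tables.
# Reusing a single handle keeps any cookies it receives for later requests.
fetch_tables <- function(url, opts) {
  curl <- getCurlHandle(.opts = opts)
  txt  <- getURLContent(url, curl = curl)
  readHTMLTable(htmlParse(txt, asText = TRUE))
}

# Not run here, since it needs network access and real proxy details:
# content <- fetch_tables(url, opts)
```

If the proxy values belong somewhere else, or are not needed at all, that is exactly what I am trying to find out.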
I am working via EZProxy (which, I think, rewrites URLs). According to their website, it works like this:

1. Within the config.txt/ezproxy.cfg file, various hosts are identified that require access from a local IP address.
2. A remote user makes a web connection to port 2048 of your EZproxy server.
3. When the user authenticates successfully, a cookie is sent to the user's browser.
4. The user's browser presents this cookie during each access to EZproxy.

So, for example, if I enter URL 1, EZproxy dynamically changes it to URL 2:

1. http://www.scopus.com/results/...
2. http://www-scopus-com.ezproxy.cul.columbia.edu/results/...

What kind of proxy information should I look for, and where do I enter it in the call?

Your help is very much appreciated.

Thanks.


Duncan Temple Lang wrote
>
> Apologies for following up on my own mail, but I forgot
> to explicitly mention that you will need to specify the
> appropriate proxy information in the call to getURLContent().
>
>   D.
>
> On 6/7/12 8:31 AM, Duncan Temple Lang wrote:
>> To just enable cookies and their management, use the cookiefile
>> option, e.g.
>>
>>    txt = getURLContent(url, cookiefile = "")
>>
>> Then you can pass this to readHTMLTable(), best done as
>>
>>    content = readHTMLTable(htmlParse(txt, asText = TRUE))
>>
>>
>> The function readHTMLTable() doesn't use RCurl and doesn't
>> handle cookies.
>>
>>   D.
>>
>> On 6/7/12 7:33 AM, mdvaan wrote:
>>> Hi,
>>>
>>> I am trying to access a website and read its content. The website is a
>>> restricted access website that I access through a proxy server (which
>>> therefore requires me to enable cookies). I have problems in allowing
>>> RCurl to receive and send cookies.
>>>
>>> The following lines give me:
>>>
>>> library(RCurl)
>>> library(XML)
>>>
>>> url <- "http://www.theurl.com"
>>> content <- readHTMLTable(url)
>>>
>>> content
>>> $`NULL`
>>>   V1
>>> 1
>>> 2 Cookies disabled
>>> 3
>>> 4 Your browser currently does not accept cookies.\rCookies need to be
>>> enabled for Scopus to function properly.\rPlease enable session cookies
>>> in your browser and try again.
>>>
>>> $`NULL`
>>>   V1 V2 V3
>>> 1
>>>
>>> $`NULL`
>>>   V1
>>> 1 Cookies disabled
>>>
>>> $`NULL`
>>>   V1
>>> 1
>>> 2
>>> 3
>>>
>>> I have carefully read section 4.4 of
>>> http://www.omegahat.org/RCurl/RCurlJSS.pdf and tried the following
>>> without success:
>>>
>>> curl <- getCurlHandle()
>>> curlSetOpt(cookiejar = 'cookies.txt', curl = curl)
>>>
>>> Any suggestions on how to allow for cookies?
>>>
>>> Thanks.
>>>
>>> Math
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/How-to-set-cookies-in-RCurl-tp4632693.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@ mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>

--
View this message in context: http://r.789695.n4.nabble.com/How-to-set-cookies-in-RCurl-tp4632693p4632714.html
Sent from the R help mailing list archive at Nabble.com.