clair.crossup...@googlemail.com wrote:
Thank you Duncan.

I remember seeing in your documentation that you have used this
'verbose=TRUE' argument in functions before when trying to see what is
going on. This is good. However, I have not been able to get it to
work for me. Does the output appear in R, or do you use some other
external window (e.g. an MS DOS window)?


The libcurl code prints its verbose output on the console by default,
so in the Windows GUI it will not show up. Running R from a shell
(an MS DOS window or a Unix-like shell) should cause the output
to be displayed.

A more general way, however, is to use the debugfunction
option:

d = debugGatherer()

getURL("http://uk.youtube.com";,
        debugfunction = d$update, verbose = TRUE)

When this completes, use

 d$value()

and you have the entire contents that would be displayed on the console.
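
For example, once the call has finished, something along these lines
lets you inspect what was gathered (a minimal sketch; I am assuming the
gatherer splits its output into named pieces such as "headerIn", so
check names(d$value()) on your version):

 # print the whole debug trace, as it would have appeared on the console
 cat(d$value())

 # or look at an individual piece, e.g. the headers the server sent back
 cat(d$value()["headerIn"])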


 D.



library(RCurl)
my.url <- 
'http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2'
getURL(my.url, verbose = TRUE)
[1] ""


I am having a problem with a new webpage (http://uk.youtube.com/), but
if I can get this verbose output to work, then I think I will be able
to google the right action to take based on the information it gives.

Many thanks for your time,
C.C.


On 26 Jan, 16:12, Duncan Temple Lang <dun...@wald.ucdavis.edu> wrote:
clair.crossup...@googlemail.com wrote:
Dear R-help,
There seems to be a web page I am unable to download using RCurl. I
don't understand why it won't download:
library(RCurl)
my.url <- 
"http://www.nytimes.com/2009/01/07/technology/business-computing/07pro...";
getURL(my.url)
[1] ""
  I like the irony that RCurl seems to have difficulties downloading an
article about R.  Good thing it is just a matter of additional arguments
to getURL() or it would be bad news.

The followlocation parameter defaults to FALSE, so

   getURL(my.url, followlocation = TRUE)

gets what you want.
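
So for the page in question, something like this should now return the
article's HTML (a small sketch; maxredirs is optional and, if your
RCurl/libcurl exposes it, simply caps how many redirects are followed):

 library(RCurl)
 my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"
 txt <- getURL(my.url, followlocation = TRUE, maxredirs = 10)
 nchar(txt)   # non-zero now, rather than the empty "" you were seeing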

The way I found this is

  getURL(my.url, verbose = TRUE)

and take a look at the information sent from R to the server and the
responses received back.

This gives

* About to connect() to www.nytimes.com port 80 (#0)
*   Trying 199.239.136.200... * connected
* Connected to www.nytimes.com (199.239.136.200) port 80 (#0)
> GET /2009/01/07/technology/business-computing/07program.html?_r=2 HTTP/1.1
Host: www.nytimes.com
Accept: */*

< HTTP/1.1 301 Moved Permanently
< Server: Sun-ONE-Web-Server/6.1
< Date: Mon, 26 Jan 2009 16:10:51 GMT
< Content-length: 0
< Content-type: text/html
< Location: http://www.nytimes.com/glogin?URI=http://www.nytimes.com/2009/01/07/t...
<

And the 301 is the critical thing here.
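
The 301 means the document has moved and the new address is given in
the Location: header; followlocation = TRUE simply tells libcurl to go
on and fetch that address. If you only want the status code rather than
the full verbose trace, a header gatherer is another way to see it
(a sketch; I am assuming basicHeaderGatherer() is available in your
version of RCurl):

 h <- basicHeaderGatherer()
 getURL(my.url, headerfunction = h$update)
 h$value()[c("status", "statusMessage")]   # "301" "Moved Permanently"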

  D.



Other web pages are fine to download, but this is the first time I have
been unable to download a web page using the very nice RCurl package.
While I can download the web page using RDCOMClient, I would like
to understand why it doesn't work as above, please.
library(RDCOMClient)
my.url <- 
"http://www.nytimes.com/2009/01/07/technology/business-computing/07pro...";
ie <- COMCreate("InternetExplorer.Application")
txt <- list()
ie$Navigate(my.url)
NULL
while(ie[["Busy"]]) Sys.sleep(1)
txt[[my.url]] <- ie[["document"]][["body"]][["innerText"]]
txt
$`http://www.nytimes.com/2009/01/07/technology/business-computing/
07program.html?_r=2`
[1] "Skip to article Try Electronic Edition Log ...
Many thanks for your time,
C.C
Windows Vista, running with administrator privileges.
sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32
locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
base
other attached packages:
[1] RDCOMClient_0.92-0 RCurl_0.94-0
loaded via a namespace (and not attached):
[1] tools_2.8.1


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
