Dear R-Help,

From reading the help file, it is my understanding that the download.file()
function does not support HTTPS connections. So, understandably,
the following produces an error:

### R Code
> url <- "https://stat.ethz.ch/pipermail/r-help/2008-October/thread.html"
> destfile <- "//PFO-SBS001/Redirected/tonyb/Desktop/R_web_test/tmp.txt"
> download.file(url, destfile)
Error in download.file(url, destfile) : unsupported URL scheme

My question is: what if I remove the 's' from the 'https' URL? The
download.file() function then seems to work fine (please see below). Did I
just get lucky with the URL I used, or can I in general simply rewrite
'https' as 'http'? My long-term goal is to download hundreds of web pages
and then somehow remove all of the HTML tags so that only the web page text
remains. No private information is being sent or received for this task (no
passwords etc. are used).
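
To make the tag-removal goal concrete, here is a minimal sketch using only base R regular expressions. The helper name strip_html is hypothetical (it is not from any package), and a regex approach like this is crude: it does not decode entities such as &amp; and can be fooled by unusual markup, so a real HTML parser may be the better choice.

```r
# Hypothetical helper (not part of any package): crude regex-based tag removal.
strip_html <- function(html) {
  # Treat the page as one string so tags spanning several lines are handled
  txt <- paste(html, collapse = "\n")
  # Drop <script> and <style> blocks together with their contents
  txt <- gsub("(?is)<(script|style)\\b.*?</\\1>", " ", txt, perl = TRUE)
  # Replace every remaining tag with a space
  txt <- gsub("<[^>]+>", " ", txt, perl = TRUE)
  # Collapse runs of whitespace, then trim a leading/trailing space
  txt <- gsub("\\s+", " ", txt, perl = TRUE)
  gsub("^ | $", "", txt)
}

# Made-up input for illustration only
page <- c("<html><head><title>R-help</title></head>",
          "<body><p>Hello <b>world</b></p></body></html>")
strip_html(page)
```

For a downloaded file, something like strip_html(readLines(destfile)) would be the equivalent call.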

### R Code
> url <- "http://stat.ethz.ch/pipermail/r-help/2008-October/thread.html"
> destfile <- "//PFO-SBS001/Redirected/tonyb/Desktop/R_web_test/tmp.txt"
> download.file(url, destfile)
trying URL 'http://stat.ethz.ch/pipermail/r-help/2008-October/thread.html'
Content type 'text/html; charset=ISO-8859-1' length 13767 bytes (13 Kb)
opened URL
downloaded 13 Kb
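
For the "hundreds of web pages" case, the scheme rewrite could be sketched as below. This is only a sketch under a big assumption: that each server serves the same page over plain HTTP, which is not guaranteed in general. The names to_http and fetch_all and the page_001.html file naming are hypothetical.

```r
# Rewrite only a leading "https://"; the rest of the URL is left untouched.
to_http <- function(url) sub("^https://", "http://", url)

# Hypothetical helper: try each URL in turn and record success or failure
# instead of stopping at the first error.
fetch_all <- function(urls, destdir = tempdir()) {
  ok <- logical(length(urls))
  for (i in seq_along(urls)) {
    destfile <- file.path(destdir, sprintf("page_%03d.html", i))
    ok[i] <- tryCatch(
      # download.file() returns 0 (invisibly) on success
      download.file(to_http(urls[i]), destfile, quiet = TRUE) == 0,
      error = function(e) FALSE
    )
  }
  ok
}

urls <- c("https://stat.ethz.ch/pipermail/r-help/2008-October/thread.html")
# ok <- fetch_all(urls)   # uncomment to run; needs network access
```

Checking the returned logical vector afterwards shows which pages would need another route (for example RCurl).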

A quick forum search shows that a package called RCurl (Omegahat repository)
does support HTTPS connections, but I got an error when using it and have
no idea where the Omegahat mailing list is, which is why I'd like to know
about removing the 's' in 'https'. If it turns out there is a good reason
not to remove the 's', then I will repost on. God, I hope this post makes
sense lol.

Many thanks for your valuable time,
Tony Breyal

Ps. This is my first posting, so please be kind!  :-)
PPs. Sorry this post was so long.
PPPs. For anyone interested, this is what happens when using RCurl:

### R Code
> library(RCurl)
> txt = getURL("https://stat.ethz.ch/pipermail/r-help/2008-October/thread.html")
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify
failed
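
The error above says libcurl could not verify the server certificate. A possible workaround, sketched under assumptions: getURL() passes curl options such as cainfo and ssl.verifypeer through to libcurl, and the installed RCurl ships a CA bundle at CurlSSL/cacert.pem (worth checking with system.file(), since older versions may not). The wrapper name fetch_https is hypothetical.

```r
# Hypothetical wrapper around RCurl::getURL(); nothing is fetched until it is called.
fetch_https <- function(url, verify = TRUE) {
  if (!requireNamespace("RCurl", quietly = TRUE)) stop("RCurl is not installed")
  if (verify) {
    # Assumption: this RCurl installation ships a CA bundle at this path;
    # system.file() returns "" if it does not.
    ca <- system.file("CurlSSL", "cacert.pem", package = "RCurl")
    RCurl::getURL(url, cainfo = ca)
  } else {
    # Skips certificate verification entirely; only reasonable when nothing
    # sensitive is sent or received.
    RCurl::getURL(url, ssl.verifypeer = FALSE)
  }
}

# txt <- fetch_https("https://stat.ethz.ch/pipermail/r-help/2008-October/thread.html")
```

Pointing libcurl at a CA bundle keeps verification on, so it is probably the safer of the two branches to try first.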

OS: Windows Vista Ultimate
R version: 2.7.2 (2008-08-25)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.