On 26/08/2015 6:04 PM, Jeroen Ooms wrote:
> On Tue, Aug 25, 2015 at 10:33 PM, Martin Morgan <mtmor...@fredhutch.org>
> wrote:
>>
>> actually I don't know that it does -- it addresses the symptom but I think
>> there should be an error from libcurl on the 403 / 404 rather than from
>> read.dcf on error page...
>
> Indeed, the only correct behavior is to turn the protocol error code
> into an R exception. When the server returns a status code >= 400, it
> indicates that the request was unsuccessful and the response body does
> not contain the content the client had requested, but should instead
> be interpreted as an error message/page. Ignoring this fact and
> proceeding with parsing the body as usual is incorrect and leads to
> all kind of strange errors downstream.

Yes.  I haven't been following this long thread.  Is it only in R-devel,
or is this happening in 3.2.2 or R-patched?

If the latter, please submit a bug report.  If it is only R-devel,
please just be patient.  When R-devel becomes R-alpha next year, if the
bug still exists, please report it.

Duncan Murdoch

>
> The other download methods did this correctly, it is unclear why the
> current implementation of the "libcurl" method does not. Not only does
> it lead to hard to interpret downstream parsing errors, it also makes
> the behavior of R ambiguous as it is dependent on which download
> method is in use. It is certainly not a limitation of the libcurl
> library: the 'curl' package has alternative implementations of url()
> and download.file() which exercise the correct behavior.
>
> I can only speculate, but if the motivation is to explicitly support
> retrieval of error pages, perhaps the download.file() and url()
> functions can gain an argument 'stop_on_error' or something similar
> which give the user an option to ignore server errors. However this
> behavior should certainly not be the default. When a function or
> script contains a line like this:
>
>    download.file("https://someserver.com/mydata.csv", "mydata.csv")
>
> Then in the next line of code we must be able to expect that the file
> "mydata.csv" we have downloaded to our disk is in fact the file
> "mydata.csv" that was requested from the server. An implementation
> that instead saves an error page (likely html content) to the
> "mydata.csv" file is simply incorrect and will lead to obvious
> problems, even with a warning.
>
>
> [1] https://www.opencpu.org/posts/cran-https/
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
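
For anyone following the thread, the behaviour Jeroen describes could be sketched roughly as below. This is only an illustration under stated assumptions, not how R's "libcurl" method is or will be implemented: stop_for_http_status() and download_or_stop() are hypothetical names, 'stop_on_error' is the argument floated in the quoted message rather than an existing one, and the default fetcher assumes the 'curl' package is installed (curl_fetch_memory() returns the status code and body without raising on HTTP errors).

```r
# Hypothetical helper (not part of base R): signal an R error for any
# HTTP status code >= 400 instead of passing the body downstream.
stop_for_http_status <- function(status, url) {
  if (status >= 400L) {
    stop(sprintf("cannot download '%s': HTTP status %d", url, status),
         call. = FALSE)
  }
  invisible(status)
}

# Sketch of a download wrapper. `fetch` must return a list with
# `status_code` and `content` (a raw vector). The default assumes the
# 'curl' package; `stop_on_error` mirrors the argument proposed in the
# thread and is an assumption, not an existing download.file() argument.
download_or_stop <- function(url, destfile, stop_on_error = TRUE,
                             fetch = curl::curl_fetch_memory) {
  res <- fetch(url)
  # Check the status before touching destfile, so an error page is
  # never saved under the name of the requested file.
  if (stop_on_error) stop_for_http_status(res$status_code, url)
  writeBin(res$content, destfile)
  invisible(destfile)
}
```

With this shape, a call such as download_or_stop("https://someserver.com/mydata.csv", "mydata.csv") either leaves the requested file on disk or fails with an R error, and saving an error page requires an explicit stop_on_error = FALSE.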