[Rd] Can gzfile be given the same method option as file
Recently my employer introduced a security system that generates SSL certificates on the fly so that it can inspect the content of https connections. To make this work, they add a new root certificate to the Windows certificate store. This causes problems in R because the default library used to download data from URLs doesn't look at this store. The "wininet" download method does look at it, so wherever that method can be used things work (albeit with a warning about future deprecation).

For functions like download.file this works great, but it fails when running readRDS:

readRDS('https://seurat.nygenome.org/azimuth/references/homologs.rds')
Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file 'https://seurat.nygenome.org/azimuth/references/homologs.rds', probable reason 'Invalid argument'

After some debugging I see that the root cause is in the gzfile function.

> gzfile('https://seurat.nygenome.org/azimuth/references/homologs.rds') -> g
> open(g, open="r")
Error in open.connection(g, open = "r") : cannot open the connection
In addition: Warning message:
In open.connection(g, open = "r") :
  cannot open compressed file 'https://seurat.nygenome.org/azimuth/references/homologs.rds', probable reason 'Invalid argument'

If this were not a compressed file, we could use file rather than gzfile and make this work by setting the url.method option:

> options("url.method"="wininet")
> file('https://seurat.nygenome.org/azimuth/references/homologs.rds') -> g
> open(g, open="r")
Warning message:
In open.connection(g, open = "r") :
  the 'wininet' method of url() is deprecated for http:// and https:// URLs

So I get a warning, but it works.

I guess this boils down to two questions:

1. Is it possible to add the same "method" argument to gzfile that file uses, so that people in my situation have a workaround?

2. Given the warnings we're getting when using wininet, are there plans to support Windows certificates in another way?

Thanks

Simon.

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Can gzfile be given the same method option as file
Thank you! This helped a lot. I had misunderstood part of the chain of functions that led to the eventual failure. I can confirm that it does indeed work if you create a url() first, and that it picks the appropriate back end as long as the url.method option is set.

For the schannel back end I have:

> libcurlVersion()
[1] "8.6.0"
attr(,"ssl_version")
[1] "(OpenSSL/3.2.1) Schannel"
attr(,"libssh_version")
[1] "libssh2/1.11.0"

However, I can't get either of the curl-related methods to work.

> download.file('https://seurat.nygenome.org/azimuth/references/homologs.rds',
+ destfile = "c:/Users/andrewss/homologs.rds", method="libcurl")
trying URL 'https://seurat.nygenome.org/azimuth/references/homologs.rds'
Error in download.file("https://seurat.nygenome.org/azimuth/references/homologs.rds", :
  cannot open URL 'https://seurat.nygenome.org/azimuth/references/homologs.rds'
In addition: Warning message:
In download.file("https://seurat.nygenome.org/azimuth/references/homologs.rds", :
  URL 'https://seurat.nygenome.org/azimuth/references/homologs.rds': status was 'SSL connect error'

> download.file('https://seurat.nygenome.org/azimuth/references/homologs.rds',
+ destfile = "c:/Users/andrewss/homologs.rds", method="curl")
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) schannel: next InitializeSecurityContext failed: CRYPT_E_NO_REVOCATION_CHECK (0x80092012) - The revocation function was unable to check revocation for the certificate.
Error in download.file("https://seurat.nygenome.org/azimuth/references/homologs.rds", :
  'curl' call had nonzero exit status

I realise that this may not be as simple as the certificate not being seen, and that the system here may not fake the revocation infrastructure either. I don't see that changing, though, and the wininet method is the only one that actually allows anything to connect.

Simon.
-----Original Message-----
From: Ivan Krylov
Sent: 12 September 2024 15:28
To: Simon Andrews via R-devel
Cc: Simon Andrews
Subject: Re: [Rd] Can gzfile be given the same method option as file

On Thu, 12 Sep 2024 12:01:54 + Simon Andrews via R-devel wrote:

> readRDS('https://seurat.nygenome.org/azimuth/references/homologs.rds')
> Error in gzfile(file, "rb") : cannot open the connection

I don't think that gzfile works with URLs. gzcon(), on the other hand, does work with url() connections, and url() accepts the 'method' argument with getOption('url.method') as its default.

h <- readRDS(url(
  'https://seurat.nygenome.org/azimuth/references/homologs.rds'
))

But that only works with gzip-compressed files. For example, CRAN's PACKAGES.rds is xz-compressed, and I don't see a way to read it the same way:

readBin(
  index <- file.path(
    contrib.url(getOption('repos')['CRAN']),
    'PACKAGES.rds'
  ),
  raw(), 5
) |> rawToChar()
# [1] "\xfd7zXZ" <-- note the "7zXZ" header
readRDS(url(index))
# Error in readRDS(url(index)) : unknown input format

> 2. Given the warnings we're getting when using wininet, are their
> plans to make windows certficates be supported in another way?

What does libcurlVersion() return for you? In theory, it should be possible to make libcurl use schannel and therefore the system certificate store for TLS verification purposes.

--
Best regards,
Ivan
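The limitation described above (url() plus gzcon() handles only gzip) can be sidestepped by downloading to a temporary file first, because readRDS() on a local path goes through gzfile(), which also detects bzip2 and xz. The following is only a sketch under that assumption: read_rds_via_download is a hypothetical helper name, not a base R function, and the method it needs will depend on the local setup.

```r
# Hypothetical helper (not part of base R): fetch a remote .rds file with
# download.file() and read it from disk, so that readRDS() can auto-detect
# gzip, bzip2 or xz compression via gzfile().
read_rds_via_download <- function(src,
                                  method = getOption("download.file.method", "auto"),
                                  ...) {
  tmp <- tempfile(fileext = ".rds")
  on.exit(unlink(tmp), add = TRUE)  # remove the temporary copy afterwards
  # mode = "wb" keeps the binary payload intact on Windows;
  # extra arguments in ... are passed on to download.file()
  status <- download.file(src, destfile = tmp, method = method,
                          mode = "wb", quiet = TRUE, ...)
  if (status != 0) stop("download failed with status ", status)
  readRDS(tmp)
}
```

In an environment like the one described in this thread, a call such as read_rds_via_download(some_url, method = "wininet") would reuse whichever download.file() method is known to work there.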
Re: [Rd] Can gzfile be given the same method option as file
On Thu, 12 Sep 2024 15:06:50 + Simon Andrews wrote:

> > > download.file('https://seurat.nygenome.org/azimuth/references/homologs.rds',
> > > destfile = "c:/Users/andrewss/homologs.rds", method="curl")
> <...>
> > curl: (35) schannel: next InitializeSecurityContext
> > failed: CRYPT_E_NO_REVOCATION_CHECK (0x80092012) - The revocation
> > function was unable to check revocation for the certificate.
>
> This extra error code is useful, thank you for trying the "curl"
> method. https://github.com/curl/curl/issues/14315 suggests a libcurl option
> and a curl command line option.
>
> Does download.file(method = 'curl', extra = '--ssl-no-revoke') work for you?

Yes! Adding that option does indeed work and generates no warnings.

> Since R-4.2.2, R understands the R_LIBCURL_SSL_REVOKE_BEST_EFFORT environment
> variable. Does it help to set it to "TRUE" (e.g. in the .Renviron file)
> before invoking download.file(method = "libcurl")?

Yes, this also works and will provide a workable solution for our environment.

> Sys.getenv("R_LIBCURL_SSL_REVOKE_BEST_EFFORT")
[1] "TRUE"
> download.file('https://seurat.nygenome.org/azimuth/references/homologs.rds',
+ destfile = "c:/Users/andrewss/homologs.rds", method="libcurl")
trying URL 'https://seurat.nygenome.org/azimuth/references/homologs.rds'
Content type 'application/octet-stream' length 3458249 bytes (3.3 MB)
downloaded 3.3 MB

Thank you so much for your help with this. I shall implement this for the rest of our organisation.

Simon.
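For the original readRDS() use case, the environment variable can be combined with the url() approach from earlier in the thread. This is a minimal sketch, not a tested recipe for this network: it assumes that setting the variable with Sys.setenv() in the running session has the same effect as the .Renviron entry confirmed above, and read_rds_url is a hypothetical helper name. It only handles gzip-compressed .rds files.

```r
# Assumption: R consults this variable when the libcurl request is set up,
# so setting it in-session behaves like the .Renviron entry (R >= 4.2.2).
Sys.setenv(R_LIBCURL_SSL_REVOKE_BEST_EFFORT = "TRUE")

# Hypothetical helper: read a gzip-compressed .rds file over a url()
# connection, decompressing on the fly with gzcon().
read_rds_url <- function(src, method = getOption("url.method", "default")) {
  con <- gzcon(url(src, open = "rb", method = method))
  on.exit(close(con), add = TRUE)  # also closes the wrapped url() connection
  readRDS(con)
}
```

With this in place, something like read_rds_url('https://seurat.nygenome.org/azimuth/references/homologs.rds', method = "libcurl") would be the call to try. If the in-session assumption does not hold, the .Renviron route from the thread remains the confirmed option.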