>>>>> Gábor Csárdi >>>>> on Wed, 22 Jan 2020 22:56:17 +0000 writes:
> Hi All,

> I think there is a memory error in the libcurl connection code that
> typically happens when libcurl reads big chunks of data. This
> potentially affects all code that uses url() with the libcurl download
> method, which is the default in most builds. In practice it tends to
> happen more with HTTP/2 and if the connection is wrapped into a
> gzcon(). macOS Catalina has a libcurl build with HTTP/2 error, so many
> users that upgraded macOS are starting to see this.

> The workaround is to avoid using url(), if you can. If you need an
> HTTP stream, you can use curl::curl(), which is a drop-in replacement.

> To reproduce, the easiest is a libcurl build that has HTTP/2 support
> and a server with HTTP/2 as well, e.g. the cloud mirror:

> ------------------------------------------------
> ~ # R --slave -e 'options(internet.info = 0); foo <-
> readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))'
> * Trying 13.33.54.118:443...
> * TCP_NODELAY set
> * Connected to cran.rstudio.com (13.33.54.118) port 443 (#0)
> * ALPN, offering h2
> * ALPN, offering http/1.1
> * successfully set certificate verify locations:
> * CAfile: /etc/ssl/certs/ca-certificates.crt
>   CApath: none
> * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
> * ALPN, server accepted to use h2
> * Server certificate:
> * subject: CN=cran.rstudio.com
> * start date: Jul 24 00:00:00 2019 GMT
> * expire date: Aug 24 12:00:00 2020 GMT
> * subjectAltName: host "cran.rstudio.com" matched cert's "cran.rstudio.com"
> * issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
> * SSL certificate verify ok.
> * Using HTTP2, server supports multi-use
> * Connection state changed (HTTP/2 confirmed)
> * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
> * Using Stream ID: 1 (easy handle 0x56303c2910e0)
>> GET /src/contrib/Meta/archive.rds HTTP/2
> Host: cran.rstudio.com
> User-Agent: R (3.4.4 x86_64-pc-linux-gnu x86_64 linux-gnu)
> Accept: */*
> * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
> < HTTP/2 200
> < content-length: 2483432
> < date: Wed, 22 Jan 2020 21:22:04 GMT
> < server: Apache/2.4.39 (Unix)
> < last-modified: Wed, 22 Jan 2020 17:10:22 GMT
> < etag: "25e4e8-59cbd998a0360"
> < accept-ranges: bytes
> < cache-control: max-age=1800
> < expires: Wed, 22 Jan 2020 21:52:04 GMT
> < x-cache: Hit from cloudfront
> < via: 1.1 6cbe48f9f9ff0c768f29d83804f75d4c.cloudfront.net (CloudFront)
> < x-amz-cf-pop: MAN50-C1
> < x-amz-cf-id: WwCQVQz9g8ZP6Az4m4n__h7aUW6vwlg0-AkiCv_DnVfGe10bzaFtfg==
> < age: 960
> <
> * 85 data bytes written
> Error in readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))
> :
> reference index out of range
> * stopped the pause stream!
> * Connection #0 to host cran.rstudio.com left intact
> Execution halted
> ------------------------------------------------

> Sometimes you get a crash, sometimes a corrupt stream, etc. Sometimes
> it actually works.

> It seems that the fix is simply this:

> ------------------------------------
> --- src/modules/internet/libcurl.c~
> +++ src/modules/internet/libcurl.c
> @@ -762,6 +762,7 @@
>          void *newbuf = realloc(ctxt->buf, newbufsize);
>          if (!newbuf) error("Failure in re-allocation in rcvData");
>          ctxt->buf = newbuf; ctxt->bufsize = newbufsize;
> +        ctxt->current = ctxt->buf;
>      }
>      memcpy(ctxt->buf + ctxt->filled, ptr, add);
> ------------------------------------

> Best,
> Gabor

Thanks a lot, Gábor!

I can reproduce the problem (on Linux Fedora 30) and confirm that
your patch works.
Even more, the patch looks "almost obvious", because
ctxt->current = ctxt->buf happens earlier in rcvData() after a
change to ctxt->buf and so should be updated if buf is.

An even slightly "better" patch just moves that statement down to
after the if(add) { .. } clause.

I'll patch the sources, and will port to 'R 3.6.2 patched'.

Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
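
For readers following the thread, here is a minimal, self-contained sketch of the failure mode discussed above: a read pointer derived from a buffer goes stale once realloc() moves that buffer. This is not the code in src/modules/internet/libcurl.c. The type RecvCtx and the functions recv_init(), recv_append() and recv_consume() are hypothetical names made up for illustration; only the field names buf, bufsize, filled and current are taken from the patch. It follows the second variant mentioned in the reply, assigning current once after the if (add) block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the connection context. Only the field
   names (buf, bufsize, filled, current) come from the patch. */
typedef struct {
    char  *buf;      /* start of the allocation */
    char  *current;  /* next unread byte; must always point into buf */
    size_t bufsize;  /* allocated size in bytes */
    size_t filled;   /* number of unread bytes */
} RecvCtx;

/* Allocate an initial buffer; size must be > 0. Returns 0 on success. */
static int recv_init(RecvCtx *ctxt, size_t size)
{
    ctxt->buf = malloc(size);
    if (!ctxt->buf) return -1;
    ctxt->current = ctxt->buf;
    ctxt->bufsize = size;
    ctxt->filled = 0;
    return 0;
}

/* Append `add` bytes from `ptr`. Returns 0 on success, -1 if the
   buffer could not be grown. */
static int recv_append(RecvCtx *ctxt, const char *ptr, size_t add)
{
    /* Move any unread bytes to the front of the buffer. */
    if (ctxt->filled)
        memmove(ctxt->buf, ctxt->current, ctxt->filled);

    if (add) {
        if (ctxt->filled + add > ctxt->bufsize) {
            size_t newbufsize = 2 * ctxt->bufsize;
            while (newbufsize < ctxt->filled + add) newbufsize *= 2;
            void *newbuf = realloc(ctxt->buf, newbufsize);
            if (!newbuf) return -1;
            /* realloc may return a different address: any pointer
               derived from the old buf is invalid from here on. */
            ctxt->buf = newbuf;
            ctxt->bufsize = newbufsize;
        }
        memcpy(ctxt->buf + ctxt->filled, ptr, add);
        ctxt->filled += add;
    }

    /* Re-derive the read pointer only after buf has its final value.
       Doing this before the realloc is the bug pattern under
       discussion. */
    ctxt->current = ctxt->buf;
    return 0;
}

/* Copy up to n unread bytes into out; returns the number copied. */
static size_t recv_consume(RecvCtx *ctxt, char *out, size_t n)
{
    if (n > ctxt->filled) n = ctxt->filled;
    memcpy(out, ctxt->current, n);
    ctxt->current += n;
    ctxt->filled -= n;
    return n;
}
```

A caller would initialise a RecvCtx with recv_init(), feed incoming chunks through recv_append(), and drain them with recv_consume(). If the final assignment to current were moved above the if (add) block, a chunk large enough to trigger realloc() would leave current pointing into freed memory, which would be consistent with the intermittent crashes and corrupt streams reported above.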