In versions of R prior to 3.6.0 the following invocation succeeds, returning the data frame shown:

> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text",
>            header=TRUE)
   Dekade   Anzahl
1    1900 11467254
2    1910 13023370
3    1920 13434601
4    1930 13296355
5    1940 12121250
6    1950 13191131
7    1960 10587420
8    1970 10944129
9    1980 11279439
10   1990 12052652

But in version 3.6.0 it fails:

> read.table("https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text",
>            header=TRUE)
Error in file(file, "rt") :
  cannot open the connection to 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text'
In addition: Warning message:
In file(file, "rt") :
  cannot open URL 'https://www.dwds.de/r/stat?corpus=kern&cnt=tokens&date=decade&format=text': HTTP status was '403 Forbidden'

The table at this URL is generated by a query processor and the same failure happens in 3.6.0 with other queries at this website. This website does not appear to serve data via http: replacing https by http in the above gives the same results, and in 3.6.0 the error message contains the URL with http but in the warning message the URL is with https.

I have also tried a few other websites that serve (non-generated) tabular data via https (e.g. https://graphchallenge.s3.amazonaws.com/synthetic/gc3/Theory-16-25-81-Bk.tsv) and with these read.table() succeeds in 3.6.0, so the problem isn't https in general. Maybe it has to do with the page being generated rather than static? There's only one reference to https in the 3.6.0 NEWS, concerning libcurl; I can't tell if it's relevant.

In case it matters, this is with R packaged for openSUSE, and I've found the above difference between 3.5 and 3.6 on both openSUSE Leap 15.0 and openSUSE Tumbleweed.

Steve Berman

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel