Sorry, Martin, but I've NOT commented on this matter, unless someone has been impersonating me. Someone else?
JN

On 2021-01-11 4:51 a.m., Martin Maechler wrote:
>>>>>> Viechtbauer, Wolfgang (SP)
>>>>>>     on Fri, 8 Jan 2021 13:50:14 +0000 writes:
>
> > Instead of a separate file to store such a list, would it be an idea to
> > add versions of the \href{}{} and \url{} markup commands that are skipped by
> > the URL checks?
> > Best,
> > Wolfgang
>
> I think John Nash and you misunderstood -- or then I
> misunderstood -- the original proposal:
>
> I've been understanding that there should be a "central repository" of URL
> exceptions that is maintained by volunteers.
>
> And rather *not* that package authors should get ways to skip
> URL checking..
>
> Martin
>
>
> >> -----Original Message-----
> >> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Spencer Graves
> >> Sent: Friday, 08 January, 2021 13:04
> >> To: r-devel@r-project.org
> >> Subject: Re: [Rd] URL checks
> >>
> >> I also would be pleased to be allowed to provide "a list of known
> >> false-positive/exceptions" to the URL tests. I've been challenged
> >> multiple times regarding URLs that worked fine when I checked them. We
> >> should not be required to do a partial lobotomy to pass R CMD check ;-)
> >>
> >> Spencer Graves
> >>
> >> On 2021-01-07 09:53, Hugo Gruson wrote:
> >>>
> >>> I encountered the same issue today with https://astrostatistics.psu.edu/.
> >>>
> >>> This is a trust chain issue, as explained here:
> >>> https://whatsmychaincert.com/?astrostatistics.psu.edu.
> >>>
> >>> I've worked for a couple of years on a project to increase HTTPS
> >>> adoption on the web and we noticed that this type of error is very
> >>> common, and that website maintainers are often unresponsive to requests
> >>> to fix this issue.
> >>>
> >>> Therefore, I totally agree with Kirill that a list of known
> >>> false-positive/exceptions would be a great addition to save time to both
> >>> the CRAN team and package developers.
> >>>
> >>> Hugo
> >>>
> >>> On 07/01/2021 15:45, Kirill Müller via R-devel wrote:
> >>>> One other failure mode: SSL certificates trusted by browsers that are
> >>>> not installed on the check machine, e.g. the "GEANT Vereniging"
> >>>> certificate from https://relational.fit.cvut.cz/ .
> >>>>
> >>>> K
> >>>>
> >>>> On 07.01.21 12:14, Kirill Müller via R-devel wrote:
> >>>>> Hi
> >>>>>
> >>>>> The URL checks in R CMD check test all links in the README and
> >>>>> vignettes for broken or redirected links. In many cases this improves
> >>>>> documentation, I see problems with this approach which I have
> >>>>> detailed below.
> >>>>>
> >>>>> I'm writing to this mailing list because I think the change needs to
> >>>>> happen in R's check routines. I propose to introduce an "allow-list"
> >>>>> for URLs, to reduce the burden on both CRAN and package maintainers.
> >>>>>
> >>>>> Comments are greatly appreciated.
> >>>>>
> >>>>> Best regards
> >>>>>
> >>>>> Kirill
> >>>>>
> >>>>> # Problems with the detection of broken/redirected URLs
> >>>>>
> >>>>> ## 301 should often be 307, how to change?
> >>>>>
> >>>>> Many web sites use a 301 redirection code that probably should be a
> >>>>> 307. For example, https://www.oracle.com and https://www.oracle.com/
> >>>>> both redirect to https://www.oracle.com/index.html with a 301. I
> >>>>> suspect the company still wants oracle.com to be recognized as the
> >>>>> primary entry point of their web presence (to reserve the right to
> >>>>> move the redirection to a different location later), I haven't
> >>>>> checked with their PR department though. If that's true, the redirect
> >>>>> probably should be a 307, which should be fixed by their IT
> >>>>> department which I haven't contacted yet either.
> >>>>>
> >>>>> $ curl -i https://www.oracle.com
> >>>>> HTTP/2 301
> >>>>> server: AkamaiGHost
> >>>>> content-length: 0
> >>>>> location: https://www.oracle.com/index.html
> >>>>> ...
> >>>>>
> >>>>> ## User agent detection
> >>>>>
> >>>>> twitter.com responds with a 400 error for requests without a user
> >>>>> agent string hinting at an accepted browser.
> >>>>>
> >>>>> $ curl -i https://twitter.com/
> >>>>> HTTP/2 400
> >>>>> ...
> >>>>> <body>...<p>Please switch to a supported browser...</p>...</body>
> >>>>>
> >>>>> $ curl -s -i https://twitter.com/ -A "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" | head -n 1
> >>>>> HTTP/2 200
> >>>>>
> >>>>> # Impact
> >>>>>
> >>>>> While the latter problem *could* be fixed by supplying a browser-like
> >>>>> user agent string, the former problem is virtually unfixable -- so
> >>>>> many web sites should use 307 instead of 301 but don't. The above
> >>>>> list is also incomplete -- think of unreliable links, HTTP links,
> >>>>> other failure modes...
> >>>>>
> >>>>> This affects me as a package maintainer, I have the choice to either
> >>>>> change the links to incorrect versions, or remove them altogether.
> >>>>>
> >>>>> I can also choose to explain each broken link to CRAN, this subjects
> >>>>> the team to undue burden I think. Submitting a package with NOTEs
> >>>>> delays the release for a package which I must release very soon to
> >>>>> avoid having it pulled from CRAN, I'd rather not risk that -- hence I
> >>>>> need to remove the link and put it back later.
> >>>>>
> >>>>> I'm aware of https://github.com/r-lib/urlchecker, this alleviates the
> >>>>> problem but ultimately doesn't solve it.
> >>>>>
> >>>>> # Proposed solution
> >>>>>
> >>>>> ## Allow-list
> >>>>>
> >>>>> A file inst/URL that lists all URLs where failures are allowed --
> >>>>> possibly with a list of the HTTP codes accepted for that link.
> >>>>>
> >>>>> Example:
> >>>>>
> >>>>> https://oracle.com/ 301
> >>>>> https://twitter.com/drob/status/1224851726068527106 400
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
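
For illustration only, here is a rough sketch of how a checker might read and consult the allow-list format from Kirill's proposal (one URL per line, optionally followed by the HTTP codes accepted for it). It is written in Python rather than R and is not how R CMD check works. The function names parse_allow_list and is_tolerated are hypothetical, and so is the rule that lines starting with "#" are comments. A URL listed without codes is treated as "any failure allowed", which is one reading of the proposal's wording.

```python
from typing import Dict, Set


def parse_allow_list(text: str) -> Dict[str, Set[int]]:
    """Map each listed URL to the set of HTTP status codes tolerated for it.

    An empty set means the URL was listed without codes, which this sketch
    interprets as "any failure is tolerated for this URL".
    """
    allowed: Dict[str, Set[int]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and "#" comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        url, *codes = line.split()
        allowed.setdefault(url, set()).update(int(code) for code in codes)
    return allowed


def is_tolerated(allowed: Dict[str, Set[int]], url: str, status: int) -> bool:
    """Return True if a check result `status` for `url` is on the allow-list."""
    if url not in allowed:
        return False
    codes = allowed[url]
    return not codes or status in codes


# The two example entries from the proposal.
EXAMPLE = """\
https://oracle.com/ 301
https://twitter.com/drob/status/1224851726068527106 400
"""

allow = parse_allow_list(EXAMPLE)
print(is_tolerated(allow, "https://oracle.com/", 301))
print(is_tolerated(allow, "https://oracle.com/", 404))
```

Matching here is by exact URL string, so https://oracle.com and https://oracle.com/ would count as different entries. Whether to normalise URLs before comparing is a design choice the proposal leaves open.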