[R-pkg-devel] URL checks

Greg Hunt greg using firmansyah.com
Thu Jun 30 02:00:53 CEST 2022


With a little more experimenting, the 503 response from the Wiley DOI
lookup does seem to come from Cloudflare: there is a "server: cloudflare"
header.  Whether that's reliably present I have no idea, but it's a starting
point.
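
As a rough illustration of that starting point, here is a minimal Python
sketch of the idea: send a HEAD request (what curl -I does) and treat a 403
or 503 that carries a "server: cloudflare" header as a probable CDN
rejection rather than a dead link.  The function names are hypothetical,
this is not how R CMD check works, and since the header may not be reliably
present the result is only a hint.

```python
import urllib.error
import urllib.request


def head_status_and_headers(url, timeout=10):
    """Send a HEAD request (like `curl -I`) and return (status, headers)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers.items())
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error object still carries the
        # status code and the response headers.
        return err.code, dict(err.headers.items())


def looks_like_cloudflare_block(status, headers):
    """Heuristic only: 403/503 plus a 'server: cloudflare' header."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    server = normalised.get("server", "").strip().lower()
    return status in (403, 503) and server == "cloudflare"
```

Calling looks_like_cloudflare_block(*head_status_and_headers(url)) on a
flagged link would then give a first guess at whether the failure came from
the CDN.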

On Thu, 30 Jun 2022 at 09:34, Greg Hunt <greg using firmansyah.com> wrote:

> With a little experimentation, the problem seems to be the -I switch in
> curl, with which the request uses the HTTP HEAD method instead of GET.
> Without -I, the requests from curl work for me; with -I, I get negative
> responses (403, 503).
>
> HEAD does not represent a major security threat to a server, but it gets
> caught up when people disable unused or unnecessary operations and
> features, so to a first approximation the problem is -I.  There may
> also be scraper blocking applied to the CRAN and WinBuilder
> infrastructure by the CDN companies, because scraping is a large problem
> for many websites, but detecting rejection by Cloudflare may be possible if
> that is what is happening.
>
>
> On Thu, 30 Jun 2022 at 07:08, Ivan Krylov <krylov.r00t using gmail.com> wrote:
>
>> On Wed, 29 Jun 2022 22:51:23 +0200 (CEST)
>> William Becker <william.becker using bluefoxdata.eu> wrote:
>>
>> > if someone can point me to a reference where I can work out how to
>> > solve the problem, that would be really helpful
>>
>> The CRAN URL checks page is linked from the CRAN policy:
>> https://cran.r-project.org/web/packages/URL_checks.html
>>
>> Short version is, your links all seem fine, but Cloudflare and other
>> content distribution companies don't like the way R checks them. You
>> only need to mention that in the submission comment.
>>
>> There's no technical fix for the problem. DOI checks could in theory be
>> adjusted (trading false negatives for false positives), but it's hard to
>> get checks working for the rest of the links when the CDN companies
>> decide that automated requests like the ones produced by R CMD check are
>> exactly the kind of thing they should be blocking.
>>
>> --
>> Best regards,
>> Ivan
>>
>> ______________________________________________
>> R-package-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>
>
