[R-pkg-devel] CRAN rules re. web scraping?

Iñaki Ucar <iucar using fedoraproject.org>
Thu Jan 23 08:55:12 CET 2020

On Thu, 23 Jan 2020 at 02:49, Spencer Graves
<spencer.graves using effectivedefense.org> wrote:
> Hello, All:
>        * First the good news:  I heard from Brian Ripley;  see below.
> His web site says, "He retired in August 2014 on grounds of ill health."
> (http://www.stats.ox.ac.uk/~ripley/)  I was pleased to see that he seems
> to be well enough to send me the email below.
>        * BAD NEWS:  My Ecfun package is violating current CRAN rules
> regarding "not writing anywhere in the file space".  (See below.)
>        How do you suggest I respond to this?
>        It's hard for me to fix, because I cannot replicate the error and
> I don't understand the rules Prof. Ripley is trying to enforce. The
> "CRAN Package Check Results for" this package show an error on 1
> platform (r-devel-linux-x86_64-fedora-gcc), NOTEs on 3 platforms
> (Fedora-clang and Debian), and "OK" on 9 others.  I can program selected
> tests not to run on CRAN, e.g., with (!fda::CRAN()).
>        However, I suspect I should be able to do better than that.
>        Suggestions?

The message from Prof. Ripley is crystal-clear and exposes two issues
(Internet access and writing files) that have been discussed many times
on this list. A quick scan of the CRAN policy [1] yields:

- Packages which use Internet resources should fail gracefully with an
informative message if the resource is not available (and not give a
check warning nor error).

- Packages should not write in the user’s home filespace (including
clipboards), nor anywhere else on the file system apart from the R
session’s temporary directory.

[1] https://cran.r-project.org/web/packages/policies.html
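
To make the two points above concrete, here is a minimal sketch of how
they could be handled in package code. All three helper names
(read_url_gracefully, write_results, read_results) and the column names
are hypothetical and are not taken from Ecfun; this is only one possible
approach, not a statement of what CRAN requires in detail.

```r
# Hypothetical helper: try to read a URL (or file path) and return NULL
# with an informative message instead of an error if it is unavailable.
read_url_gracefully <- function(url) {
  tryCatch(
    suppressWarnings(readLines(url, warn = FALSE)),
    error = function(e) {
      message("Resource not available: ", url,
              " (", conditionMessage(e), ")")
      NULL
    }
  )
}

# Hypothetical helper: by default, write results to a file inside the
# R session's temporary directory and return its path invisibly.
write_results <- function(results,
                          file = tempfile("testURLresults",
                                          fileext = ".csv")) {
  utils::write.csv(results, file, row.names = FALSE)
  invisible(file)
}

# Hypothetical helper: read the results back only if the file exists
# and has the expected layout; otherwise return NULL with a message.
# The column names in 'expected' are assumptions for illustration.
read_results <- function(file,
                         expected = c("time", "site", "ok", "elapsed")) {
  if (!file.exists(file)) {
    message("No results file found at ", file)
    return(NULL)
  }
  dat <- tryCatch(utils::read.csv(file), error = function(e) NULL)
  if (is.null(dat) || !identical(names(dat), expected)) {
    message("Results file does not have the expected layout: ", file)
    return(NULL)
  }
  dat
}
```

In an example or test, the caller would then check for NULL before going
on, so that an unreachable site or a missing file produces a message
rather than a check error, and nothing is written outside tempdir()
unless the user explicitly passes another path.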


>        Thanks,
>        Spencer Graves
> p.s.  The development version of this package is available at
> "https://github.com/sbgraves237/Ecfun".
> https://cloud.r-project.org/web/checks/check_results_Ecfun.html
> -------- Forwarded Message --------
> Subject:        CRAN package Ecfun
> Date:   Tue, 21 Jan 2020 21:26:02 +0000
> From:   Prof Brian Ripley <ripley using stats.ox.ac.uk>
> Reply-To:       CRAN <CRAN using r-project.org>
> To:     Spencer Graves <spencer.graves using effectivedefense.org>
> CC:     CRAN <CRAN using r-project.org>
> This has been intermittently failing its checks for a week: different
> check runs failed (in the 24h prior to) the 14th, 15th, 17th and today.
> The current failure is
> Check: examples
> Result: ERROR
> Running examples in ‘Ecfun-Ex.R’ failed
> The error most likely occurred in:
>  > ### Name: read.testURLs
>  > ### Title: Read a file produced by testURLs
>  > ### Aliases: read.testURLs
>  > ### Keywords: IO
>  >
>  > ### ** Examples
>  >
>  > # Test only 2 web sites, not the default 4,
>  > # and test only twice, not the default 10 times:
>  > tst <- testURLs(c(
> + PVI="http://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index",
> + house="http://house.gov/representatives"),
> + n=2, maxFail=2)
> 1
> 1579634784, PVI, TRUE 0.828
> 1579634785, house, FALSE 0.051
> 1579634785, house, FALSE 0.048
> 2
> 1579634785, PVI, TRUE 0.043
> 1579634785, house, FALSE 0.11
> 1579634785, house, FALSE 0.035
>  >
>  > # The above should have created a file 'testURLresults.csv'
>  > # in the working directory. Read it.
>  >
>  > dat <- read.testURLs()
> Error in read.table(file = file, header = header, sep = sep, quote =
> quote, :
> more columns than column names
> Calls: read.testURLs -> read.csv -> read.table
> That does not conform to the policy on Internet access, not least as no
> attempt is made to check if the file was created, let alone that it has
> the expected layout. Nor does it conform to the policy on not writing
> anywhere in the file space (and that shows on its CRAN results page too).
> Please correct ASAP and before Feb 4 to safely retain the package on CRAN.
> --
> Brian D. Ripley,                  ripley using stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford

Iñaki Úcar
