[R-pkg-devel] CRAN rules re. web scraping?

Thu Jan 23 02:48:43 CET 2020

Hello, All:

GOOD NEWS AND BAD NEWS:

       * First the good news:  I heard from Brian Ripley;  see below.  
His web site says, "He retired in August 2014 on grounds of ill health." 
(http://www.stats.ox.ac.uk/~ripley/)  I was pleased to see that he seems 
to be well enough to send me the email below.

       * BAD NEWS:  My Ecfun package is violating current CRAN rules 
regarding "not writing anywhere in the file space".  (See below.)

QUESTION:

       How do you suggest I respond to this?

       It's hard for me to fix, because I cannot replicate the error and 
I don't understand the rules Prof. Ripley is trying to enforce. The 
"CRAN Package Check Results for" this package show an error on 1 
platform (r-devel-linux-x86_64-fedora-gcc), NOTEs on 3 platforms 
(Fedora-clang and Debian), and "OK" on 9 others.  I can program selected 
tests not to run on CRAN, e.g., by wrapping them in if(!fda::CRAN()){...}.
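
       To make that concrete, here is a minimal sketch of the kind of guard I have in mind. The helper names on_cran() and run_off_cran() are hypothetical (they are not in Ecfun), and the fallback of assuming "on CRAN" when fda is not installed is my own assumption, chosen to err on the side of skipping network code.

```r
# Decide whether we appear to be running under CRAN's checks.
# Uses fda::CRAN() when the fda package is installed; otherwise
# conservatively assumes we are on CRAN (so network code is skipped).
on_cran <- function() {
  if (requireNamespace("fda", quietly = TRUE)) {
    isTRUE(fda::CRAN())
  } else {
    TRUE
  }
}

# Hypothetical wrapper: call f() only when not on CRAN.
run_off_cran <- function(f, cran = on_cran()) {
  if (cran) {
    message("Skipping: this looks like a CRAN check")
    return(invisible(NULL))
  }
  f()
}

# The 'cran' argument can be set explicitly to exercise both branches.
run_off_cran(function() "network test would run here", cran = FALSE)
run_off_cran(function() "network test would run here", cran = TRUE)
```

This only hides the failing code from CRAN; it does not address where the file is written or what happens when a site is unreachable.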

       However, I suspect I should be able to do better than that.
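
       My guess at what "better" might look like, as a sketch only: write the results file under tempdir() instead of the working directory, and check that the file exists and has the expected layout before reading it. The function read_results_safely(), the column names, and the two data rows are made up for illustration; they are not the actual testURLs() output or the Ecfun API.

```r
# Write results under tempdir() rather than the working directory.
out_file <- file.path(tempdir(), "testURLresults.csv")

# Stand-in for the network step: two made-up rows
# (illustrative data, not real output from testURLs).
results <- data.frame(time = c(1, 2),
                      site = c("PVI", "house"),
                      ok   = c(TRUE, FALSE))
write.csv(results, out_file, row.names = FALSE)

# Hypothetical reader: returns NULL with a message, instead of
# stopping with an error, if the file is missing, cannot be parsed,
# or does not have the expected column names.
read_results_safely <- function(file,
                                expected = c("time", "site", "ok")) {
  if (!file.exists(file)) {
    message("No results file at ", file)
    return(NULL)
  }
  dat <- tryCatch(read.csv(file), error = function(e) {
    message("Could not parse ", file, ": ", conditionMessage(e))
    NULL
  })
  if (!is.null(dat) && !identical(names(dat), expected)) {
    message("Unexpected columns in ", file)
    return(NULL)
  }
  dat
}

dat <- read_results_safely(out_file)
unlink(out_file)  # clean up the temporary file
```

With something like this, an unreachable site or a malformed file would produce a message rather than an error in the examples.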

       Suggestions?

       Thanks,
       Spencer Graves

p.s.  The development version of this package is available at 
"https://github.com/sbgraves237/Ecfun".

https://cloud.r-project.org/web/checks/check_results_Ecfun.html

-------- Forwarded Message --------
Subject: 	CRAN package Ecfun
Date: 	Tue, 21 Jan 2020 21:26:02 +0000
From: 	Prof Brian Ripley <ripley using stats.ox.ac.uk>
Reply-To: 	CRAN <CRAN using r-project.org>
To: 	Spencer Graves <spencer.graves using effectivedefense.org>
CC: 	CRAN <CRAN using r-project.org>

This has been intermittently failing its checks for a week: different 
check runs failed (in the 24h prior to) the 14th, 15th, 17th and today. 
The current failure is

Check: examples
Result: ERROR
Running examples in ‘Ecfun-Ex.R’ failed
The error most likely occurred in:

 > ### Name: read.testURLs
 > ### Title: Read a file produced by testURLs
 > ### Aliases: read.testURLs
 > ### Keywords: IO
 >
 > ### ** Examples
 >
 > # Test only 2 web sites, not the default 4,
 > # and test only twice, not the default 10 times:
 > tst <- testURLs(c(
+ PVI="http://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index",
+ house="http://house.gov/representatives"),
+ n=2, maxFail=2)
1
1579634784, PVI, TRUE 0.828
1579634785, house, FALSE 0.051
1579634785, house, FALSE 0.048
2
1579634785, PVI, TRUE 0.043
1579634785, house, FALSE 0.11
1579634785, house, FALSE 0.035
 >
 > # The above should have created a file 'testURLresults.csv'
 > # in the working directory. Read it.
 >
 > dat <- read.testURLs()
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  more columns than column names
Calls: read.testURLs -> read.csv -> read.table

That does not conform to the policy on Internet access, not least as no 
attempt is made to check if the file was created, let alone that it has 
the expected layout. Nor does it conform to the policy on not writing 
anywhere in the file space (and that shows on its CRAN results page too).

Please correct ASAP and before Feb 4 to safely retain the package on CRAN.

-- 
Brian D. Ripley,                  ripley using stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
