[R-pkg-devel] Scrapping R CRAN website from package

Fri Jul 16 12:13:50 CEST 2021

Dear Sir or Madam,

I am creating a new package `pacs` https://github.com/Polkas/pacs, which
I want to send to R CRAN shortly. However I am not sure about R
CRAN policy regarding scraping CRAN per package page with its archive.
More precisely I am fetching the data from
https://CRAN.R-project.org/package=%s and
https://cran.r-project.org/src/contrib/Archive/%s/ (downloading an old
tar.gz too).

Why I need this: I could read any DESCRIPTION files for any time point and
get a true dependency tree.  Moreover I could get a life duration of any
released package version, where shorter than 7 days are marked as risky. I
could compare a package min required dependencies difference before we
update it.  And much more.

I made a few notices like "Please as a courtesy to the R CRAN, don't
overload their server by constantly using this function." inside the
package.

Optionally If scrapping R CRAN from my package is a problem I will try to
build a separate DB with such data (updated everyday). Still any old tar.gz
has to be downloaded.

Maciej Nasinski, University of Warsaw

	[[alternative HTML version deleted]]