[Rd] Best way to handle dependency on non-CRAN package / large data package?

Thu Mar 12 16:41:52 CET 2015

I have just written a package called choroplethrZip
<https://github.com/arilamstein/choroplethrZip> which contains a shapefile
and metadata on US ZIP codes. It is currently hosted on GitHub, has a
tagged version number (v1.0.0) and passes R CMD check as verified by
Travis. My plan is to use this in the next version of my package choroplethr
<https://github.com/arilamstein/choroplethr>.

This is exactly what I have done in the past with other map/data packages
(notably choroplethrMaps <https://github.com/arilamstein/choroplethrMaps>
and choroplethrAdmin1 <https://github.com/arilamstein/choroplethrAdmin1>),
and it is the architecture that CRAN requested: put the large data in a
separate package, list that package in the 'Suggests' field of DESCRIPTION,
and put code like this where appropriate:

if (!requireNamespace("choroplethrAdmin1", quietly = TRUE)) {
  stop("Package choroplethrAdmin1 is needed for this function to work. ",
       "Please install it.", call. = FALSE)
}

The problem I now face is that choroplethrZip is too large to be hosted on
CRAN (~75MB), and I am unclear on the best way to manage this dependency.
Presumably I could just change the above message to say

Please install choroplethrZip by typing:
    library(devtools)
    install_github('arilamstein/choroplethrZip@v1.0.0')

But I don't know if this is the best way to do this, or if there is
anything else to consider. I have never had to manage package dependencies
outside of CRAN, and have always thought of CRAN as a "closed ecosystem"
in which packages have no dependencies outside of CRAN.
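
To make the idea concrete, here is a rough sketch of what I have in mind.
The helper name check_zip_package and its arguments are made up for
illustration; nothing like this exists in choroplethr yet, and I am
assuming the GitHub tag can be passed to install_github as 'user/repo@tag':

```r
# Hypothetical helper: the name and arguments are illustrative only.
# It stops with installation instructions when the suggested package
# is not installed, and returns TRUE invisibly when it is.
check_zip_package <- function(pkg = "choroplethrZip",
                              repo = "arilamstein/choroplethrZip@v1.0.0") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package ", pkg, " is needed for this function to work. ",
         "Please install it by typing:\n",
         "    library(devtools)\n",
         "    install_github('", repo, "')",
         call. = FALSE)
  }
  invisible(TRUE)
}
```

Any function in choroplethr that needs the ZIP data would then call
check_zip_package() first, so that the package itself still loads and
works without choroplethrZip installed.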

Can anyone provide guidance on this?

Thanks.

Ari
