[Rd] CRAN package sizes

Yihui Xie xie at yihui.name
Sun Feb 13 22:02:32 CET 2011

Regarding the reasons that make the doc directory large, I wonder if
we can make some changes in R:

1. Use a null graphics device as the default device rather than pdf()
when running Sweave -- this can avoid the useless Rplots.pdf:

options(device = function(...) {
    .Call("R_GD_nullDevice", PACKAGE = "grDevices")
})

This can save some time in building the vignette(s) as well. (see

However, this undocumented null device may not work for certain
graphics. Here is an example where it fails with ggplot2:

Is it possible for someone to look into the null device (Dr Murrell?)
to make it stable enough?

2. Compress the PDF graphics and vignettes using third-party tools,
among which I recommend qpdf (it's free).

qpdf --stream-data=compress input.pdf output.pdf

This can reduce the size of PDF files a lot without quality loss. I'm
using this tool in the animation package to reduce the size of PDF
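
The qpdf command above could be applied to a whole directory roughly as
follows. This is only a sketch under some assumptions: compress_pdfs is
a made-up helper name, inst/doc is assumed to be where the package keeps
its PDFs, and a file is replaced only when qpdf exits cleanly and the
result is actually smaller.

```shell
#!/bin/sh
# Sketch: recompress every PDF in a directory with qpdf.
# compress_pdfs is a hypothetical helper, not part of qpdf or R.
compress_pdfs() {
    dir=${1:-inst/doc}   # assumed default location of the package PDFs
    command -v qpdf >/dev/null 2>&1 || {
        echo "qpdf not found" >&2
        return 1
    }
    for f in "$dir"/*.pdf; do
        [ -f "$f" ] || continue   # the glob matched nothing
        tmp="$f.tmp"
        # Same invocation as above: rewrite the file with compressed streams.
        if qpdf --stream-data=compress "$f" "$tmp"; then
            old=$(($(wc -c < "$f")))
            new=$(($(wc -c < "$tmp")))
            if [ "$new" -lt "$old" ]; then
                mv "$tmp" "$f"
                echo "$f: $old -> $new bytes"
            else
                rm -f "$tmp"      # no gain, keep the original
            fi
        else
            rm -f "$tmp"          # qpdf reported a problem, keep the original
        fi
    done
}
```

It would be called as "compress_pdfs inst/doc" from the package source
directory before running R CMD build.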

3. Sorry to bring up this issue again, but I don't understand why
Sweave cannot support the png() device alongside pdf() and
postscript(). I'm willing to provide a patch if needed.


Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Sun, Feb 13, 2011 at 6:30 AM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
> Robin Hankin's post reminded me to post about the following recent addition
> to 'Writing R Extensions', in the section on 'Submitting a package to CRAN'
>  Ensure that the package sources are not unnecessarily large. ...
>  As a general rule, doc directories should not exceed 5Mb, and
>  where data directories need to be 10Mb or more, consideration should
>  be given to a separate package containing just the data. (Similarly
>  for external data directories, large jar files and other libraries
>  that need to be installed.)
> With 2800 packages on CRAN, overall size is becoming a concern and currently
> to install all of CRAN takes 4Gb.  As the attached (I hope) graph shows, the
> 20 packages over 20Mb take a quarter, and those over 5Mb take half.  (And
> this is after we have removed 100Mb from the largest installed package by
> re-compression, and archived the second largest, so Robin's package is
> currently the largest.)  Some of the largest packages are data/jar packages,
> but there are 55 packages with 'doc' directories over 5Mb.  To put that in
> perspective, PDFs of whole books with lots of figures (MASS, Paul's R
> Graphics) are well under 5Mb.
> R CMD check in R-devel reports on large packages, and expect in future that
> submitted package sizes will be questioned more often.
> There are lots of different reasons why doc directories are large, but the
> major ones are
> - installing files that are unneeded, such as Rplots.pdf and .eps
>  figures.
> - using PDF figures of images where PNG would be more appropriate.
> - including less than relevant material (such as how to install R,
>  with screenshots!)
> There are several ways to reduce the sizes of PDFs with no loss in quality,
> e.g. Adobe Acrobat Standard/Pro.
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
