[R-pkg-devel] how to prevent a small package from yielding a large installed size?

Duncan Murdoch
Mon Jun 15 15:50:27 CEST 2020


I can't install your package (I don't have an up-to-date GDAL), but 
maybe this will be of some help:

- Package dependencies aren't included, except possibly for static 
linking of C/Fortran/C++ code.  Those normally won't end up in an .rdb file.

- .rdb files are part of the lazy load mechanism.  You can read the 
corresponding .rdx file using readRDS(); it is an index that records, 
for each object, the offset and length of its serialized record in the 
.rdb file.  For example, if you have foo.rdb and foo.rdx, then this will 
tell you what's big in your foo.rdb file:

rdx <- readRDS("foo.rdx")
# each entry of rdx$variables is c(offset, length); keep the length
sizes <- sapply(rdx$variables, function(n) n[2])

Now sizes will be a named vector giving the size in bytes of each object 
stored in the .rdb file.  You should find that sum(sizes) is similar to 
the size of the .rdb file, but probably a bit smaller, because this 
count misses some objects:  the ones listed in rdx$references.
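
As a rough sketch of the same idea for an installed package, the following reads the index, sorts the objects by size, and also tallies the entries in rdx$references so the total can be compared with the size of the .rdb file on disk.  The helper name rdb_sizes() is made up for this illustration (it is not part of R), and the sketch assumes the package keeps its lazy-load files in its R/ directory under the names <pkg>.rdx and <pkg>.rdb, as the base package stats does.

```r
# Hypothetical helper (not part of R): summarise what occupies a package's
# lazy-load database. Assumes <pkg>.rdx and <pkg>.rdb live in the installed
# package's R/ directory.
rdb_sizes <- function(pkg, lib.loc = NULL) {
  rdir <- system.file("R", package = pkg, lib.loc = lib.loc)
  rdx <- readRDS(file.path(rdir, paste0(pkg, ".rdx")))
  # Each index entry is c(offset, length); the second element is the
  # number of bytes that record occupies in the .rdb file.
  bytes <- function(entries) {
    vapply(entries, function(n) as.numeric(n[2]), numeric(1))
  }
  list(
    variables  = sort(bytes(rdx$variables), decreasing = TRUE),
    references = sort(bytes(rdx$references), decreasing = TRUE),
    file_bytes = file.size(file.path(rdir, paste0(pkg, ".rdb")))
  )
}

s <- rdb_sizes("stats")                       # any installed package name
print(head(s$variables, 10))                  # the ten largest objects
print(sum(s$variables) + sum(s$references))   # bytes accounted for by the index
print(s$file_bytes)                           # size of the .rdb file on disk
```

Replacing "stats" with the name of your own installed package should show which objects dominate its .rdb file.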

Duncan Murdoch


On 15/06/2020 7:13 a.m., Daniel Kelley wrote:
> I am working on a package (https://github.com/ArgoCanada/argoFloats) that has a 412K source tarball (most of which is data; the R code is 176K), but that creates a library .rdb file of MUCH larger size, namely 7.2M.  This file causes a build NOTE, being over the threshold of 1M, and that concerns me in terms of hoped-for submission to CRAN during this summer.
> 
> My goal in writing this email is to get some advice regarding reducing the size of the .rdb file, if indeed this is a general problem and not an artifact of my (macOS) development environment.
> 
> Here's some more detail:
> 
> argoFloats depends on some other packages, and so I am wondering whether the large multiplier between R source and .rdb file is because the other sources are dragged in.  I could try moving everything to "Suggests", and use requireNamespace(), but that seems to go against recommendations, if I interpret Wickham and Bryan (https://r-pkgs.org/description.html) correctly.
> 
> A possible clue is that I get a large-file note on macOS, but not when I use rhub for test linux builds, or winbuilder for a windows build.  I do not have ready access to either linux or windows machines, to examine those builds in detail.
> 
> My thinking is that examination of the .rdb file might help me to learn about problems (e.g. if it holds code from packages I "import" from, that might motivate me to move from "import" to "suggest"). Unfortunately, I have not been able to discover a way to examine that file, which seems to be designed for internal R use.
> 
> I am attaching below my signature line the output from sessionInfo(), in case that helps.  The URL I reference in my second paragraph has my DESCRIPTION file, and I will admit that I do not fully understand its nuances.  Note that I use roxygen2 to build documentation and NAMESPACE.
> 
> Any advice would be greatly appreciated, and indeed I thank anyone who got to the bottom of this long email.
> 
> Dan E. Kelley [he/him/his 314ppm]
> Department of Oceanography
> Dalhousie University
> Halifax, NS, Canada
> 
> 
> 
> R version 4.0.1 (2020-06-06)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Catalina 10.15.6
> 
> Matrix products: default
> BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> 
> locale:
> [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] argoFloats_0.1.3
> 
> loaded via a namespace (and not attached):
>   [1] Rcpp_1.0.4.6        pillar_1.4.4        compiler_4.0.1      plyr_1.8.6          class_7.3-17
>   [6] tools_4.0.1         testthat_2.3.2      digest_0.6.25       bit_1.1-15.2        ncdf4_1.17
> [11] oce_1.2-1           memoise_1.1.0       RSQLite_2.2.0       lifecycle_0.2.0     tibble_3.0.1
> [16] gtable_0.3.0        lattice_0.20-41     gsw_1.0-6           pkgconfig_2.0.3     rlang_0.4.6
> [21] DBI_1.1.0           rstudioapi_0.11     curl_4.3            e1071_1.7-3         dplyr_1.0.0
> [26] stringr_1.4.0       raster_3.1-5        generics_0.0.2      vctrs_0.3.1         classInt_0.4-3
> [31] bit64_0.9-7         grid_4.0.1          tidyselect_1.1.0    glue_1.4.1          sf_0.9-4
> [36] R6_2.4.1            sp_1.4-2            marmap_1.0.4        adehabitatMA_0.3.14 blob_1.2.1
> [41] ggplot2_3.3.1       purrr_0.3.4         reshape2_1.4.4      magrittr_1.5        units_0.6-6
> [46] scales_1.1.1        codetools_0.2-16    ellipsis_0.3.1      shape_1.4.4         colorspace_1.4-1
> [51] KernSmooth_2.23-17  stringi_1.4.6       munsell_0.5.0       crayon_1.3.4.9000
> 
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>


