[Bioc-devel] lazyData

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Mon Jul 18 16:52:00 CEST 2016


This is a report on my testing of installation time and memory
requirements with LazyData turned on and off.  It turns out that using
LazyData dramatically increases both memory consumption and install time
for an (admittedly large) annotation package.  Perhaps this is something we
should think about wrt. annotation and data packages.

Test example is
  IlluminaHumanMethylationEPICanno.ilm10b2.hg19
an annotation package for minfi.  The .tar.gz for this package is 113
so it's not small.

I have explored using
  LazyData: yes/no in DESCRIPTION
  adding a single line data/datalist file containing the objects in the
package
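
For reference, R-exts describes data/datalist as having one line per topic
that data() will find: either just the topic name, or "topic: obj1 obj2"
when the topic provides several objects.  A minimal Python sketch of
producing such a line follows; the topic and object names are hypothetical
placeholders, not the ones in the package.

```python
def datalist_line(topic, objects):
    """Format one data/datalist line in the layout described by R-exts.

    `topic` is the name data() would find; `objects` are the R objects it
    provides. All names used with this helper are hypothetical.
    """
    if list(objects) == [topic]:
        # data(topic) provides a single object with the same name
        return topic
    return f"{topic}: {' '.join(objects)}"


# Hypothetical example: one data file that provides three objects
line = datalist_line("annoData", ["Locations", "Manifest", "Other"])
print(line)
```

The resulting line would be saved as the only line of data/datalist inside
the package source.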

What follows are timings and memory consumption of R CMD build + INSTALL on
my Mac laptop using an SSD drive.
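
One way such numbers could be collected is sketched below.  This is not
necessarily how the figures in this post were obtained; it assumes a
Unix-like system and uses Python's resource module.  Note that ru_maxrss
reports the largest single child process, not the sum over all
sub-processes, and it is cumulative over every child run so far.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run cmd and return (wall-clock seconds, peak child RSS in GB)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024
    return elapsed, peak_bytes / 1e9


# Trivial demonstration with a child Python process.  For a package one
# would instead pass e.g. ["R", "CMD", "build", "pkgdir"], where "pkgdir"
# is a hypothetical source directory.
elapsed, peak_gb = run_and_measure([sys.executable, "-c", "pass"])
print(f"{elapsed:.1f} s, {peak_gb:.2f} GB")
```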


  LazyData: yes
  datalist: no
  285 s
  3.22 GB (values as high as 3.8 GB seen)

  LazyData: no
  datalist: no
  81 s
  1.64 GB

  LazyData: no
  datalist: yes
  19 s
  0.38 GB

(The following combination is not mentioned in R-exts.  While it still uses
tons of memory, it seems to be about 1 minute faster than LazyData: yes
without a datalist; I repeated the measurement once to confirm this.)
  LazyData: yes
  datalist: yes
  226 s
  3.26 GB (values as high as 3.9 GB seen)
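
To put the four configurations side by side, the short Python sketch below
expresses each one relative to the cheapest (LazyData: no, datalist: yes).
It only uses the numbers listed above; by that measure the default
LazyData: yes build takes about 15 times the time and about 8.5 times the
memory.

```python
# (LazyData, datalist) -> (seconds, GB), copied from the measurements above
results = {
    ("yes", "no"): (285, 3.22),
    ("no", "no"): (81, 1.64),
    ("no", "yes"): (19, 0.38),
    ("yes", "yes"): (226, 3.26),
}

# Use the cheapest configuration as the baseline
base_secs, base_gb = results[("no", "yes")]

# Ratio of each configuration to the baseline: (time factor, memory factor)
ratios = {
    key: (secs / base_secs, gb / base_gb)
    for key, (secs, gb) in results.items()
}

for (lazy, datalist), (time_x, mem_x) in ratios.items():
    print(f"LazyData: {lazy:3}  datalist: {datalist:3}  "
          f"{time_x:5.1f}x time  {mem_x:4.1f}x memory")
```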

Making the data lazy-loaded is pretty nice; for one thing, it avoids
polluting the global environment.

But it seems worthwhile to consider whether some of this work could be done
before package build time.  Perhaps not, but we are certainly spending
build system resources on building and installing this package.

I started going down this route because my Travis build started being
killed due to 3+ GB of memory being used.  I really don't like turning off
LazyData because of the global environment issue, but the numbers are kind
of extreme here.

Best,
Kasper
