[Bioc-devel] lazyData (Kasper Daniel Hansen)

Alex Pickering alexvpickering at gmail.com
Sat Jul 30 18:44:29 CEST 2016


I was seeing similarly long build times with LazyData: TRUE, so I set it to
FALSE.

You can specify the environment into which data is loaded (it doesn't have to
be the global environment). For example, to load the cmap_es data from the
ccdata package, call this from within a function:

utils::data("cmap_es", package = "ccdata", envir = environment())

It will get loaded into the function's environment (not global). You may
also need to add `cmap_es = NULL` before loading it; otherwise R CMD check
will report a note along the lines of "no visible binding for global
variable" for cmap_es, because it cannot see where the variable is defined.
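
Here is a minimal sketch of that pattern, in case it helps. Since not
everyone has ccdata installed, it uses mtcars from the base datasets package
as a stand-in for cmap_es; the function name mean_mpg is made up for
illustration.

```r
# Hypothetical helper: loads a dataset into the function's own environment.
# "mtcars" (package "datasets") stands in for cmap_es (package "ccdata").
mean_mpg <- function() {
  # Declare the name first so R CMD check sees a local binding.
  mtcars <- NULL
  # envir = environment() targets this function call's environment,
  # so nothing is written to the global environment.
  utils::data("mtcars", package = "datasets", envir = environment())
  mean(mtcars$mpg)
}

result <- mean_mpg()

# Check that the dataset was not copied into the global environment.
in_global <- exists("mtcars", envir = globalenv(), inherits = FALSE)
```

After the call, in_global should be FALSE as long as you have not created
your own mtcars object in the workspace beforehand.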

On Sat, Jul 30, 2016 at 3:00 AM, <bioc-devel-request at r-project.org> wrote:

> Message: 1
> Date: Fri, 29 Jul 2016 16:26:45 -0400
> From: Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
> To: Martin Morgan <martin.morgan at roswellpark.org>
> Cc: "bioc-devel at r-project.org" <bioc-devel at r-project.org>
> Subject: Re: [Bioc-devel] lazyData
> Message-ID:
>         <CAC2h7ut9B9UFfAxrD1OiEXgV-AkWMd=
> SsiTRCJqWf7zfwBgSZg at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> With LazyData true you indeed don't load the data until it is needed.
> My guess, from skimming the code extremely fast, is that the extreme
> requirements (memory and time) during installation arise because the data
> objects need to get loaded and somehow modified for this to happen.
>
> Re. the global environment: if my package has an object TEST, and LazyData
> is TRUE, when I do (say)
>   data(TEST)
> or use TEST somehow, TEST doesn't exist in the Global environment.  But if
> LazyData is FALSE and I do data(TEST), TEST gets copied into the Global
> environment, which is kind of irritating when it is annotation data because
> it seems fragile to me (perhaps it is not).
>
> Best,
> Kasper
>
> On Fri, Jul 29, 2016 at 3:38 PM, Martin Morgan <
> martin.morgan at roswellpark.org> wrote:
>
> > On 07/18/2016 10:52 AM, Kasper Daniel Hansen wrote:
> >
> >> This is a report on my testing with lazyData turned on and off wrt.
> >> installation time and memory requirements.  It turns out that using
> >> lazyData dramatically increases memory consumption and time for a
> >> (admittedly large) annotation package.  Perhaps this is something we
> >> should think about wrt. annotation and data packages.
> >>
> >> Test example is
> >>   IlluminaHumanMethylationEPICanno.ilm10b2.hg19
> >> an annotation package for minfi.  The .tar.gz for this package is 113
> >> so it's not small.
> >>
> >> I have explored using
> >>   LazyData: yes/no in DESCRIPTION
> >>   adding a single-line data/datalist file listing the objects in the
> >>   package
> >>
> >> What follows are timings and memory consumption of R CMD build + INSTALL
> >> on my Mac laptop using an SSD drive.
> >>
> >>
> >>   LazyData: yes
> >>   datalist: no
> >>   285 s
> >>   3.22 GB (values as high as 3.8 GB seen)
> >>
> >>   LazyData: no
> >>   datalist: no
> >>   81 s
> >>   1.64 GB
> >>
> >>   LazyData: no
> >>   datalist: yes
> >>   19 s
> >>   0.38 GB
> >>
> >
> > Hi Kasper -- I have to admit my ignorance on the miracle of lazy data.
> > Can you clarify what one gains from LazyData? I kind of thought that with
> > LazyData: true the data was only loaded when needed, but that doesn't
> > seem consistent with the picture you paint above? Also, what's the
> > discussion about global variables?
> >
> > Martin
> >
> >
> >> (The following combination is not mentioned by R-exts, and while it
> >> still uses tons of memory, it seems to be 1 minute faster; I redid the
> >> measurement once to confirm this.)
> >>   LazyData: yes
> >>   datalist: yes
> >>   226 s
> >>   3.26 GB (values as high as 3.9GB seen)
> >>
> >> Making the data lazy-loaded is pretty nice; one thing is it avoids
> >> polluting the global environment.
> >>
> >> But it seems that it would be worthwhile to consider if some of this
> >> could be done prior to the package build time.  Perhaps not, but for sure
> >> we are spending resources on the building and installing of this by the
> >> build system.
> >>
> >> I started going down this route because my Travis build started being
> >> killed due to 3+GB being used. I really don't like turning off LazyLoad
> >> because of the global environment issue, but the numbers are kind of
> >> extreme here.
> >>
> >> Best,
> >> Kasper
>
