[R] .Rdata files -- fortune?

Jan Kim jttkim at googlemail.com
Thu Mar 12 18:22:20 CET 2015


Dear Petr, dear All,

On Thu, Mar 12, 2015 at 06:38:40AM +0000, PIKAL Petr wrote:
> Hi
> 
> > -----Original Message-----
> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jan Kim
> > Sent: Wednesday, March 11, 2015 4:44 PM
> > To: r-help at r-project.org
> > Subject: Re: [R] .Rdata files -- fortune?
> >
> > On Wed, Mar 11, 2015 at 09:00:15AM -0400, Prof J C Nash (U30A) wrote:
> > > Well put. I avoid them too, and go so far as to seek and destroy so
> > > they don't get loaded unnoticed and cause unwanted consequences.
> > >
> > > ".RData files (the ones with nothing before the period) are just
> > traps
> > > for your future self, with no documentation. I avoid them like the
> > plague."
> >
> > I absolutely agree. While I've solved the issue for myself long ago by
> > always putting something like
> >
> >     alias R='R --no-save --no-restore'
> >
> > into my startup scripts (~/.bashrc or the like), I've seen too many
> > others caught out by implicit saving / restoring of workspaces (e.g.
> > by somehow just accepting that R "only works properly in this
> > particular directory" and therefore doing all their work there at the
> > cost of adopting various anti-patterns with respect to organising work
> > into directories).
> >
> > Personally I think that auto saving / restoring workspaces should be
> > reviewed, as it can, in practice, make it harder for people to render
> > their work in a self-contained and reproducible way.
> 
> If this is considered, I would beg for adding an option to keep autosave working for those who have a different approach. If you keep to the paradigm one project = one separate directory, there should be no problem with autosaving, as you have only one .Rdata file together with exported pictures, pdfs, xls and doc files.
> 
> If you save history to separate files you can also easily keep track of your work. If autosave were disabled and you could leave your session without warning, I bet that there would be hundreds of questions similar to:
> 
> I worked whole day and after quitting R all my work is lost.

As the auto save / restore feature has been around for several years,
it certainly makes sense to withdraw it gradually. As a first step, a
flag distinguishing autosaved workspaces from those generated by the
user calling save.image could be added to the workspace file format.
Subsequently, users could be warned about auto-saved workspaces with
increasing prominence, and pointed to ways of customising their
startup / exit setup to arrange for autosave / restore if they really
want to retain it (via ~/.Rprofile and some way to register functions
to be invoked upon terminating an interactive session). After all
those stages, the feature could be withdrawn.
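
A minimal sketch of what such an opt-in setup might look like with what
R already offers, assuming the fragment lives in ~/.Rprofile and R is
started with --no-save --no-restore. The file name and the helper names
save_session / restore_session are hypothetical; .First and .Last are
the hooks R looks for at startup and on q().

```r
# Hypothetical ~/.Rprofile fragment. The file name and the two helper
# names are illustrative, not anything R provides.
session_file <- "my-session.RData"

save_session <- function(path = session_file) {
  # Explicitly write the global workspace to a named file.
  save.image(file = path)
  message("Workspace saved to ", path)
}

restore_session <- function(path = session_file) {
  # Restore only if the named file exists, and say so.
  if (file.exists(path)) {
    load(path, envir = globalenv())
    message("Workspace restored from ", path)
  }
}

# R calls .First() at startup and .Last() from q(), if they exist.
.First <- function() if (interactive()) restore_session()
.Last  <- function() if (interactive()) save_session()
```

The point of the sketch is that saving and restoring become a visible,
commented choice in a file the user wrote, rather than a default.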

It's a long process, but I think it would be worthwhile because it will
improve the reproducibility of scientific computing. As an illustration,
one way I've seen people become dependent on .RData files is by
writing a function that references a global variable. The function may
work (in the sense of running without causing an error) for many months
in the directory with the .RData file "providing" that variable. When
the user finally tries to use the function in an R process started in
another directory, it fails; they are mystified, and may well start
doing all their work in the one "magical" directory.

Obviously, if that global variable is ever changed, all results generated
previously will no longer be reproducible. And yes, I once found a
function with a loop "for (i in 1:n)", where "n" had been a parameter
that had since been renamed to something more descriptive. The error
remained unnoticed because there was a global variable "n" in the
workspace -- and the number of iterations of that loop was controlled
by that, rather than by the parameter of the function.
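
To make the failure mode concrete, here is a small reconstruction. It
is not the original function: the names count_iterations and n_iter are
made up, and the stale global is created by hand rather than restored
from an .RData file.

```r
# Parameter renamed from `n` to `n_iter`; the loop was not updated.
count_iterations <- function(n_iter) {
  count <- 0
  for (i in 1:n) {        # bug: `n` is now a free variable
    count <- count + 1
  }
  count
}

n <- 3                    # stale global, as if restored from .RData
with_stale_global <- count_iterations(10)   # loops 3 times, not 10

rm(n)                     # what a fresh session would look like
in_fresh_session <- tryCatch(
  count_iterations(10),
  error = function(e) conditionMessage(e)   # keep the error message
)
```

With the stale "n" present the call runs without complaint and ignores
its argument; only in a clean session does the bug surface as an error.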

Saving workspaces as a cache can be a very useful and entirely sensible
thing (as Jeff wrote previously), but if this happens automatically,
this means it can happen where it's not so sensible, and some stuff left
in workspaces accidentally and innocently may turn into a landmine in
the future.
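
The explicit, named kind of cache Jeff describes could look roughly
like this. It is a sketch under assumptions: the helper cached, the
file name and the stand-in computation are all hypothetical, and it
uses saveRDS / readRDS rather than save / load, so that the cached
value is assigned to a name chosen in the script instead of appearing
in the workspace under whatever name it was saved with.

```r
# Hypothetical helper: reuse a named cache file if it exists,
# otherwise run the computation and write the cache.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Stand-in for a long computation; the code that regenerates the
# result stays in the script, so the cache file is optional.
slow_result <- function() sum(sqrt(1:1e6))

cache_path <- file.path(tempdir(), "slow-result.rds")
x <- cached(cache_path, slow_result)   # computes and writes the cache
y <- cached(cache_path, slow_result)   # reads the cache instead
```

Deleting the cache file costs only time, never results, which is what
separates this from a workspace that was saved implicitly.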

Best regards, Jan

> Cheers
> Petr
> 
> >
> > Best regards, Jan
> >
> > >
> > > JN
> > >
> > > On 15-03-11 07:00 AM, r-help-request at r-project.org wrote:
> > > > Message: 34
> > > > Date: Tue, 10 Mar 2015 17:51:15 -0700
> > > > From: Jeff Newmiller <jdnewmil at dcn.davis.CA.us>
> > > > To: Rolf Turner <r.turner at auckland.ac.nz>, Erin Hodgess
> > > >   <erinm.hodgess at gmail.com>, R help <r-help at stat.math.ethz.ch>
> > > > Subject: Re: [R] .Rprofile vs. First (more of an opinion question)
> > > > Message-ID: <E5A53229-B271-42D9-BEAB-73142B2F62F4 at dcn.davis.CA.us>
> > > > Content-Type: text/plain; charset="UTF-8"
> > > >
> > > > I concur with Rolf.
> > > >
> > > > .RData files (the ones with nothing before the period) are just
> > traps for your future self, with no documentation. I avoid them like
> > the plague. I refer to specifically-named Something.RData files in my
> > .R/.Rnw/.Rmd files to cache results of long computations, but they are
> > optional in my workflow because I always have R code that can
> > regenerate them.
> > > >
> > > > .Rprofile files offer consistency of behavior  regardless of which
> > working directory you use, and you can comment them.
> > > > -------------------------------------------------------------------
> > --------
> > > > Jeff Newmiller                        The     .....       .....  Go
> > Live...
> > > > DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.
> > Live Go...
> > > >                                       Live:   OO#.. Dead: OO#..
> > Playing
> > > > Research Engineer (Solar/Batteries            O.O#.       #.O#.
> > with
> > > > /Software/Embedded Controllers)               .OO#.       .OO#.
> > rocks...1k
> > > > -------------------------------------------------------------------
> > -
> > > > ------- Sent from my phone. Please excuse my brevity.
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> >  +- Jan T. Kim -------------------------------------------------------+
> >  |             email: jttkim at gmail.com                                |
> >  |             WWW:   http://www.jtkim.dreamhosters.com/              |
> >  *-----=<  hierarchical systems are for files, not for humans  >=-----*
> >
> 

-- 
 +- Jan T. Kim -------------------------------------------------------+
 |             email: jttkim at gmail.com                                |
 |             WWW:   http://www.jtkim.dreamhosters.com/              |
 *-----=<  hierarchical systems are for files, not for humans  >=-----*


