[Rd] saving objects with embedded environments
McGehee, Robert
Robert.McGehee at geodecapital.com
Fri Jun 29 00:30:39 CEST 2007
Hello,
I have been running linear regressions on large data sets. As 'lm' saves
a great deal of extraneous (for me) data including the residuals,
fitted.values, model frame, etc., I generally set these to NULL within
the object before saving off the model to a file.
In the example below, however, I have found that when I run 'lm' inside
another function, the entire function environment is saved with the
file. So, even though object.size and all.equal report that both 'lm'
objects are equal and of small size, one saves as a 24MB file and the
other as 646 bytes. This seems to be because in the first example the
function environment is saved in attr(x1$terms, ".Environment") and
takes up all 24MB of space.
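To make the point above concrete, here is a minimal sketch of how one
might inspect that environment. It uses a smaller data set and a
hypothetical helper, fit_in_function, that mirrors testEq from the
example in this post; neither the helper name nor the sizes come from
my actual runs. As far as I understand, object.size does not count the
contents of an attached environment, which would explain the small
reported size.

```r
# Hypothetical helper mirroring testEq() from the post (smaller data).
fit_in_function <- function(B) {
  fit <- lm(y ~ x1, data = B, model = FALSE)
  fit$residuals <- fit$effects <- fit$fitted.values <- fit$qr$qr <- NULL
  fit
}

set.seed(1)
B <- data.frame(y = rnorm(1000), x1 = rnorm(1000))
fit <- fit_in_function(B)

# The formula was created inside the function, so the terms object
# points at that call's evaluation frame rather than the global env.
env <- attr(fit$terms, ".Environment")
is_global <- identical(env, globalenv())   # FALSE
frame_names <- ls(env)                     # includes the data argument "B"

print(is_global)
print(frame_names)
```

Everything listed by ls(env) travels along when the object is saved,
including the full data frame passed in as B.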
Anyway, I think this is a bug, or if nothing else very undesirable (that
an object reported to be about 5kB takes up 24MB). There also seems to
be some inconsistency in how environments are saved depending on whether
it is the global environment or not, though I'm not familiar enough with
environments to know if this was intentional. Comments are appreciated.
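A rough illustration of the asymmetry mentioned above, using serialize
directly rather than save (an assumption on my part that the two treat
environments the same way). The variable names are made up for this
sketch. An ordinary environment appears to be written out with all of
its contents, while the global environment appears to be written only
as a short reference.

```r
# An ordinary environment holding a large numeric vector.
e <- new.env(parent = emptyenv())
assign("big", rnorm(1e5), envir = e)

# Serialized length in bytes: the ordinary environment carries its
# contents; the global environment is stored as a marker only.
n_env    <- length(serialize(e, NULL))
n_global <- length(serialize(globalenv(), NULL))

print(c(ordinary = n_env, global = n_global))
```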
Thanks,
Robert
##################################################################
testEq <- function(B) {
    ## Fit inside a function, then drop the large components
    x <- lm(y ~ x1+x2+x3, data=B, model=FALSE)
    x$residuals <- x$effects <- x$fitted.values <- x$qr$qr <- NULL
    x
}
N <- 900000
B <- data.frame(y=rnorm(N)+1:N, x1=rnorm(N)+1:N, x2=rnorm(N)+1:N,
                x3=rnorm(N)+1:N)
## Same model, fitted inside a function (x1) and at top level (x2)
x1 <- testEq(B)
x2 <- lm(y ~ x1+x2+x3, data=B, model=FALSE)
x2$residuals <- x2$effects <- x2$fitted.values <- x2$qr$qr <- NULL
all.equal(x1, x2)              ## TRUE
object.size(x1)                ## 5112
object.size(x2)                ## 5112
save(x1, file="x1.RData")
save(x2, file="x2.RData")
file.info("x1.RData")$size     ## 24063852 bytes
file.info("x2.RData")$size     ## 646 bytes
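One possible workaround, sketched here under assumptions and not
something I have tested on the full data set: point the terms
environment at the global environment before saving. The helper
fit_in_function and the temporary file names are hypothetical. This
assumes nothing downstream needs to look up variables through the
formula's original environment.

```r
# Hypothetical helper mirroring testEq() from the post (smaller data).
fit_in_function <- function(B) {
  fit <- lm(y ~ x1, data = B, model = FALSE)
  fit$residuals <- fit$effects <- fit$fitted.values <- fit$qr$qr <- NULL
  fit
}

set.seed(1)
n <- 10000
B <- data.frame(y = rnorm(n), x1 = rnorm(n))
fit <- fit_in_function(B)

# Save the object as returned: the function frame (with B) comes along.
f_before <- tempfile(fileext = ".RData")
save(fit, file = f_before)

# Assumption: it is safe to detach the fit from the function frame.
fit_small <- fit
attr(fit_small$terms, ".Environment") <- globalenv()
f_after <- tempfile(fileext = ".RData")
save(fit_small, file = f_after)

size_before <- file.info(f_before)$size
size_after  <- file.info(f_after)$size
print(c(before = size_before, after = size_after))
```

The second file should be far smaller than the first, since the data
frame is no longer reachable from the saved object.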
> R.version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          5.0
year           2007
month          04
day            23
svn rev        41293
language       R
version.string R version 2.5.0 (2007-04-23)
Robert McGehee, CFA
Quantitative Analyst
Geode Capital Management, LLC
One Post Office Square, 28th Floor | Boston, MA | 02109
Tel: 617/392-8396 Fax:617/476-6389
mailto:robert.mcgehee at geodecapital.com