[Rd] robust updating methods
Ben Bolker
bbolker at gmail.com
Sun Mar 22 17:45:53 CET 2015
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
WARNING: this is long. Sorry I couldn't find a way to compress it.
Is there a reasonable way to design an update method so that it's
robust to a variety of reasonable use cases of generating calls or
data inside or outside a function? Is it even possible? Should I
just tell users "don't do that"?
* `update.default()` uses `eval(call, parent.frame())`; this fails
when the call depends on objects that were defined in a different
environment (e.g., when the data are generated and the model
initially fitted within a function scope)
* an alternative is to store the original environment in which the
fitting is done in the environment of the formula and use
`eval(call, env=environment(formula(object)))`; this fails if the
user tries to update the model originally fitted outside a function
with data modified within a function ...
* I think I've got a hack that works below, which first tries in the
environment of the formula and falls back to the parent frame if
that fails, but I wonder if I'm missing something much simpler ..
Thoughts? My understanding of environments and frames is still,
after all these years, not what it should be ...
I've thought of some other workarounds, none entirely satisfactory:
* force evaluation of all elements in the original call
* printing components of the call can get ugly (can save the
original call before evaluating)
* large objects in the call get duplicated
* don't use `eval(call)` for updates; instead try to store everything
internally
* this works OK but has the same drawback of potentially storing
large extra copies
* we could try to use the model frame (which is stored already),
but there are issues with this (the basis of a whole separate rant)
because the model frame stores something in between predictor
variables and input variables. For example
d <- data.frame(y=1:10,x=runif(10))
names(model.frame(lm(y~log(x),data=d)))
## "y" "log(x)"
So if we wanted to do something like update to "y ~ sqrt(x)",
it wouldn't work ...
==================
update.envformula <- function(object,...) {
extras <- match.call(expand.dots = FALSE)$...
call <- getCall(object)
for (i in names(extras)) {
existing <- !is.na(match(names(extras), names(call)))
for (a in names(extras)[existing]) call[[a]] <- extras[[a]]
if (any(!existing)) {
call <- c(as.list(call), extras[!existing])
call <- as.call(call)
}
}
eval(call, env=environment(formula(object)))
## enclos=parent.frame() doesn't help
}
update.both <- function(object,...) {
extras <- match.call(expand.dots = FALSE)$...
call <- getCall(object)
for (i in names(extras)) {
existing <- !is.na(match(names(extras), names(call)))
for (a in names(extras)[existing]) call[[a]] <- extras[[a]]
if (any(!existing)) {
call <- c(as.list(call), extras[!existing])
call <- as.call(call)
}
}
pf <- parent.frame() ## save parent frame in case we need it
tryCatch(eval(call, env=environment(formula(object))),
error=function(e) {
eval(call, pf)
})
}
### TEST CASES
set.seed(101)
d <- data.frame(x=1:10,y=rnorm(10))
m1 <- lm(y~x,data=d)
##' define data within function, return fitted model
f1 <- function() {
d2 <- d
lm(y~x,data=d2)
return(lm(y~x,data=d2))
}
##' define (and modify) data within function, try to update
##' model fitted elsewhere
f2 <- function(m) {
d2 <- d; d2[1] <- d2[1]+0 ## force copy
update.default(m,data=d2)
}
##' define (and modify) data within function, try to update
##' model fitted elsewhere (use envformula)
f3 <- function(m) {
d2 <- d; d2[1] <- d2[1]+0 ## force copy
update.envformula(m,data=d2)
}
##' hack: first try the formula, then the parent frame
##' if that doesn't work for any reason
f4 <- function(m) {
d2 <- d; d2[1] <- d2[1]+0 ## force copy
update.both(m,data=d2)
}
## Case 1: fit within function
m2 <- f1()
try(update.default(m2)) ## default: object 'd2' not found
m3A <- update.envformula(m2) ## envformula: works
m3B <- update.both(m2) ## works
## Case 2: update within function
m4A <- f2(m1) ## default: works
try(f3(m1)) ## envformula: object 'd2' not found
m4B <- f4(m1) ## works
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAEBAgAGBQJVDvHBAAoJEOCV5YRblxUHtnwH/2452Fqmsm//LxNeXxhNrs/G
3ss8rPV2EVJQFjAyO34zzuYZVp1mWlgt7C1L7nXcH4lcf66gYfj75X/yVvMxxwmS
uXEP9HKr4TR5D6Ukf16qqtK41PhTkjS+RDaEpj6BVq9YxlYb53kSjKF+anrf08SL
K6i2L+c4nJIwk56CCp0Re/EIDCNeW5lXYX9jdPAalMoxSHTG8wmytvq0x89+KsRC
aUTpdylPka70vwBQle9W4OEyhBpADIHfEg8csYXZ/MeOKM0bqCRu2IU3OyuCsxyX
Pbo3w1Z7x0e+2WoX4PcltweYK4PJC9O10v+RKYaI5YRy1dFWMQWcPoAKRKf7jwY=
=S7zP
-----END PGP SIGNATURE-----
More information about the R-devel
mailing list