[Rd] robust updating methods
Thierry Onkelinx
thierry.onkelinx at inbo.be
Mon Mar 23 17:55:52 CET 2015
Dear Ben,
Last week I was struggling with incorporating lme4 into a package. I traced
the problem and made a reproducible example (
https://github.com/ThierryO/testlme4). It looks very simular to the
problem you describe.
The 'tests' directory contains the reproducible examples. confint() of a
model as returned by a function fails. It even fails when I try to
calculate the confint() inside the same function as the glmer() call (see
the fit_model_ci function).
Best regards,
Thierry
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
2015-03-22 17:45 GMT+01:00 Ben Bolker <bbolker at gmail.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> WARNING: this is long. Sorry I couldn't find a way to compress it.
>
> Is there a reasonable way to design an update method so that it's
> robust to a variety of reasonable use cases of generating calls or
> data inside or outside a function? Is it even possible? Should I
> just tell users "don't do that"?
>
> * `update.default()` uses `eval(call, parent.frame())`; this fails
> when the call depends on objects that were defined in a different
> environment (e.g., when the data are generated and the model
> initially fitted within a function scope)
>
> * an alternative is to store the original environment in which the
> fitting is done in the environment of the formula and use
> `eval(call, env=environment(formula(object)))`; this fails if the
> user tries to update the model originally fitted outside a function
> with data modified within a function ...
>
> * I think I've got a hack that works below, which first tries in the
> environment of the formula and falls back to the parent frame if
> that fails, but I wonder if I'm missing something much simpler ..
>
> Thoughts? My understanding of environments and frames is still,
> after all these years, not what it should be ...
>
> I've thought of some other workarounds, none entirely satisfactory:
>
> * force evaluation of all elements in the original call
> * printing components of the call can get ugly (can save the
> original call before evaluating)
> * large objects in the call get duplicated
> * don't use `eval(call)` for updates; instead try to store everything
> internally
> * this works OK but has the same drawback of potentially storing
> large extra copies
> * we could try to use the model frame (which is stored already),
> but there are issues with this (the basis of a whole separate rant)
> because the model frame stores something in between predictor
> variables and input variables. For example
>
> d <- data.frame(y=1:10,x=runif(10))
> names(model.frame(lm(y~log(x),data=d)))
> ## "y" "log(x)"
>
> So if we wanted to do something like update to "y ~ sqrt(x)",
> it wouldn't work ...
>
> ==================
> update.envformula <- function(object,...) {
> extras <- match.call(expand.dots = FALSE)$...
> call <- getCall(object)
> for (i in names(extras)) {
> existing <- !is.na(match(names(extras), names(call)))
> for (a in names(extras)[existing]) call[[a]] <- extras[[a]]
> if (any(!existing)) {
> call <- c(as.list(call), extras[!existing])
> call <- as.call(call)
> }
> }
> eval(call, env=environment(formula(object)))
> ## enclos=parent.frame() doesn't help
> }
>
> update.both <- function(object,...) {
> extras <- match.call(expand.dots = FALSE)$...
> call <- getCall(object)
> for (i in names(extras)) {
> existing <- !is.na(match(names(extras), names(call)))
> for (a in names(extras)[existing]) call[[a]] <- extras[[a]]
> if (any(!existing)) {
> call <- c(as.list(call), extras[!existing])
> call <- as.call(call)
> }
> }
> pf <- parent.frame() ## save parent frame in case we need it
> tryCatch(eval(call, env=environment(formula(object))),
> error=function(e) {
> eval(call, pf)
> })
> }
>
> ### TEST CASES
>
> set.seed(101)
> d <- data.frame(x=1:10,y=rnorm(10))
> m1 <- lm(y~x,data=d)
>
> ##' define data within function, return fitted model
> f1 <- function() {
> d2 <- d
> lm(y~x,data=d2)
> return(lm(y~x,data=d2))
> }
> ##' define (and modify) data within function, try to update
> ##' model fitted elsewhere
> f2 <- function(m) {
> d2 <- d; d2[1] <- d2[1]+0 ## force copy
> update.default(m,data=d2)
> }
> ##' define (and modify) data within function, try to update
> ##' model fitted elsewhere (use envformula)
> f3 <- function(m) {
> d2 <- d; d2[1] <- d2[1]+0 ## force copy
> update.envformula(m,data=d2)
> }
>
> ##' hack: first try the formula, then the parent frame
> ##' if that doesn't work for any reason
> f4 <- function(m) {
> d2 <- d; d2[1] <- d2[1]+0 ## force copy
> update.both(m,data=d2)
> }
>
> ## Case 1: fit within function
> m2 <- f1()
> try(update.default(m2)) ## default: object 'd2' not found
> m3A <- update.envformula(m2) ## envformula: works
> m3B <- update.both(m2) ## works
>
> ## Case 2: update within function
> m4A <- f2(m1) ## default: works
> try(f3(m1)) ## envformula: object 'd2' not found
> m4B <- f4(m1) ## works
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
>
> iQEcBAEBAgAGBQJVDvHBAAoJEOCV5YRblxUHtnwH/2452Fqmsm//LxNeXxhNrs/G
> 3ss8rPV2EVJQFjAyO34zzuYZVp1mWlgt7C1L7nXcH4lcf66gYfj75X/yVvMxxwmS
> uXEP9HKr4TR5D6Ukf16qqtK41PhTkjS+RDaEpj6BVq9YxlYb53kSjKF+anrf08SL
> K6i2L+c4nJIwk56CCp0Re/EIDCNeW5lXYX9jdPAalMoxSHTG8wmytvq0x89+KsRC
> aUTpdylPka70vwBQle9W4OEyhBpADIHfEg8csYXZ/MeOKM0bqCRu2IU3OyuCsxyX
> Pbo3w1Z7x0e+2WoX4PcltweYK4PJC9O10v+RKYaI5YRy1dFWMQWcPoAKRKf7jwY=
> =S7zP
> -----END PGP SIGNATURE-----
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
[[alternative HTML version deleted]]
More information about the R-devel
mailing list