[Rd] robust updating methods
Ben Bolker
bbolker at gmail.com
Fri Mar 27 22:36:02 CET 2015
[Sorry to those who don't like it for top-posting]
Thierry, I'm curious whether this addresses your problem (although
we don't have a hard timetable for the next release [it has to avoid
conflicts with the 3.2.0 release in 2.5 weeks at the very least], so
this might be problematic if your package needs to depend on it).
I'm still curious whether there are any ideas/opinions from other
readers. Has anyone else struggled with this? Is there a canonical
solution?
Ben Bolker
On 15-03-24 07:55 PM, Ben Bolker wrote:
> On 15-03-23 12:55 PM, Thierry Onkelinx wrote:
>> Dear Ben,
>
>> Last week I was struggling with incorporating lme4 into a
>> package. I traced the problem and made a reproducible example (
>> https://github.com/ThierryO/testlme4). It looks very similar to
>> the problem you describe.
>
>> The 'tests' directory contains the reproducible examples.
>> confint() of a model as returned by a function fails. It even
>> fails when I try to calculate the confint() inside the same
>> function as the glmer() call (see the fit_model_ci function).
>
>> Best regards,
>
>> Thierry
>
>
> Ugh. I can get this to work if I also try searching up the call
> stack, as follows (within update.merMod). This feels like "code
> smell" to me though -- i.e., if I have to hack this hard I must be
> doing something wrong/misunderstanding how the problem *should* be
> done.
>
>
> if (evaluate) {
>     ff <- environment(formula(object))
>     pf <- parent.frame()  ## save parent frame in case we need it
>     sf <- sys.frames()[[1]]
>     tryCatch(eval(call, env=ff),
>              error=function(e) {
>                  tryCatch(eval(call, env=sf),
>                           error=function(e) {
>                               eval(call, pf)
>                           })
>              })
> } else call
>
> Here is an adapted even-more-minimal version of your code, which
> seems to work with the version of update.merMod I just pushed to
> github, but fails for glm():
>
>
> ## https://github.com/ThierryO/testlme4/blob/master/R/fit_model_ci.R
> fit_model_ci <- function(formula, dataset, mfun=glmer){
>     model <- mfun(
>         formula = formula,
>         data = dataset,
>         family = "poisson"
>     )
>     ci <- confint(model)
>     return(list(model = model, confint = ci))
> }
>
> library("lme4")
> set.seed(101)
> dd <- data.frame(f=factor(rep(1:10,each=100)), y=rpois(2,1000))
> fit_model_ci(y~(1|f),dataset=dd)
> fit_model_ci(y~(1|f),dataset=dd,mfun=glm)
>
>
>
>
>
>> ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek /
>> Research Institute for Nature and Forest team Biometrie &
>> Kwaliteitszorg / team Biometrics & Quality Assurance
>> Kliniekstraat 25 1070 Anderlecht Belgium
>
>> To call in the statistician after the experiment is done may be
>> no more than asking him to perform a post-mortem examination: he
>> may be able to say what the experiment died of. ~ Sir Ronald
>> Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer
>> does not ensure that a reasonable answer can be extracted from a
>> given body of data. ~ John Tukey
>
>> 2015-03-22 17:45 GMT+01:00 Ben Bolker <bbolker at gmail.com>:
>
>> WARNING: this is long. Sorry I couldn't find a way to compress
>> it.
>
>> Is there a reasonable way to design an update method so that it's
>> robust to a variety of reasonable use cases of generating calls
>> or data inside or outside a function? Is it even possible?
>> Should I just tell users "don't do that"?
>
>> * `update.default()` uses `eval(call, parent.frame())`; this
>> fails when the call depends on objects that were defined in a
>> different environment (e.g., when the data are generated and the
>> model initially fitted within a function scope)
>
>> * an alternative is to store the original environment in which
>> the fitting is done in the environment of the formula and use
>> `eval(call, env=environment(formula(object)))`; this fails if the
>> user tries to update the model originally fitted outside a
>> function with data modified within a function ...
>
>> * I think I've got a hack that works below, which first tries in
>> the environment of the formula and falls back to the parent
>> frame if that fails, but I wonder if I'm missing something much
>> simpler ..
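
The second approach above relies on a formula carrying the environment in which it was created. As a minimal sketch of that mechanism (the names `make_formula` and `local_data` are hypothetical and not taken from the thread):

```r
## A formula created inside a function keeps a reference to that
## function's frame, even after the function has returned.
make_formula <- function() {
    local_data <- data.frame(x = 1:3, y = c(2, 4, 6))  # hypothetical local object
    y ~ x                                              # formula captures this frame
}

f <- make_formula()
env <- environment(f)

## The local object is still reachable through the formula's environment ...
found_in_formula_env <- exists("local_data", envir = env, inherits = FALSE)
print(found_in_formula_env)

## ... which is why evaluating a stored call there can find data that
## parent.frame() would not see.
print(nrow(get("local_data", envir = env)))
```

The converse failure follows from the same fact: objects created later in some other function are not in this environment, so a call that refers to them cannot be evaluated there.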
>
>> Thoughts? My understanding of environments and frames is still,
>> after all these years, not what it should be ...
>
>> I've thought of some other workarounds, none entirely
>> satisfactory:
>
>> * force evaluation of all elements in the original call
>>   * printing components of the call can get ugly (can save the
>>     original call before evaluating)
>>   * large objects in the call get duplicated
>> * don't use `eval(call)` for updates; instead try to store
>>   everything internally
>>   * this works OK but has the same drawback of potentially storing
>>     large extra copies
>>   * we could try to use the model frame (which is stored already),
>>     but there are issues with this (the basis of a whole separate
>>     rant) because the model frame stores something in between
>>     predictor variables and input variables. For example
>
>> d <- data.frame(y=1:10,x=runif(10))
>> names(model.frame(lm(y~log(x),data=d))) ## "y" "log(x)"
>
>> So if we wanted to do something like update to "y ~ sqrt(x)", it
>> wouldn't work ...
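
To make the model-frame limitation concrete, here is a small sketch that extends the two-line example above. It assumes only base R and `stats`; the object names `m`, `mf` and `needed` are illustrative.

```r
set.seed(1)
d <- data.frame(y = 1:10, x = runif(10))
m <- lm(y ~ log(x), data = d)
mf <- model.frame(m)

## The model frame stores the evaluated term log(x), not the input x
print(names(mf))

## Variables an updated formula y ~ sqrt(x) would need
needed <- all.vars(y ~ sqrt(x))

## Check which of them can be found as columns of the model frame
print(setNames(needed %in% names(mf), needed))
```

Since `x` is absent from the model frame, refitting with `sqrt(x)` from the model frame alone would have to look for `x` somewhere else, which brings back the original environment problem.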
>
>> ==================
>> update.envformula <- function(object,...) {
>>     extras <- match.call(expand.dots = FALSE)$...
>>     call <- getCall(object)
>>     for (i in names(extras)) {
>>         existing <- !is.na(match(names(extras), names(call)))
>>         for (a in names(extras)[existing])
>>             call[[a]] <- extras[[a]]
>>         if (any(!existing)) {
>>             call <- c(as.list(call), extras[!existing])
>>             call <- as.call(call)
>>         }
>>     }
>>     eval(call, env=environment(formula(object)))
>>     ## enclos=parent.frame() doesn't help
>> }
>
>> update.both <- function(object,...) {
>>     extras <- match.call(expand.dots = FALSE)$...
>>     call <- getCall(object)
>>     for (i in names(extras)) {
>>         existing <- !is.na(match(names(extras), names(call)))
>>         for (a in names(extras)[existing])
>>             call[[a]] <- extras[[a]]
>>         if (any(!existing)) {
>>             call <- c(as.list(call), extras[!existing])
>>             call <- as.call(call)
>>         }
>>     }
>>     pf <- parent.frame()  ## save parent frame in case we need it
>>     tryCatch(eval(call, env=environment(formula(object))),
>>              error=function(e) {
>>                  eval(call, pf)
>>              })
>> }
>
>> ### TEST CASES
>
>> set.seed(101)
>> d <- data.frame(x=1:10,y=rnorm(10))
>> m1 <- lm(y~x,data=d)
>
>> ##' define data within function, return fitted model
>> f1 <- function() {
>>     d2 <- d
>>     lm(y~x,data=d2)
>>     return(lm(y~x,data=d2))
>> }
>> ##' define (and modify) data within function, try to update
>> ##' model fitted elsewhere
>> f2 <- function(m) {
>>     d2 <- d; d2[1] <- d2[1]+0  ## force copy
>>     update.default(m,data=d2)
>> }
>> ##' define (and modify) data within function, try to update
>> ##' model fitted elsewhere (use envformula)
>> f3 <- function(m) {
>>     d2 <- d; d2[1] <- d2[1]+0  ## force copy
>>     update.envformula(m,data=d2)
>> }
>
>> ##' hack: first try the formula, then the parent frame
>> ##' if that doesn't work for any reason
>> f4 <- function(m) {
>>     d2 <- d; d2[1] <- d2[1]+0  ## force copy
>>     update.both(m,data=d2)
>> }
>
>> ## Case 1: fit within function
>> m2 <- f1()
>> try(update.default(m2))       ## default: object 'd2' not found
>> m3A <- update.envformula(m2)  ## envformula: works
>> m3B <- update.both(m2)        ## works
>
>> ## Case 2: update within function
>> m4A <- f2(m1)  ## default: works
>> try(f3(m1))    ## envformula: object 'd2' not found
>> m4B <- f4(m1)  ## works
>