[Rd] robust updating methods

Ben Bolker bbolker at gmail.com
Fri Mar 27 22:36:02 CET 2015


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

  [Sorry to those who don't like it for top-posting]

  Thierry, I'm curious whether this addresses your problem (although
we don't have a hard timetable for the next release [it has to avoid
conflicts with the 3.2.0 release in 2.5 weeks at the very least], so
this might be problematic if your package needs to depend on it).

  I'm still curious whether there are any ideas/opinions from other
readers.  Has anyone else struggled with this?  Is there a canonical
solution?

 Ben Bolker


On 15-03-24 07:55 PM, Ben Bolker wrote:
> On 15-03-23 12:55 PM, Thierry Onkelinx wrote:
>> Dear Ben,
> 
>> Last week I was struggling with incorporating lme4 into a
>> package. I traced the problem and made a reproducible example ( 
>> https://github.com/ThierryO/testlme4).  It looks very simular to 
>> the problem you describe.
> 
>> The 'tests' directory contains the reproducible examples.
>> confint() of a model as returned by a function fails. It even
>> fails when I try to calculate the confint() inside the same
>> function as the glmer() call (see the fit_model_ci function).
> 
>> Best regards,
> 
>> Thierry
> 
> 
> Ugh.  I can get this to work if I also try searching up the call 
> stack, as follows (within update.merMod).  This feels like "code 
> smell" to me though -- i.e., if I have to hack this hard I must be 
> doing something wrong/misunderstanding how the problem *should* be
> done.
> 
> 
> if (evaluate) { ff <- environment(formula(object)) pf <-
> parent.frame()  ## save parent frame in case we need it sf <-
> sys.frames()[[1]] tryCatch(eval(call, env=ff), error=function(e) { 
> tryCatch(eval(call, env=sf), error=function(e) { eval(call, pf) }) 
> }) } else call
> 
> Here is an adapted even-more-minimal version of your code, which 
> seems to work with the version of update.merMod I just pushed to 
> github, but fails for glm():
> 
> 
> ##
> https://github.com/ThierryO/testlme4/blob/master/R/fit_model_ci.R 
> fit_model_ci <- function(formula, dataset, mfun=glmer){ model <-
> mfun( formula = formula, data = dataset, family = "poisson" ) ci <-
> confint(model) return(list(model = model, confint = ci)) }
> 
> library("lme4") set.seed(101) dd <-
> data.frame(f=factor(rep(1:10,each=100)), y=rpois(2,1000)) 
> fit_model_ci(y~(1|f),dataset=dd) 
> fit_model_ci(y~(1|f),dataset=dd,mfun=glm)
> 
> 
> 
> 
> 
>> ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / 
>> Research Institute for Nature and Forest team Biometrie & 
>> Kwaliteitszorg / team Biometrics & Quality Assurance
>> Kliniekstraat 25 1070 Anderlecht Belgium
> 
>> To call in the statistician after the experiment is done may be
>> no more than asking him to perform a post-mortem examination: he
>> may be able to say what the experiment died of. ~ Sir Ronald
>> Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer
>> does not ensure that a reasonable answer can be extracted from a
>> given body of data. ~ John Tukey
> 
>> 2015-03-22 17:45 GMT+01:00 Ben Bolker <bbolker at gmail.com>:
> 
>> WARNING: this is long.  Sorry I couldn't find a way to compress 
>> it.
> 
>> Is there a reasonable way to design an update method so that it's
>>  robust to a variety of reasonable use cases of generating calls
>> or data inside or outside a function?  Is it even possible?
>> Should I just tell users "don't do that"?
> 
>> * `update.default()` uses `eval(call, parent.frame())`; this
>> fails when the call depends on objects that were defined in a
>> different environment (e.g., when the data are generated and the
>> model initially fitted within a function scope)
> 
>> * an alternative is to store the original environment in which
>> the fitting is done in the environment of the formula and use 
>> `eval(call, env=environment(formula(object)))`; this fails if the
>>  user tries to update the model originally fitted outside a 
>> function with data modified within a function ...
> 
>> * I think I've got a hack that works below, which first tries in 
>> the environment of the formula and falls back to the parent
>> frame if that fails, but I wonder if I'm missing something much
>> simpler ..
> 
>> Thoughts?  My understanding of environments and frames is still,
>>  after all these years, not what it should be ...
> 
>> I've thought of some other workarounds, none entirely 
>> satisfactory:
> 
>> * force evaluation of all elements in the original call *
>> printing components of the call can get ugly (can save the
>> original call before evaluating) * large objects in the call get
>> duplicated * don't use `eval(call)` for updates; instead try to
>> store everything internally * this works OK but has the same
>> drawback of potentially storing large extra copies * we could try
>> to use the model frame (which is stored already), but there are
>> issues with this (the basis of a whole separate rant) because the
>> model frame stores something in between predictor variables and
>> input variables. For example
> 
>> d <- data.frame(y=1:10,x=runif(10)) 
>> names(model.frame(lm(y~log(x),data=d))) ## "y" "log(x)"
> 
>> So if we wanted to do something like update to "y ~ sqrt(x)", it 
>> wouldn't work ...
> 
>> ================== update.envformula <- function(object,...) { 
>> extras <- match.call(expand.dots = FALSE)$... call <- 
>> getCall(object) for (i in names(extras)) { existing <- 
>> !is.na(match(names(extras), names(call))) for (a in 
>> names(extras)[existing]) call[[a]] <- extras[[a]] if 
>> (any(!existing)) { call <- c(as.list(call), extras[!existing])
>> call <- as.call(call) } } eval(call,
>> env=environment(formula(object))) ## enclos=parent.frame()
>> doesn't help }
> 
>> update.both <- function(object,...) { extras <- 
>> match.call(expand.dots = FALSE)$... call <- getCall(object) for
>> (i in names(extras)) { existing <- !is.na(match(names(extras), 
>> names(call))) for (a in names(extras)[existing]) call[[a]] <- 
>> extras[[a]] if (any(!existing)) { call <- c(as.list(call), 
>> extras[!existing]) call <- as.call(call) } } pf <-
>> parent.frame() ## save parent frame in case we need it
>> tryCatch(eval(call, env=environment(formula(object))),
>> error=function(e) { eval(call, pf) }) }
> 
>> ### TEST CASES
> 
>> set.seed(101) d <- data.frame(x=1:10,y=rnorm(10)) m1 <- 
>> lm(y~x,data=d)
> 
>> ##' define data within function, return fitted model f1 <- 
>> function() { d2 <- d lm(y~x,data=d2) return(lm(y~x,data=d2)) }
>> ##' define (and modify) data within function, try to update ##'
>> model fitted elsewhere f2 <- function(m) { d2 <- d; d2[1] <-
>> d2[1]+0 ## force copy update.default(m,data=d2) } ##' define (and
>> modify) data within function, try to update ##' model fitted
>> elsewhere (use envformula) f3 <- function(m) { d2 <- d; d2[1] <-
>> d2[1]+0 ## force copy update.envformula(m,data=d2) }
> 
>> ##' hack: first try the formula, then the parent frame ##' if
>> that doesn't work for any reason f4 <- function(m) { d2 <- d;
>> d2[1] <- d2[1]+0 ## force copy update.both(m,data=d2) }
> 
>> ## Case 1: fit within function m2 <- f1()
>> try(update.default(m2)) ## default: object 'd2' not found m3A <-
>> update.envformula(m2)  ## envformula: works m3B <-
>> update.both(m2)        ## works
> 
>> ## Case 2: update within function m4A <- f2(m1)  ## default:
>> works try(f3(m1))    ## envformula: object 'd2' not found m4B <-
>> f4(m1) ## works
> 
>>> 
>>> ______________________________________________ 
>>> R-devel at r-project.org mailing list 
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
> 
> 
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJVFc1CAAoJEOCV5YRblxUHF+MH/3Y9uFZFolhx5b5jWSyXwQgp
i9oawx4K6il0qiAiDiO5D7NSSdc0u9jlgj8NjH0G2O9u3ctpvcYNVwa7cP9288Xz
xRyInnnh2FIpT6W0XyzJDivw5EX3IkyYuv6eDNqVyGcYXkvzJMA+vwMMWdGWEZbL
jKtDc0trG+9yJnwIi6DW6IQWPovrDaNxEinS+V7+DmYACQvJ4P2kg2u/ZsxAx89q
mcA1pS5usJjkOiQwBVUvV7l2UKNhHPFNwbBK1QdHgpP7PTdB52EQr+IyERhpf56s
8tYyNbSSPWoG9vt6/1pKyUK4iNRBtGgxtuozAv5OUjF8VGWGwUXBLo5G2yrBbs4=
=o1PJ
-----END PGP SIGNATURE-----



More information about the R-devel mailing list