[R-sig-ME] What is the maximum number of groups?

Ben Bolker bolker at ufl.edu
Tue Jan 6 16:11:04 CET 2009


  If I understand what you're doing correctly, you're setting
up an "individual-level" covariate, essentially fitting
a binomial-normal model (or binomial-logitnormal).  That makes sense,
although I'm a bit surprised it worked -- in some versions of lme4 it
has triggered an error message about too many random effect parameters.
For example see

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q1/000603.html

and following discussion.

  Ben Bolker

Andrew J Tyre wrote:
> Hi All,
> 
> I've been thinking alot about the earlier productive and helpful (for me!) 
> discussion on overdispersion and glmms etc. I haven't gotten back to my 
> grasshopper example yet, but I will as soon as the students complete 
> another component of the data collection. 
> 
> I have just come across an example in which I was able to account for most 
> of the overdispersion - enough to make me happy anyway - but I'm worried 
> about what I did to do it. The data are the number of traps at a site on 
> one day that caught or did not catch a deer, so binomial data. There are 
> several sites sampled in several different years, and a couple of weather 
> covariates that are the same for all traps at a given site on a particular 
> day; there are no trap level covariates, but many days of trapping within 
> each site/year combination. We expect there to be variation among 
> site/year combinations related to the number of deer at a site, variation 
> among days due to weather (some covariate data) or other factors, and 
> variation among traps due to the location of the trap relative to the deer 
> population. 
> 
> Some simulated data:
> 
> library(lme4)
> 
> # simulate some data
> x = runif(300,1,10)
> siteyear = factor(rep(1:10,each=30))
> logit.p = x*0.2 - 0.025*x^2 + rep(rnorm(10,0,0.4),each=30) + 
> rnorm(300,0,0.2)
> p = 1/(1+exp(-logit.p))
> size = sample(5:15,300,replace=TRUE)
> y = rbinom(300,prob=p,size=size)
> siteday=factor(1:300)
> mm.1 = lmer(cbind(y,size-y)~x+I(x^2)+(1|siteday),family=binomial)
> mm.2 = lmer(cbind(y,size-y)~x+I(x^2)+(1|siteyear),family=binomial)
> mm.3 = lmer(cbind(y,size-y)~x+I(x^2)+siteyear+(1|siteday),family=binomial)
> 
> mm.list = list(mm.1,mm.2,mm.3)
> k = c(4,4,13)
> ll = sapply(mm.list,logLik)
> # use chat estimate to get quasiAIC
> chat = deviance(mm.3)/300
> 
> qaic = -2*ll/chat+2*k
> delta = qaic-min(qaic)
> AIC.table=cbind(qaic,k,delta,w = exp(-(delta/2))/sum(exp(-delta/2)))
> AIC.table
>        qaic  k    delta            w
> ML 404.6663  4 78.66631 8.269964e-18
> ML 340.3928  4 14.39277 7.487291e-04
> ML 326.0000 13  0.00000 9.992513e-01
>> chat
>       ML 
> 1.106134 
> 
> Now this is pretty close to the structure that I have with the real data. 
> All models had the (1|siteday) random effect, various combinations of 
> covariates of interest (x in the example), and then either did or did not 
> include siteyear as a fixed effect. In the real example, as with my 
> simulated example, the model with both a fixed effect of siteyear and a 
> random effect of siteday is the best model. The best model makes 
> biological sense. The question: is having as many groups as observations 
> an issue? It seems to work but ... I didn't start worrying about it until 
> afterwards. Does it work because there are multiple traps per siteday? Any 
> insights appreciated.
> 
>> sessionInfo()
> R version 2.7.2 (2008-08-25) 
> i386-pc-mingw32 
> 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
> States.1252;LC_MONETARY=English_United 
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods   base 
> 
> other attached packages:
> [1] lme4_0.999375-27   Matrix_0.999375-15 lattice_0.17-13 
> 
> loaded via a namespace (and not attached):
> [1] grid_2.7.2  nlme_3.1-89
> 
> Drew Tyre
> 
> School of Natural Resources
> University of Nebraska-Lincoln
> 416 Hardin Hall, East Campus
> 3310 Holdrege Street
> Lincoln, NE 68583-0974
> 
> phone: +1 402 472 4054 
> fax: +1 402 472 2946
> email: atyre2 at unl.edu
> http://snr.unl.edu/tyre
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc




More information about the R-sig-mixed-models mailing list