[R-sig-ME] What is the maximum number of groups?
Andrew J Tyre
atyre2 at unlnotes.unl.edu
Tue Jan 6 19:55:57 CET 2009
Right - I hadn't thought about it that way, but I think a
"binomial-lognormal" is correct:
log(p/(1-p) = X*Beta + epsilon
episilon~N(0,sigma)
The summary says I have 300 groups and 300 observations, so the random
effects vector is the same size as the dataset. If I attempt to include
the "siteyear" level variation as a random effect I get the message
Error in mer_finalize(ans) : q = 310 > n = 300
which is the error described in the previous posts. So its working here
because I only have the individual level random effect.
cheers,
Drew Tyre
School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974
phone: +1 402 472 4054
fax: +1 402 472 2946
email: atyre2 at unl.edu
http://snr.unl.edu/tyre
Ben Bolker <bolker at ufl.edu>
Sent by: r-sig-mixed-models-bounces at r-project.org
01/06/2009 09:13 AM
To
Andrew J Tyre <atyre2 at unlnotes.unl.edu>, R Mixed Models
<r-sig-mixed-models at r-project.org>
cc
Subject
Re: [R-sig-ME] What is the maximum number of groups?
If I understand what you're doing correctly, you're setting
up an "individual-level" covariate, essentially fitting
a binomial-normal model (or binomial-logitnormal). That makes sense,
although I'm a bit surprised it worked -- in some versions of lme4 it
has triggered an error message about too many random effect parameters.
For example see
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q1/000603.html
and following discussion.
Ben Bolker
Andrew J Tyre wrote:
> Hi All,
>
> I've been thinking alot about the earlier productive and helpful (for
me!)
> discussion on overdispersion and glmms etc. I haven't gotten back to my
> grasshopper example yet, but I will as soon as the students complete
> another component of the data collection.
>
> I have just come across an example in which I was able to account for
most
> of the overdispersion - enough to make me happy anyway - but I'm worried
> about what I did to do it. The data are the number of traps at a site on
> one day that caught or did not catch a deer, so binomial data. There are
> several sites sampled in several different years, and a couple of
weather
> covariates that are the same for all traps at a given site on a
particular
> day; there are no trap level covariates, but many days of trapping
within
> each site/year combination. We expect there to be variation among
> site/year combinations related to the number of deer at a site,
variation
> among days due to weather (some covariate data) or other factors, and
> variation among traps due to the location of the trap relative to the
deer
> population.
>
> Some simulated data:
>
> library(lme4)
>
> # simulate some data
> x = runif(300,1,10)
> siteyear = factor(rep(1:10,each=30))
> logit.p = x*0.2 - 0.025*x^2 + rep(rnorm(10,0,0.4),each=30) +
> rnorm(300,0,0.2)
> p = 1/(1+exp(-logit.p))
> size = sample(5:15,300,replace=TRUE)
> y = rbinom(300,prob=p,size=size)
> siteday=factor(1:300)
> mm.1 = lmer(cbind(y,size-y)~x+I(x^2)+(1|siteday),family=binomial)
> mm.2 = lmer(cbind(y,size-y)~x+I(x^2)+(1|siteyear),family=binomial)
> mm.3 =
lmer(cbind(y,size-y)~x+I(x^2)+siteyear+(1|siteday),family=binomial)
>
> mm.list = list(mm.1,mm.2,mm.3)
> k = c(4,4,13)
> ll = sapply(mm.list,logLik)
> # use chat estimate to get quasiAIC
> chat = deviance(mm.3)/300
>
> qaic = -2*ll/chat+2*k
> delta = qaic-min(qaic)
> AIC.table=cbind(qaic,k,delta,w = exp(-(delta/2))/sum(exp(-delta/2)))
> AIC.table
> qaic k delta w
> ML 404.6663 4 78.66631 8.269964e-18
> ML 340.3928 4 14.39277 7.487291e-04
> ML 326.0000 13 0.00000 9.992513e-01
>> chat
> ML
> 1.106134
>
> Now this is pretty close to the structure that I have with the real
data.
> All models had the (1|siteday) random effect, various combinations of
> covariates of interest (x in the example), and then either did or did
not
> include siteyear as a fixed effect. In the real example, as with my
> simulated example, the model with both a fixed effect of siteyear and a
> random effect of siteday is the best model. The best model makes
> biological sense. The question: is having as many groups as observations
> an issue? It seems to work but ... I didn't start worrying about it
until
> afterwards. Does it work because there are multiple traps per siteday?
Any
> insights appreciated.
>
>> sessionInfo()
> R version 2.7.2 (2008-08-25)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] lme4_0.999375-27 Matrix_0.999375-15 lattice_0.17-13
>
> loaded via a namespace (and not attached):
> [1] grid_2.7.2 nlme_3.1-89
>
> Drew Tyre
>
> School of Natural Resources
> University of Nebraska-Lincoln
> 416 Hardin Hall, East Campus
> 3310 Holdrege Street
> Lincoln, NE 68583-0974
>
> phone: +1 402 472 4054
> fax: +1 402 472 2946
> email: atyre2 at unl.edu
> http://snr.unl.edu/tyre
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list