[R-sig-ME] modelling saturated random effects with glmm

Tue Jul 28 13:58:18 CEST 2009

jos matejus wrote:
> Thank you Greg and Ben for clearing that up. Sometimes I get so
> caught up in the detail of mixed modelling that I forget some of the 
> fundamentals. By the way, I can see how adding a random effect level 
> per observation would account for some of the heterogeneity causing 
> overdispersion, but wouldn't this be dependent on the potential 
> underlying causes of the overdispersion? For example, if you have a 
> high proportion of zeros in the data, would this approach still be 
> valid? Wouldn't it be better to address the causes of overdispersion 
> directly by refining the fixed and random effects structure more or
> by using a more appropriate distribution such as a negative binomial,
> ZIP or ZINB?
> 
> Best Jos

   Yes, of course adding more known covariates or grouping factors, or
using a zero-inflated distribution if the data suggest it, might be
better than adding individual-level heterogeneity -- but it depends on
the data.  Remember that "lots of zeros" is not in itself a prescription
to use a zero-inflated distribution -- Poissons or negative binomials
with small means also have lots of zeros.

Warton, David I. “Many zeros does not mean zero inflation: comparing the
goodness-of-fit of parametric models to multivariate abundance data.”
Environmetrics 16, no. 3 (2005): 275-289.
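
A quick base-R check of this point, as a minimal sketch. The mean of 0.5
and the negative binomial size of 1 are arbitrary values chosen for
illustration; they are not taken from any data in this thread.

```r
# Probability of a zero count under distributions with NO zero inflation.
mu <- 0.5                                  # small mean (illustrative value)

p0_pois <- dpois(0, lambda = mu)           # equals exp(-mu)
p0_nb   <- dnbinom(0, mu = mu, size = 1)   # equals (size / (size + mu))^size

round(c(poisson = p0_pois, negbin = p0_nb), 3)
# poisson  negbin
#   0.607   0.667

# The same thing by simulation: the fraction of zeros in plain Poisson draws.
set.seed(1)
y <- rpois(1000, lambda = mu)
mean(y == 0)
```

So with a mean of 0.5, roughly 60% or more of the observations are
expected to be zero without any zero-inflation component at all.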

> 
> 2009/7/27 Greg Snow <Greg.Snow at imail.org>:
>> This is a basic property of the distributions.
>> 
>> The normal distribution has 2 parameters, the mean and the variance
>> which are independent of each other.  Therefore in any type of
>> model based on the normal distribution you need at least 1 degree
>> of freedom left over after estimating the mean in order to estimate
>> the variance.
>> 
>> The Poisson distribution has only 1 parameter because the variance
>> is equal to the mean in the Poisson, so you can use all the degrees
>> of freedom estimating the mean, and that gives you the variance;
>> you don't need additional information to estimate it.
>> 
>> All this of course is dependent on your assumptions about the
>> distributions being reasonable (the routines do what you tell them
>> to whether they make sense or not).  And any model that uses all
>> or even the majority of the degrees of freedom is unlikely to be
>> very precise or informative even if you do get an "answer".
>> 
>> Hope this helps,
>> 
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.snow at imail.org
>> 801.408.8111
>> 
>> 
>>> -----Original Message-----
>>> From: r-sig-mixed-models-bounces at r-project.org
>>> [mailto:r-sig-mixed-models-bounces at r-project.org]
>>> On Behalf Of jos matejus
>>> Sent: Monday, July 27, 2009 8:19 AM
>>> To: r-sig-mixed-models at r-project.org
>>> Subject: [R-sig-ME] modelling saturated random effects with glmm
>>> 
>>> Dear all,
>>> 
>>> I was wondering whether anyone could enlighten me on the
>>> following.
>>> 
>>> Why is it that I can fit a generalized linear mixed model (family =
>>> poisson, for example) with lmer where I have as many levels of my
>>> random effect as data points, whereas with a linear mixed effects
>>> model (Gaussian distributed errors) I get an error message? I
>>> understand that the random effect variance is completely
>>> confounded with the residual variance in the case of a linear
>>> mixed model, but why is this not so with a generalized linear
>>> mixed model?
>>> 
>>> for example
>>> 
>>> data(ergoStool, package="nlme") # load data
>>> ergoStool$rantest <- 1:36 # create a pseudo random effect to illustrate
>>> 
>>> library(lme4)
>>> 
>>> stool.lmm <- lmer(effort~Type+(1|rantest),  data=ergoStool) 
>>> #Error: length(levels(dm$flist[[1]])) < length(Y) is not TRUE
>>> 
>>> stool.glmm <- lmer(effort~Type+(1|rantest) , family=poisson, 
>>> data=ergoStool)
>>> 
>>> summary(stool.glmm)
>>> 
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: effort ~ Type + (1 | rantest)
>>>    Data: ergoStool
>>>    AIC   BIC logLik deviance
>>>  19.47 27.39 -4.737    9.474
>>> Random effects:
>>>  Groups  Name        Variance Std.Dev.
>>>  rantest (Intercept)  0        0
>>> Number of obs: 36, groups: rantest, 36
>>>
>>> Fixed effects:
>>>             Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)  2.14658    0.11396  18.836   <2e-16 ***
>>> TypeT2       0.37469    0.14804   2.531   0.0114 *
>>> TypeT3       0.23091    0.15263   1.513   0.1303
>>> TypeT4       0.07503    0.15823   0.474   0.6354
>>> ---
>>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>
>>> Correlation of Fixed Effects:
>>>        (Intr) TypeT2 TypeT3
>>> TypeT2 -0.770
>>> TypeT3 -0.747  0.575
>>> TypeT4 -0.720  0.554  0.538
>>> 
>>> Many thanks in advance,
>>> Jos

-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc