[R-sig-ME] Fwd: random effects heteroscedasticity glmm
Thierry Onkelinx
thierry.onkelinx at inbo.be
Wed May 13 10:23:49 CEST 2015
Dear Lauren,
Note that you have only 8 observations per site. This might give the
impression of heteroscedasticity. See the examples below. Your model first
model seems reasonable. The second model is too complex for the data.
# 8 observations per category. Some replicates display "heteroscedasticity".
set.seed(123456)
dataset <- data.frame(
Lambda = rep(c(10, 100), 8)
)
layout(matrix(1:16, nrow=4))
replicate(16, {
dataset$Y <- rpois(nrow(dataset), lambda = dataset$Lambda)
model <- glm(Y ~ 0 + factor(Lambda), data = dataset, family = "poisson")
boxplot(residuals(model) ~ dataset$Lambda)
})
# 200 observations per category. Homoscedasticity.
set.seed(123456)
dataset <- data.frame(
Lambda = rep(c(10, 100), 200)
)
layout(matrix(1:16, nrow=4))
replicate(16, {
dataset$Y <- rpois(nrow(dataset), lambda = dataset$Lambda)
model <- glm(Y ~ 0 + factor(Lambda), data = dataset, family = "poisson")
boxplot(residuals(model) ~ dataset$Lambda)
})
Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
2015-05-13 8:04 GMT+02:00 Lauren Catherine Ponisio <lponisio op gmail.com>:
> Dear R-sig-mixed-models list members,
>
> I have a glmm with some pretty severe heteroscedasticity in the random
> effects and was hoping for some advice on how to model it, or alternative
> approaches.
>
> I have floral richness data from 18 sites over two years, four observations
> within each year at each site. I am interested in the effect of landscape
> diversity and fire severity. I have model with an interaction between
> landscape diversity and fire severity, and a fixed effect of year, day of
> the year, and a quadratic day of the year. Also a random effect of site.
> The continuous variables are scaled. I assumed a poisson error distribution
> (I checked for overdispersion).
>
> This is my model:
>
> FloralRichness ~ s.simpson.div * SiteStatus + Year + s.doy +
> I(s.doy^2) + (1 | Site)
>
> The data is available here:
> https://www.dropbox.com/s/efdcxz20l3dpwvl/alldata.csv?dl=0
>
> ## my code
> formula.flower <- as.formula(FloralRichness ~ s.simpson.div * SiteStatus +
> Year + s.doy +
> I(s.doy^2) + (1 | Site))
>
> flower.mod <- glmer(formula.flower, family="poisson", data=dat.mods)
>
> ## Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) [
> ## glmerMod]
> ## Family: poisson ( log )
> ## Formula: FloralRichness ~ s.simpson.div * SiteStatus + Year + s.doy +
> ## I(s.doy^2) + (1 | Site)
> ## Data: dat.mods
>
> ## AIC BIC logLik deviance df.resid
> ## 688.7 718.4 -334.4 668.7 134
>
> ## Scaled residuals:
> ## Min 1Q Median 3Q Max
> ## -2.77658 -0.39078 -0.02645 0.35389 1.83799
>
> ## Random effects:
> ## Groups Name Variance Std.Dev.
> ## Site (Intercept) 0.04093 0.2023
> ## Number of obs: 144, groups: Site, 18
>
> ## Fixed effects:
> ## Estimate Std. Error z value Pr(>|z|)
> ## (Intercept) 2.39930 0.11243 21.340 < 2e-16 ***
> ## s.simpson.div 0.32838 0.11193 2.934 0.003347 **
> ## SiteStatusLOW -0.36443 0.13975 -2.608 0.009115 **
> ## SiteStatusHIGH -0.03427 0.14189 -0.242 0.809143
> ## Year2014 -0.09582 0.06714 -1.427 0.153517
> ## s.doy 0.05077 0.03658 1.388 0.165188
> ## I(s.doy^2) -0.09105 0.02764 -3.294 0.000986 ***
> ## s.simpson.div:SiteStatusLOW -0.01073 0.15090 -0.071 0.943336
> ## s.simpson.div:SiteStatusHIGH -0.40605 0.14121 -2.876 0.004034 **
> ## ---
> ## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> ## residuals don't look great and also the site random effects are all
> ## over the place!
>
> layout(matrix(1:4, nrow=2))
> hist(residuals(flower.mod))
> boxplot(residuals(flower.mod) ~ Site,
> data = dat.mods, main = "Site", ylab = "Residuals")
>
> plot(fitted(flower.mod), residuals(flower.mod),
> xlab = "Fitted Values", ylab = "Residuals")
> abline(h=0, lty=2)
> lines(smooth.spline(fitted(flower.mod), residuals(flower.mod)))
>
> ## one thought was to create site-specific variances, so basically
> ## create a variable called observation for each sample at each site,
> ## and nest observation within site. I found a explanation of how to
> ## do this for a fixed effect here:
> ## http://rpubs.com/bbolker/6298
>
> ## and created this model that I think does the trick
>
> as.formula(FloralRichness ~ s.simpson.div * SiteStatus + SiteStatus * Year
> +
> s.doy + I(s.doy^2) + (Site | obs))
>
> ## the model slowly runs, but with 18 sites and not that much data, perhaps
> ## this is not the best approach.
>
> ## is modeling the heteroscedasticity of the random effects a fruitful
> ## path to follow? Or have I made a more fundamental misstep?
>
> Thanks,
> Lauren
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models op r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list