[R] F-Tests in generalized linear mixed models (GLMM)

Douglas Bates bates at stat.wisc.edu
Wed Nov 19 15:54:49 CET 2008


On Wed, Nov 19, 2008 at 6:55 AM, Björn Stollenwerk
<bjoern.stollenwerk at helmholtz-muenchen.de> wrote:
> Hi!

> I would like to perform an F-Test over more than one variable within a
> generalized mixed model with Gamma-distribution
> and log-link function.

Are you using the phrase "F-Test" as a general term for model
comparison techniques like the analysis of variance or as a specific
type of test based on ratios of mean squares and the F distribution?
If the latter then you may need to reconsider your question.  The F
statistic is derived from a normal (i.e. Gaussian) distribution of the
response vector.  In certain balanced cases it can also be applied to
linear mixed models.  As far as I know there is not a derivation of
the F statistic from a Gamma distribution, either with or without
random effects.

> For this purpose, I use the package mgcv.
> Similar tests may be done using the function "anova", as for example in the
> case of a normal
> distributed response. However, if I do so, the error message
> "error in eval(expr, envir, enclos) : object "fixed" not found" occures.
> Does anyone konw why, or how to fix the problem? To illustrate the problem,
> I send the output of a simulated example.
> Thank you very much in advance.
>
> Best regards, Björn
>
> Example:
>
>> # simulation of data
>> n <- 300
>> x1 <- sample(c(T,F), n, replace=TRUE)
>> x2 <- rnorm(n)
>> random1 <- sample(c("level1","level2","level3"), n, replace=TRUE)
>> true.lp <- 5 + 1.1*x1 + 0.16 * x2
>> mu <- exp(true.lp)
>> sigma <- mu * 1
>> a <- mu^2/sigma^2
>> s <- sigma^2/mu
>> y <- rgamma(n, shape=a, scale=s)
>>
>> library(mgcv)
>>
>> # a mixed model without Gamma-distribution and without log-link works as
>> follows:
>> glmm1 <- gamm(y ~ x1 + x2, random=list(random1 = ~1))
>> glmm2 <- gamm(y ~ 1, random=list(random1 = ~1))
>>
>> anova(glmm1$lme)
> numDF denDF F-value p-value
> X 3 295 103.4730 <.0001
>> anova(glmm2$lme, glmm1$lme)
> Model df AIC BIC logLik Test L.Ratio p-value
> glmm2$lme 1 3 4340.060 4351.172 -2167.030
> glmm1$lme 2 5 4292.517 4311.036 -2141.258 1 vs 2 51.54367 <.0001
>>
>> # a linear model also works, though no p-value is reported
>> glm1 <- gam(y ~ x1 + x2)
>> glm2 <- gam(y ~ 1)
>> anova(glm1)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> y ~ x1 + x2
>
> Parametric Terms:
> df F p-value
> x1 1 45.58 7.69e-11
> x2 1 13.96 0.000224
>
>> anova(glm2, glm1)
> Analysis of Deviance Table
>
> Model 1: y ~ 1
> Model 2: y ~ x1 + x2
> Resid. Df Resid. Dev Df Deviance
> 1 299 33024943
> 2 297 27811536 2 5213407
>>
>> # general linear models (GLM) with Gamma and log-link don't work
>> glm1.gamma <- gam(y ~ x1 + x2, family=Gamma(link="log"))
>> glm2.gamma <- gam(y ~ 1, family=Gamma(link="log"))
>> anova(glm1.gamma)
>
> Family: Gamma
> Link function: log
>
> Formula:
> y ~ x1 + x2
>
> Parametric Terms:
> df F p-value
> x1 1 59.98 1.52e-13
> x2 1 16.06 7.78e-05
>
>> anova(glm2.gamma, glm1.gamma)
> Analysis of Deviance Table
>
> Model 1: y ~ 1
> Model 2: y ~ x1 + x2
> Resid. Df Resid. Dev Df Deviance
> 1 299 413.59
> 2 297 343.90 2 69.69
>>
>> # neither do general linear mixed models (GLMM)
>>
>> glm1.gamma <- gamm(y ~ x1 + x2, random=list(random1 = ~1),
>> family=Gamma(link="log"))
>
> Maximum number of PQL iterations: 20
> iteration 1
>> glm2.gamma <- gamm(y ~ 1, random=list(random1 = ~1),
>> family=Gamma(link="log"))
>
> Maximum number of PQL iterations: 20
> iteration 1
>> summary(glm1.gamma$lme)
> Linear mixed-effects model fit by maximum likelihood
> Data: data
> AIC BIC logLik
> 847.722 866.241 -418.861
>
> Random effects:
> Formula: ~1 | random1
> (Intercept) Residual
> StdDev: 2.954058e-05 0.9775214
>
> Variance function:
> Structure: fixed weights
> Formula: ~invwt
> Fixed effects: list(fixed)
> Value Std.Error DF t-value p-value
> X(Intercept) 5.066376 0.08363392 295 60.57801 0e+00
> Xx1TRUE 0.884486 0.11421762 295 7.74387 0e+00
> Xx2 0.234537 0.05851689 295 4.00802 1e-04
> Correlation:
> X(Int) X1TRUE
> Xx1TRUE -0.733
> Xx2 -0.008 0.085
>
> Standardized Within-Group Residuals:
> Min Q1 Med Q3 Max
> -1.0207671 -0.6911364 -0.2899184 0.3665161 4.9603830
>
> Number of Observations: 300
> Number of Groups: 3
>> summary(glm1.gamma$gam)
>
> Family: Gamma
> Link function: log
>
> Formula:
> y ~ x1 + x2
>
> Parametric coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 5.06638 0.08363 60.578 < 2e-16 ***
> x1TRUE 0.88449 0.11422 7.744 1.53e-13 ***
> x2 0.23454 0.05852 4.008 7.75e-05 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>
> R-sq.(adj) = 0.171 Scale est. = 0.95555 n = 300
>>
>> anova(glm1.gamma$lme)
> numDF denDF F-value p-value
> X 3 295 3187.192 <.0001
>> anova(glm2.gamma$lme, glm1.gamma$lme)
> Fehler in eval(expr, envir, enclos) : objekt "fixed" nicht gefunden
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list