[R] GLMM - Am I trying the impossible?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Aug 18 12:21:26 CEST 2005
It is not supported to call anova() on a glmmPQL fit.
For the glmmPQL fit you show, you have very large parameter estimates for
a logistic and have partial separation (as you comment on for the control
group): in that case PQL is not a reasonable method.
Try
fit <- glm(dead ~ Parasite * Bacteria, data=fish.data, family=binomial)
summary(fit)
anova(fit, test="Chisq")
fitted(fit)
and you have fitted values of zero (up to numerical tolerances).
This *is* discussed in MASS, around pp.198-9.
So there is little point in adding random efects for that model. Now try
fit2 <- glmmPQL(dead ~ Parasite + Bacteria, random=~1|Tank,
family=binomial, data=fish.data)
summary(fit2)
Fixed effects: dead ~ Bacteria + Parasite
Value Std.Error DF t-value p-value
(Intercept) -4.243838 0.7519194 150 -5.644007 0.0000
Parasiteyes 1.264102 0.4205313 7 3.005964 0.0198
Bacteriayes 2.850640 0.7147180 7 3.988483 0.0053
which is pretty similar to the lmer fit you show.
I don't know what anova is doing for your lmer fit, but I do know that it
should not be working with sums of squares as is being reported.
On Thu, 18 Aug 2005, Pedro J. Aphalo wrote:
> Dear all,
>
> I have tried to calculate a GLMM fit with lmer (lme4) and glmmPQL
> (MASS), I also used glm for comparison.
I think you missed what glm was trying to tell you.
> I am getting very different results from different functions, and I
> suspect that the problem is with our dataset rather than the functions,
> but I would appreciate help in deciding whether my suspicions are right.
> If indeed we are attempting the wrong type of analysis, some guidance
> about what would be the right thing to do would be greatly appreciated.
>
> The details:
> The data:
> The data are from the end-point of a survival experiment with fish. The
> design of the experiment is a 2 x 2 factorial, with each factor
> (Bacteria and Parasite) at two levels (yes and no). There were 16 fish
> in each tank, and the treatment was applied to the whole tank. There
> were in all 10 tanks (160 fish), with 2 tanks for controls (no/no), 2
> tanks for (Parasite:yes/Bacteria:no) and 3 tanks for each of the other 2
> treatments. A dead fish was considered a success, and a binomial family
> with the default logit link was used in the fits. No fish died in the
> control treatment (Is this the problem?).
>
> Overall "probabilities" as dead/total for the four treatments were:
> Paras Bact Prob
> no no 0
> yes no 0.0625
> no yes 0.208
> yes yes 0.458
>
> We are interested in testing main effects and interaction, but the
> interaction is of special interest to us.
>
> The data for "dead" are coded as 0/1 with 1 indicating a dead fish, and
> the file has one row per fish.
>
> Some results:
>
> lme4 (ver 0.98-1, R 2.1.1, Windows XP)
> ~~~~
>
> > fish1.lmerPQL <- lmer(dead ~ Parasite * Bacteria + (1|Tank),
> data=fish.data, family=binomial)
> Error in lmer(dead ~ Parasite * Bacteria + (1 | Tank), data = fish.data, :
> Unable to invert singular factor of downdated X'X
> In addition: Warning message:
> Leading minor of size 4 of downdated X'X is indefinite
> >
>
> without the interaction:
> > fish3.lmerPQL <- lmer(dead ~ Parasite + Bacteria + (1|Tank),
> data=fish.data, family=binomial)
> > anova(fish3.lmerPQL)
> Analysis of Variance Table
> Df Sum Sq Mean Sq Denom F value Pr(>F)
> Parasite 1 0.012 0.012 157.000 0.0103 0.9192
> Bacteria 1 0.058 0.058 157.000 0.0524 0.8192
> > summary(fish3.lmerPQL)
> Generalized linear mixed model fit using PQL
> Formula: dead ~ Parasite + Bacteria + (1 | Tank)
> Data: fish.data
> Family: binomial(logit link)
> AIC BIC logLik deviance
> 141.3818 156.7577 -65.69091 131.3818
> Random effects:
> Groups Name Variance Std.Dev.
> Tank (Intercept) 5e-10 2.2361e-05
> # of obs: 160, groups: Tank, 10
>
> Estimated scale (compare to 1) 0.9318747
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -4.24380 0.79924 -5.3098 1.098e-07 ***
> Parasiteyes 1.26407 0.44694 2.8283 0.0046801 **
> Bacteriayes 2.85062 0.75970 3.7523 0.0001752 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Correlation of Fixed Effects:
> (Intr) Prstys
> Parasiteyes -0.429
> Bacteriayes -0.898 0.093
>
> Very different P-values from anova and summary.
>
> MASS: (ver 7.2-17, R 2.1.1, Windows XP)
> ~~~~~
>
> > summary(fish1.glmmpql)
> Linear mixed-effects model fit by maximum likelihood
> Data: fish.data
> AIC BIC logLik
> 1236.652 1255.103 -612.326
>
> Random effects:
> Formula: ~1 | Tank
> (Intercept) Residual
> StdDev: 0.02001341 0.8944214
>
> Variance function:
> Structure: fixed weights
> Formula: ~invwt
> Fixed effects: dead ~ Parasite * Bacteria
> Value Std.Error DF t-value p-value
> (Intercept) -18.56607 1044.451 150 -0.01777591 0.9858
> Parasiteyes 15.85802 1044.451 6 0.01518311 0.9884
> Bacteriayes 17.23107 1044.451 6 0.01649772 0.9874
> Parasiteyes:Bacteriayes -14.69007 1044.452 6 -0.01406487 0.9892
> Correlation:
> (Intr) Prstys Bctrys
> Parasiteyes -1
> Bacteriayes -1 1
> Parasiteyes:Bacteriayes 1 -1 -1
>
> Standardized Within-Group Residuals:
> Min Q1 Med Q3 Max
> -1.028634e+00 -5.734674e-01 -2.886770e-01 -4.224474e-14 4.330155e+00
>
> Number of Observations: 160
> Number of Groups: 10
> > anova(fish1.glmmpql)
> numDF denDF F-value p-value
> (Intercept) 1 150 17.452150 <.0001
> Parasite 1 6 4.136142 0.0882
> Bacteria 1 6 12.740212 0.0118
> Parasite:Bacteria 1 6 0.000198 0.9892
> >
>
> > anova(glmmPQL(dead~Bacteria*Parasite, random=~1|Tank,
> family=binomial, data=fish.data))
> iteration 1
> numDF denDF F-value p-value
> (Intercept) 1 150 17.452150 <.0001
> Bacteria 1 6 8.980833 0.0241
> Parasite 1 6 7.895521 0.0308
> Bacteria:Parasite 1 6 0.000198 0.9892
> >
>
> Now anova indicates significance, but summary gives huge P-values.
>
> I have looked in MASS, ISwR, Fox's and Crawley's book, for hints, but I
> have probably missed the right spots/books. Hints about what to read
> will be also appreciated.
>
> Many thanks in advance, and sorry about the long message.
>
> The report on this research is obviously not yet published, but if the
> dataframe would be of help, it can be found as saved from R at:
> http://people.cc.jyu.fi/~aphalo/R_fish/fish.Rda. (Use load to read it
> into R).
>
> Pedro.
> --
> ==================================================================
> Pedro J. Aphalo
> Department of Biological and Environmental Science
> University of Jyväskylä
> P.O. Box 35, 40351 JYVÄSKYLÄ, Finland
> Phone +358 14 260 2339
> Mobile +358 50 3721504
> Fax +358 14 260 2321
> mailto:pedro.aphalo at cc.jyu.fi
> http://www.jyu.fi/~aphalo/ ,,,^..^,,,
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list