[R] Interpretation of output from glm
Pedro de Barros
pbarros at ualg.pt
Tue Nov 8 15:47:16 CET 2005
I am fitting a logistic model to binary data. The response variable is a
factor (0 or 1) and all predictors are continuous variables. The main
predictor is LT (I expect a logistic relation between LT and the
probability of being mature) and the others are variables I expect to modify
this relation.
I want to test whether all predictors contribute significantly to the fit.
I fit the full model and get these results:
> summary(HMMaturation.glmfit.Full)

Call:
glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
    family = binomial(link = "logit"), data = HMIndSamples)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.0983  -0.7620   0.2540   0.7202   2.0292

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
LT           5.372e-02  1.798e-02   2.987  0.00281 **
CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 10272.4  on 8224  degrees of freedom
Residual deviance:  7185.8  on 8219  degrees of freedom
AIC: 7197.8

Number of Fisher Scoring iterations: 8

However, when I run anova on the fit, I get:
> anova(HMMaturation.glmfit.Full, test='Chisq')
Analysis of Deviance Table

Model: binomial, link: logit

Response: Mature

Terms added sequentially (first to last)

         Df Deviance Resid. Df Resid. Dev  P(>|Chi|)
NULL                      8224    10272.4
LT        1   2873.8      8223     7398.7        0.0
CondF     1      0.1      8222     7398.5        0.7
Biom      1      0.2      8221     7398.3        0.7
LT:CondF  1    142.1      8220     7256.3  9.413e-33
LT:Biom   1     70.4      8219     7185.8  4.763e-17
Warning message:
fitted probabilities numerically 0 or 1 occurred in: method(x = x[, varseq
<= i, drop = FALSE], y = object$y, weights = object$prior.weights,
I am having a little difficulty interpreting these results.
The summary of the fit tells me that all predictors are significant, while
the anova indicates that, besides LT (the main variable), only the
interactions of the other variables with LT are significant, not their
main effects.
I believe that in the first output (on the glm object), the significance of
all terms is calculated considering each of them alone in the model (i.e.
removing all other terms), while the anova output is (as it says)
considering the sequential addition of the terms.
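One way to see how the two tables are computed is to fit the same terms in two different orders. The sketch below does that on simulated data: the data frame `sim` and every coefficient in it are invented for illustration and are not the HMIndSamples data. The z tests from summary() are Wald tests of each coefficient with all other terms kept in the model, so they should not depend on the order of the terms, whereas the rows of the sequential anova() table do.

```r
# Simulated stand-in for HMIndSamples; all numbers here are invented.
set.seed(1)
n <- 2000
sim <- data.frame(
  LT    = runif(n, 10, 60),
  CondF = rnorm(n, 0, 10),
  Biom  = rnorm(n, 0, 50)
)
eta <- -4 + 0.12 * sim$LT - 0.05 * sim$CondF + 0.002 * sim$LT * sim$CondF
sim$Mature <- rbinom(n, 1, plogis(eta))

# Same model, main effects entered in two different orders
fit_a <- glm(Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
             family = binomial(link = "logit"), data = sim)
fit_b <- glm(Mature ~ CondF + Biom + LT + LT:CondF + LT:Biom,
             family = binomial(link = "logit"), data = sim)

# Wald z tests: one row per coefficient, each adjusted for all other terms
wald_a <- coef(summary(fit_a))
wald_b <- coef(summary(fit_b))

# Sequential analysis of deviance: each row is adjusted only for the rows above it
seq_a <- anova(fit_a, test = "Chisq")
seq_b <- anova(fit_b, test = "Chisq")

print(wald_a)
print(wald_b)
print(seq_a)
print(seq_b)
```

In fit_b the interaction terms may be labelled with the variables in the other order (for example CondF:LT), but the two fits are the same model, so the residual deviance is identical.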
So, there are 2 questions:
a) Can I conclude that the interactions are significant, but not the main
effects?
b) Is it legitimate to fit a model that includes the interactions but not
the main effects CondF and Biom?
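Both questions can at least be put to the data with likelihood ratio tests. The following is a minimal sketch on simulated data: `sim`, `full` and `reduced` are hypothetical names and the simulated coefficients are invented. drop1() tests only the terms that can be removed without violating marginality (here the two interactions), and anova() on two nested fits compares the model without the CondF and Biom main effects against the full model.

```r
# Simulated stand-in for HMIndSamples; all numbers here are invented.
set.seed(2)
n <- 2000
sim <- data.frame(
  LT    = runif(n, 10, 60),
  CondF = rnorm(n, 0, 10),
  Biom  = rnorm(n, 0, 50)
)
eta <- -4 + 0.12 * sim$LT - 0.05 * sim$CondF + 0.002 * sim$LT * sim$CondF
sim$Mature <- rbinom(n, 1, plogis(eta))

full <- glm(Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
            family = binomial(link = "logit"), data = sim)

# (a) Likelihood ratio test for each term that can be dropped on its own
d1 <- drop1(full, test = "Chisq")
print(d1)

# (b) Model keeping the interactions but without the two main effects
reduced <- update(full, . ~ . - CondF - Biom)

# Likelihood ratio test of the reduced model against the full model (2 df)
lrt <- anova(reduced, full, test = "Chisq")
print(lrt)
```

With continuous predictors, the reduced model constrains CondF and Biom to have no effect when LT is zero, so its fit depends on where the origin of LT lies.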
Thanks for any help,
Pedro