[R] Interpretation of output from glm
Pedro de Barros
pbarros at ualg.pt
Wed Nov 9 16:45:06 CET 2005
Dear John,
Thanks for the quick reply. I did indeed have these ideas, but only in a
vague, "floating" form, and everything I could find on the subject dealt
with categorical predictors. Can you suggest a good book where I could
learn more about this?
Thanks again,
Pedro
At 01:49 09/11/2005, you wrote:
>Dear Pedro,
>
>
> > -----Original Message-----
> > From: r-help-bounces at stat.math.ethz.ch
> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Pedro de Barros
> > Sent: Tuesday, November 08, 2005 9:47 AM
> > To: r-help at stat.math.ethz.ch
> > Subject: [R] Interpretation of output from glm
> > Importance: High
> >
> > I am fitting a logistic model to binary data. The response
> > variable is a factor (0 or 1) and all predictors are
> > continuous variables. The main predictor is LT (I expect a
> > logistic relation between LT and the probability of being
> > mature) and the others are variables I expect to modify this relation.
> >
> > I want to test whether all predictors contribute significantly to
> > the fit. I fit the full model and get these results:
> >
> > > summary(HMMaturation.glmfit.Full)
> >
> > Call:
> > glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
> > family = binomial(link = "logit"), data = HMIndSamples)
> >
> > Deviance Residuals:
> >     Min       1Q   Median       3Q      Max
> > -3.0983  -0.7620   0.2540   0.7202   2.0292
> >
> > Coefficients:
> >               Estimate Std. Error z value Pr(>|z|)
> > (Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
> > LT           5.372e-02  1.798e-02   2.987  0.00281 **
> > CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
> > Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
> > LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
> > LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
> > ---
> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > (Dispersion parameter for binomial family taken to be 1)
> >
> > Null deviance: 10272.4 on 8224 degrees of freedom
> > Residual deviance: 7185.8 on 8219 degrees of freedom
> > AIC: 7197.8
> >
> > Number of Fisher Scoring iterations: 8
> >
> > However, when I run anova on the fit, I get
> >
> > > anova(HMMaturation.glmfit.Full, test='Chisq')
> > Analysis of Deviance Table
> >
> > Model: binomial, link: logit
> >
> > Response: Mature
> >
> > Terms added sequentially (first to last)
> >
> >
> >          Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> > NULL                      8224    10272.4
> > LT        1   2873.8      8223     7398.7       0.0
> > CondF     1      0.1      8222     7398.5       0.7
> > Biom      1      0.2      8221     7398.3       0.7
> > LT:CondF  1    142.1      8220     7256.3 9.413e-33
> > LT:Biom   1     70.4      8219     7185.8 4.763e-17
> > Warning message:
> > fitted probabilities numerically 0 or 1 occurred in: method(x
> > = x[, varseq <= i, drop = FALSE], y = object$y, weights =
> > object$prior.weights,
> >
> >
> > I am having a little difficulty interpreting these results.
> > The result from the fit tells me that all predictors are
> > significant, while the anova indicates that besides LT (the main
> > variable), only the interaction of the other terms is significant,
> > but the main effects are not.
> > I believe that in the first output (on the glm object), the
> > significance of all terms is calculated considering each of them
> > alone in the model (i.e. removing all other terms), while the anova
> > output is (as it says) considering the sequential addition of the
> > terms.
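
The difference between the two outputs can be checked on a small example. The sketch below uses simulated data (the variable names echo the thread, but the values and coefficients are invented, and only one modifier is included). As far as I understand the two functions, the z tests in summary() assess each coefficient with all the other terms already in the model, not alone, while anova() adds terms in the order given. Only the last row of the sequential table therefore tests the same hypothesis as removing that term from the full model, which is what drop1() reports.

```r
# Simulated data: names echo the thread, values are invented for illustration.
set.seed(42)
n      <- 2000
LT     <- runif(n, 10, 60)
CondF  <- runif(n, 50, 150)
eta    <- 1.75 - 0.05 * LT - 0.07 * CondF + 0.002 * LT * CondF
Mature <- rbinom(n, 1, plogis(eta))

fit <- glm(Mature ~ LT + CondF + LT:CondF, family = binomial)

# Wald z tests: each coefficient is tested with all other terms in the model.
wald <- summary(fit)$coefficients

# Sequential likelihood-ratio tests: each term is added to the ones above it.
seq_tab <- anova(fit, test = "Chisq")

# Likelihood-ratio test for removing the interaction from the full model.
drop_tab <- drop1(fit, test = "Chisq")

print(wald)
print(seq_tab)
print(drop_tab)
```

The deviance in the LT:CondF row of the sequential table should match the LRT column of the drop1() table; the earlier rows of the sequential table answer a different question from the corresponding rows of the coefficient table.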
> >
> > So, there are 2 questions:
> > a) Can I conclude that the interactions are significant, but not
> > the main effects?
>
>In a model with this structure, the "main effects" represent slopes over the
>origin (i.e., where the other variables in the product terms are 0), and
>aren't meaningfully interpreted as main effects. (Is there even any data
>near the origin?)
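
The point about slopes at the origin can be illustrated by moving the origin. This is a minimal sketch on simulated data (names borrowed from the thread, values and coefficients invented, one modifier only): the same model is fitted with raw and with mean-centred predictors. The fit and the interaction coefficient should be unchanged, while the "main effect" of CondF changes because it is now the CondF slope at the mean of LT rather than at LT = 0.

```r
# Simulated data: names echo the thread, values are invented for illustration.
set.seed(42)
n      <- 2000
LT     <- runif(n, 10, 60)
CondF  <- runif(n, 50, 150)
eta    <- 1.75 - 0.05 * LT - 0.07 * CondF + 0.002 * LT * CondF
Mature <- rbinom(n, 1, plogis(eta))

# Raw predictors: the CondF coefficient is the CondF slope at LT = 0.
fit_raw <- glm(Mature ~ LT + CondF + LT:CondF, family = binomial)

# Centred predictors: the CondFc coefficient is the CondF slope at mean(LT).
LTc     <- LT - mean(LT)
CondFc  <- CondF - mean(CondF)
fit_ctr <- glm(Mature ~ LTc + CondFc + LTc:CondFc, family = binomial)

b_raw <- coef(fit_raw)
b_ctr <- coef(fit_ctr)

# CondF slope implied by the raw fit at the mean of LT.
slope_at_mean_LT <- b_raw[["CondF"]] + b_raw[["LT:CondF"]] * mean(LT)

print(b_raw)
print(b_ctr)
print(c(raw_fit_slope_at_mean_LT = slope_at_mean_LT,
        centred_main_effect      = b_ctr[["CondFc"]]))
print(c(deviance_raw = deviance(fit_raw), deviance_centred = deviance(fit_ctr)))
```

The two fits are reparametrisations of one another, so a test of the CondF "main effect" answers a question that depends on where zero happens to lie on the LT scale.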
>
> > b) Is it legitimate to consider a model where the interactions are
> > considered, but not the main effects CondF and Biom?
>
>Generally, no: That is, such a model is interpretable, but it places strange
>constraints on the regression surface -- that the CondF and Biom slopes are
>0 over the origin.
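
One way to see the constraint is to note that a model without the CondF main effect is no longer indifferent to where LT is measured from. The sketch below again uses simulated data (names borrowed from the thread, values invented; `dev_pair` is a hypothetical helper written for this illustration). It shifts the origin of LT and compares the deviance of the full model with that of the model lacking the CondF main effect.

```r
# Simulated data: names echo the thread, values are invented for illustration.
# The true CondF slope is 0.002 * (LT - 35), i.e. zero at LT = 35, not at LT = 0.
set.seed(42)
n      <- 2000
LT     <- runif(n, 10, 60)
CondF  <- runif(n, 50, 150)
eta    <- 1.75 - 0.05 * LT - 0.07 * CondF + 0.002 * LT * CondF
Mature <- rbinom(n, 1, plogis(eta))

# Hypothetical helper: residual deviances after measuring LT from `shift`.
dev_pair <- function(shift) {
  LTs     <- LT - shift
  full    <- glm(Mature ~ LTs + CondF + LTs:CondF, family = binomial)
  reduced <- glm(Mature ~ LTs + LTs:CondF,         family = binomial)
  c(full = deviance(full), reduced = deviance(reduced))
}

d0  <- dev_pair(0)    # reduced model forces the CondF slope to 0 at LT = 0
d35 <- dev_pair(35)   # reduced model forces the CondF slope to 0 at LT = 35

print(rbind(shift_0 = d0, shift_35 = d35))
```

The full model gives the same deviance for either origin, whereas the reduced model fits differently depending on the shift, because it pins the CondF slope to zero at whatever point is labelled LT = 0.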
>
>None of this is specific to logistic regression -- it applies generally to
>generalized linear models, including linear models.
>
>I hope this helps,
> John