[R] Interpretation of output from glm
Pedro de Barros
pbarros at ualg.pt
Fri Nov 11 00:47:14 CET 2005
Dear John,
Thanks for the pointers. I will read this.
Pedro
At 14:41 10/11/2005, you wrote:
>Dear Pedro,
>
>The basic point, which relates to the principle of marginality in
>formulating linear models, applies whether the predictors are factors,
>covariates, or both. I think that this is a common topic in books on linear
>models; I certainly discuss it in my Applied Regression, Linear Models, and
>Related Methods.
>
>Regards,
> John
>
>--------------------------------
>John Fox
>Department of Sociology
>McMaster University
>Hamilton, Ontario
>Canada L8S 4M4
>905-525-9140x23604
>http://socserv.mcmaster.ca/jfox
>--------------------------------
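[A minimal illustrative sketch, not part of the original exchange: John Fox's car package provides Anova(), whose default "Type II" tests follow the marginality principle described above, so main effects are not tested conditional on interactions that contain them. The model and data names (Mature, LT, CondF, Biom, HMIndSamples) are taken from Pedro's glm() call quoted below.]

library(car)  # Anova() with Type II tests; package by John Fox

fit <- glm(Mature ~ LT * CondF + LT * Biom,
           family = binomial(link = "logit"), data = HMIndSamples)

## Likelihood-ratio chi-square test for each term, respecting marginality:
## the interactions are tested after all other terms, while LT, CondF and
## Biom are each tested ignoring the interactions that contain them.
Anova(fit, type = "II", test.statistic = "LR")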
>
> > -----Original Message-----
> > From: r-help-bounces at stat.math.ethz.ch
> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Pedro de Barros
> > Sent: Wednesday, November 09, 2005 10:45 AM
> > To: r-help at stat.math.ethz.ch
> > Subject: Re: [R] Interpretation of output from glm
> > Importance: High
> >
> > Dear John,
> >
> > Thanks for the quick reply. I did indeed have these ideas, but only
> > in a somewhat "floating" form, and all I could find about this dealt
> > only with categorical predictors. Can you suggest a good book where
> > I could learn more about this?
> >
> > Thanks again,
> >
> > Pedro
> > At 01:49 09/11/2005, you wrote:
> > >Dear Pedro,
> > >
> > >
> > > > -----Original Message-----
> > > > From: r-help-bounces at stat.math.ethz.ch
> > > > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Pedro de
> > > > Barros
> > > > Sent: Tuesday, November 08, 2005 9:47 AM
> > > > To: r-help at stat.math.ethz.ch
> > > > Subject: [R] Interpretation of output from glm
> > > > Importance: High
> > > >
> > > > I am fitting a logistic model to binary data. The response
> > > > variable is a factor (0 or 1) and all predictors are continuous
> > > > variables. The main predictor is LT (I expect a logistic relation
> > > > between LT and the probability of being mature), and the others
> > > > are variables I expect to modify this relation.
> > > >
> > > > I want to test whether all predictors contribute significantly to
> > > > the fit or not. I fit the full model and get these results:
> > > >
> > > > > summary(HMMaturation.glmfit.Full)
> > > >
> > > > Call:
> > > > glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
> > > > family = binomial(link = "logit"), data = HMIndSamples)
> > > >
> > > > Deviance Residuals:
> > > > Min 1Q Median 3Q Max
> > > > -3.0983 -0.7620 0.2540 0.7202 2.0292
> > > >
> > > > Coefficients:
> > > > Estimate Std. Error z value Pr(>|z|)
> > > > (Intercept) -8.789e-01 3.694e-01 -2.379 0.01735 *
> > > > LT 5.372e-02 1.798e-02 2.987 0.00281 **
> > > > CondF -6.763e-02 9.296e-03 -7.275 3.46e-13 ***
> > > > Biom -1.375e-02 2.005e-03 -6.856 7.07e-12 ***
> > > > LT:CondF 2.434e-03 3.813e-04 6.383 1.74e-10 ***
> > > > LT:Biom 7.833e-04 9.614e-05 8.148 3.71e-16 ***
> > > > ---
> > > > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > > >
> > > > (Dispersion parameter for binomial family taken to be 1)
> > > >
> > > > Null deviance:     10272.4  on 8224  degrees of freedom
> > > > Residual deviance:  7185.8  on 8219  degrees of freedom
> > > > AIC: 7197.8
> > > >
> > > > Number of Fisher Scoring iterations: 8
> > > >
> > > > However, when I run anova on the fit, I get:
> > > >
> > > > > anova(HMMaturation.glmfit.Full, test='Chisq')
> > > > Analysis of Deviance Table
> > > >
> > > > Model: binomial, link: logit
> > > >
> > > > Response: Mature
> > > >
> > > > Terms added sequentially (first to last)
> > > >
> > > >
> > > >           Df Deviance Resid. Df Resid. Dev  P(>|Chi|)
> > > > NULL                       8224    10272.4
> > > > LT         1   2873.8     8223     7398.7        0.0
> > > > CondF      1      0.1     8222     7398.5        0.7
> > > > Biom       1      0.2     8221     7398.3        0.7
> > > > LT:CondF   1    142.1     8220     7256.3  9.413e-33
> > > > LT:Biom    1     70.4     8219     7185.8  4.763e-17
> > > > Warning message:
> > > > fitted probabilities numerically 0 or 1 occurred in:
> > > >   method(x = x[, varseq <= i, drop = FALSE], y = object$y,
> > > >   weights = object$prior.weights,
> > > >
> > > > I am having a little difficulty interpreting these results.
> > > > The summary of the fit tells me that all predictors are
> > > > significant, while the anova indicates that, besides LT (the main
> > > > variable), only the interactions involving the other terms are
> > > > significant, but their main effects are not.
> > > > I believe that in the first output (on the glm object) the
> > > > significance of each term is assessed given all the other terms
> > > > in the model, while the anova output (as it says) tests the terms
> > > > as they are added sequentially, first to last.
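[A minimal sketch, not part of the original thread, of testing each term without relying on the sequential ordering: drop1() deletes from the full model each term that marginality allows to be dropped and reports a likelihood-ratio test, whereas summary() reports Wald z-tests for each coefficient given all the others. The object name is taken from Pedro's post.]

## Illustrative only: non-sequential likelihood-ratio tests for each term.
## Terms that are marginal to an interaction still in the model (here LT,
## CondF and Biom) are not offered for deletion.
drop1(HMMaturation.glmfit.Full, test = "Chisq")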
> > > >
> > > > So, there are 2 questions:
> > > > a) Can I tell that the interactions are significant, but not the
> > > > main effects?
> > >
> > >In a model with this structure, the "main effects" represent slopes
> > >over the origin (i.e., where the other variables in the product
> > >terms are 0), and aren't meaningfully interpreted as main effects.
> > >(Is there even any data near the origin?)
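[A sketch added for illustration, not part of the exchange: centring the covariates makes the lower-order coefficients describe slopes at typical data values rather than at zero. Variable and data names follow Pedro's call to glm(); the centred names are hypothetical.]

## Illustrative only. After centring, the LTc, CondFc and Biomc coefficients
## are slopes evaluated at the sample means of the other variables instead
## of at the (possibly data-free) origin.
HMIndSamples$LTc    <- HMIndSamples$LT    - mean(HMIndSamples$LT)
HMIndSamples$CondFc <- HMIndSamples$CondF - mean(HMIndSamples$CondF)
HMIndSamples$Biomc  <- HMIndSamples$Biom  - mean(HMIndSamples$Biom)
fit.centred <- glm(Mature ~ LTc * CondFc + LTc * Biomc,
                   family = binomial(link = "logit"), data = HMIndSamples)
summary(fit.centred)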
> > >
> > > > b) Is it legitimate to consider a model where the interactions
> > > > are included, but not the main effects CondF and Biom?
> > >
> > >Generally, no: That is, such a model is interpretable, but it places
> > >strange constraints on the regression surface -- that the CondF and
> > >Biom slopes are 0 over the origin.
> > >
> > >None of this is specific to logistic regression -- it applies
> > >generally to generalized linear models, including linear models.
> > >
> > >I hope this helps,
> > > John
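[A sketch, again not from the original thread, of a comparison that respects marginality: rather than dropping CondF and Biom while keeping their interactions, one can ask whether the two interactions jointly improve on the main-effects model. Names follow Pedro's model.]

## Illustrative only: joint likelihood-ratio test of LT:CondF and LT:Biom.
fit.main <- glm(Mature ~ LT + CondF + Biom,
                family = binomial(link = "logit"), data = HMIndSamples)
fit.full <- glm(Mature ~ LT * CondF + LT * Biom,
                family = binomial(link = "logit"), data = HMIndSamples)
anova(fit.main, fit.full, test = "Chisq")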
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html