[R] interpreting the output of a glm with an ordered categorical predictor.
peter dalgaard
pdalgd at gmail.com
Sat Mar 3 09:30:31 CET 2012
On Mar 3, 2012, at 02:10 , kmuller wrote:
>
> Greetings.
>
> I'm a Master's student working on an analysis of herbivore damage on plants.
> I have a tried running a glm with one categorical predictor (aphid
> abundance) and a binomial response (presence/absence of herbivore damage).
> My predictor has four categories: high, medium, low, and none. I used the
> "ordered" function to sort my categories for a glm.
>
> ah <-
> read.csv("http://depot.northwestern.edu/class/2012WI_PBC_435-0_AND_BIOL_SCI_313/muller/herbivoryEdit.csv")
> ah1<- ah[ah$date=="110810",]
> ah2<-ah[ah$date=="110904",]
>
> aphidOrder <- ordered(ah2$aphidLevelMax,levels=c("none", "low", "med",
> "high"))
>
> ordAph <- glm(chewholebinom~aphidOrder,family=binomial,data=ah2)
>
>
> When I ran the summary for the glm (output pasted below), I could not tell
> which intercept referred to which factor level. My question is, what do .L,
> .Q, and .C mean and how can I relate these factors to my original factors
> (none, low, med, high)?
This is an oddity in the modeling code, dating back about two decades and probably impossible to get rid of at this stage.
What you are seeing is polynomial contrasts: Linear, Quadratic, Cubic, with a slightly peculiar parametrization designed to be orthogonal in balanced designs. In such designs, you can see whether your effect is described by a low order polynomial by checking whether higher order terms are insignificant. Notice that this assumes that you can assign equidistant scores to the groups (i.e. codes 1:4 in your case).
(If the above is gibberish to you, maybe try something like
matplot(contr.poly(15)[,1:4], type="b")
to see the first few polynomial curves.)
The whole thing is somewhat useful; e.g., for "response surface designs" where you systematically vary multiple factors in equidistant steps. However, it is not obvious that it should have been unleashed on all kinds of ordered factors.
I conjecture that you don't actually need it, a plain factor() with the levels in the right order should do.
>
> Thank you for your help,
>
> Katherine
>
> summary(ordAph)
>
> Call:
> glm(formula = chewholebinom ~ aphidOrder, family = binomial,
> data = ah2)
>
> Deviance Residuals:
> Min 1Q Median 3Q Max
> -1.6512 -0.9817 0.7687 0.7687 1.5353
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.05567 0.25097 -0.222 0.8245
> aphidOrder.L -1.36755 0.49366 -2.770 0.0056 **
> aphidOrder.Q 0.36824 0.50195 0.734 0.4632
> aphidOrder.C -0.09840 0.51011 -0.193 0.8470
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 137.99 on 99 degrees of freedom
> Residual deviance: 124.05 on 96 degrees of freedom
> AIC: 132.05
>
> Number of Fisher Scoring iterations: 4
>
>
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/interpreting-the-output-of-a-glm-with-an-ordered-categorical-predictor-tp4440383p4440383.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
More information about the R-help
mailing list