[R] "glm" function question
Gregor Gorjanc
gregor.gorjanc at bfro.uni-lj.si
Sun Oct 22 03:02:50 CEST 2006
Chris Linton <connect.chris <at> gmail.com> writes:
>
> I am creating a model attempting to predict the probability someone will
> reoffend after being caught for a crime. There are seven total inputs and I
> planned on using a logistic regression. I started with a null deviance of
> 182.91 and ended up with a residual deviance of 83.40 after accounting for
> different interactions and such. However, I realized after that my code is
> different from that in my book. And I can't figure out what I need to put
> in its place. Here's my code:
>
...
> fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior +
> factor(violence):factor(subst) + factor(violence):factor(educ) +
> factor(violence):factor(age) + factor(violence):factor(prior))
>
> summary(fit1h)
>
> If you noticed, there's no part of my code that looks like:
>
> family=binomial(link="logit"))
>
...
>
> However, when I do this, my null deviance is 1104 and my residual deviance
> is 23460. THIS IS A HUGE DIFFERENCE IN MODEL FIT! I'm not sure if I have
> to redo my model or if my book was simply doing the
> "family=binomial(link="logit")" for a specific problem/reason.
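You state that you model the *probability* that ... In that case family=gaussian,
which is the default family in glm(), is not appropriate: it fits an ordinary
linear model to reoff. Yes, you need to use family=binomial(link="logit") or
family=binomial(link="probit"), but you also need to take care to specify y
properly in the glm call: for a binomial family it should be a 0/1 vector, a
two-level factor (first level = failure), or a two-column matrix of successes
and failures.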
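
A minimal sketch of the difference, on simulated data since your data set is
not shown. The data frame dat, the coefficients used to simulate reoff, and the
reduced formula are assumptions for illustration only; the variable names are
borrowed from your call.

```r
# Simulated stand-in data (hypothetical; not the poster's data).
set.seed(1)
n <- 200
dat <- data.frame(
  subst    = factor(sample(0:1, n, replace = TRUE)),
  violence = factor(sample(0:1, n, replace = TRUE)),
  prior    = rpois(n, 2)
)
eta <- -1 + 0.8 * (dat$subst == "1") + 0.5 * (dat$violence == "1") +
  0.3 * dat$prior
dat$reoff <- rbinom(n, size = 1, prob = plogis(eta))  # 0/1 response

# No family argument: glm() uses gaussian(), i.e. least squares on 0/1 data.
fit_gauss <- glm(reoff ~ subst + violence + prior, data = dat)

# Logistic regression: binomial family with logit link.
fit_logit <- glm(reoff ~ subst + violence + prior, data = dat,
                 family = binomial(link = "logit"))

# The two deviances are different quantities:
# gaussian deviance is the residual sum of squares,
# binomial deviance (for 0/1 data) is -2 * log-likelihood.
rss    <- sum(residuals(fit_gauss, type = "response")^2)
neg2ll <- -2 * as.numeric(logLik(fit_logit))
c(gaussian = deviance(fit_gauss), rss = rss)
c(binomial = deviance(fit_logit), neg2ll = neg2ll)

# The same response coded as a two-level factor (first level = failure)
# gives the same logistic fit.
dat$reoff_f <- factor(dat$reoff, levels = c(0, 1), labels = c("no", "yes"))
fit_factor <- glm(reoff_f ~ subst + violence + prior, data = dat,
                  family = binomial(link = "logit"))

# Fitted values on the response scale are probabilities.
p_hat <- predict(fit_logit, type = "response")
range(p_hat)
```

Because the gaussian and binomial deviances measure different things, the
numbers from the two fits should not be compared with each other as a measure
of model fit.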
Gregor