[R] Seeking help with LOGIT model
Ken Knoblauch
ken.knoblauch at inserm.fr
Thu Apr 12 13:37:31 CEST 2012
You should look up the Hauck-Donner phenomenon: with binomial
GLMs, as the estimated effect grows the standard error can grow
even faster, so the Wald statistic shrinks. Complete separation
occurs, for example, when one predictor (or a combination of
several predictors) perfectly predicts the response. Something
like this seems to be happening for variables 4 and 5. You could try the
brglm function from the package of the same name, which uses
Firth's bias-reduction method and yields finite estimates even
under separation. Compare (after coercing your Data to
a data frame):
summary(glm(Y ~ ., binomial, Data))
Call:
glm(formula = Y ~ ., family = binomial, data = Data)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.00979   0.00000   0.00006   0.27987   1.82302
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 10.99326 20.77336 0.529 0.5967
`X 1` 0.01943 0.01040 1.868 0.0617 .
`X 2` 10.61013 5.65409 1.877 0.0606 .
`X 3` -0.66763 0.47668 -1.401 0.1613
`X 4` 70.98785 36.41181 1.950 0.0512 .
`X 5` 17.33126 2872.17069 0.006 0.9952
summary(brglm(Y ~ ., binomial, Data))
Call:
brglm(formula = Y ~ ., family = binomial, data = Data)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 12.017791 14.337183 0.838 0.4019
`X 1` 0.014898 0.008263 1.803 0.0714 .
`X 2` 8.307941 4.010792 2.071 0.0383 *
`X 3` -0.576309 0.352097 -1.637 0.1017
`X 4` 35.627644 16.638766 2.141 0.0323 *
`X 5` 2.134544 2.570756 0.830 0.4064
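For illustration, here is a minimal sketch of complete separation on
hypothetical toy data (not your Data): one predictor perfectly splits
the responses, so the maximum-likelihood slope is infinite, and glm's
iterations stop at a large estimate with a far larger standard error,
along with the same warning you saw.

```r
## Hypothetical toy data (not the poster's Data): y is 1 exactly
## when x is positive, so x perfectly separates the responses.
x <- c(-5, -4, -3, -2, -1, 1, 2, 3, 4, 5)
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)

## glm warns "fitted probabilities numerically 0 or 1 occurred";
## the slope estimate is huge and its standard error is huger
## still, so the Wald z value is near zero (Hauck-Donner).
fit <- glm(y ~ x, family = binomial)
summary(fit)$coefficients
```

The same diagnostic applies here: the point is not the exact numbers
(which depend on when the iterations stop) but the pattern of a large
estimate paired with an enormous standard error.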
Good luck.
Ken
Quoting Christofer Bogaso <bogaso.christofer at gmail.com>:
> Thanks Ken for your reply. No doubt your English is quite tough
> for me!! I understand something is not normal with the 5th
> explanatory variable (se: 2872.17069!). However, I could not
> understand what you mean by "You seem to be getting complete
> separation on X5".
>
> Can you please be more elaborate?
>
> Thanks,
>
> On Thu, Apr 12, 2012 at 4:06 PM, ken knoblauch
> <ken.knoblauch at inserm.fr> wrote:
>> Christofer Bogaso <bogaso.christofer <at> gmail.com> writes:
>>> Dear all, I am fitting a LOGIT model on this Data...........
>> ---- << snip >>---
>>> glm(Data[,1] ~ Data[,-1], binomial(link = logit))
>>>
>>> Call: glm(formula = Data[, 1] ~ Data[, -1], family =
>>> binomial(link = logit))
>>>
>>> Coefficients:
>>> (Intercept) Data[, -1]X 1 Data[, -1]X 2 Data[, -1]X 3 Data[,
>>> -1]X 4 Data[, -1]X 5
>>> 10.99326 0.01943 10.61013 -0.66763
>>> 70.98785 17.33126
>>>
>>> Degrees of Freedom: 43 Total (i.e. Null); 38 Residual
>>> Null Deviance: 44.58
>>> Residual Deviance: 17.46 AIC: 29.46
>>> Warning message:
>>> glm.fit: fitted probabilities numerically 0 or 1 occurred
>>>
>>> However, I am getting a warning message: "fitted probabilities
>>> numerically 0 or 1 occurred". My question is: have I made any
>>> mistake in my implementation above, or is it just because I have
>>> too few '0' values in my response variable?
>>>
>> Look at the output of summary, especially the standard errors.
>> You seem to be getting complete
>> separation on X5, and X4 doesn't look so hot either.
>>
>> Ken
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
--
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html