[R] Seeking help with LOGIT model
Ken Knoblauch
ken.knoblauch at inserm.fr
Thu Apr 12 13:37:31 CEST 2012
You should look up the Hauck-Donner phenomenon: with binomial
GLMs, as the estimated effect grows the standard error can grow
even faster, so the Wald statistic shrinks. Complete separation
occurs, for example, when one predictor (or a combination of
several predictors) perfectly predicts the response. Something
like this seems to be happening for variables 4 and 5. You could try the
brglm function from the package of the same name, which uses
Firth's bias-reduction method and yields finite estimates even
under separation. Compare (after coercing your Data to
a data frame):
summary(glm(Y ~ ., binomial, Data))
Call:
glm(formula = Y ~ ., family = binomial, data = Data)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.00979   0.00000   0.00006   0.27987   1.82302
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 10.99326 20.77336 0.529 0.5967
`X 1` 0.01943 0.01040 1.868 0.0617 .
`X 2` 10.61013 5.65409 1.877 0.0606 .
`X 3` -0.66763 0.47668 -1.401 0.1613
`X 4` 70.98785 36.41181 1.950 0.0512 .
`X 5` 17.33126 2872.17069 0.006 0.9952
summary(brglm(Y ~ ., binomial, Data))
Call:
brglm(formula = Y ~ ., family = binomial, data = Data)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 12.017791 14.337183 0.838 0.4019
`X 1` 0.014898 0.008263 1.803 0.0714 .
`X 2` 8.307941 4.010792 2.071 0.0383 *
`X 3` -0.576309 0.352097 -1.637 0.1017
`X 4` 35.627644 16.638766 2.141 0.0323 *
`X 5` 2.134544 2.570756 0.830 0.4064
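For illustration, here is a minimal sketch of complete separation on
hypothetical toy data (not your Data): one predictor perfectly splits
the responses, so the maximum-likelihood slope is infinite, and glm's
iterations stop at a large estimate with a far larger standard error,
along with the same warning you saw.

```r
## Hypothetical toy data (not the poster's Data): y is 1 exactly
## when x is positive, so x perfectly separates the responses.
x <- c(-5, -4, -3, -2, -1, 1, 2, 3, 4, 5)
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)

## glm warns "fitted probabilities numerically 0 or 1 occurred";
## the slope estimate is huge and its standard error is huger
## still, so the Wald z value is near zero (Hauck-Donner).
fit <- glm(y ~ x, family = binomial)
summary(fit)$coefficients
```

The same diagnostic applies here: the point is not the exact numbers
(which depend on when the iterations stop) but the pattern of a large
estimate paired with an enormous standard error.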
Good luck.
Ken
Quoting Christofer Bogaso <bogaso.christofer at gmail.com>:
> Thanks Ken for your reply. No doubt your English is quite tough
> for me!! I understand something is not normal with the 5th
> explanatory variable (se: 2872.17069!). However, I could not
> understand what you mean by "You seem to be getting complete
> separation on X5".
>
> Can you please be more elaborate?
>
> Thanks,
>
> On Thu, Apr 12, 2012 at 4:06 PM, ken knoblauch
> <ken.knoblauch at inserm.fr> wrote:
>> Christofer Bogaso <bogaso.christofer <at> gmail.com> writes:
>>> Dear all, I am fitting a LOGIT model on this Data...........
>> ---- << snip >>---
>>> glm(Data[,1] ~ Data[,-1], binomial(link = logit))
>>>
>>> Call: glm(formula = Data[, 1] ~ Data[, -1], family =
>>> binomial(link = logit))
>>>
>>> Coefficients:
>>> (Intercept) Data[, -1]X 1 Data[, -1]X 2 Data[, -1]X 3 Data[,
>>> -1]X 4 Data[, -1]X 5
>>> 10.99326 0.01943 10.61013 -0.66763
>>> 70.98785 17.33126
>>>
>>> Degrees of Freedom: 43 Total (i.e. Null); 38 Residual
>>> Null Deviance: 44.58
>>> Residual Deviance: 17.46 AIC: 29.46
>>> Warning message:
>>> glm.fit: fitted probabilities numerically 0 or 1 occurred
>>>
>>> However, I am getting a warning message: "fitted probabilities
>>> numerically 0 or 1 occurred". My question is: have I made any
>>> mistake in my implementation above, or is it just because I have
>>> too few '0' values in my response variable?
>>>
>> Look at the output of summary, especially the standard errors.
>> You seem to be getting complete
>> separation on X5, and X4 doesn't look so hot either.
>>
>> Ken
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
--
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html