[R] logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred

peter dalgaard pdalgd at gmail.com
Thu Dec 1 22:24:20 CET 2011


On Dec 1, 2011, at 21:32, Ben quant wrote:

> Thank you for the feedback, but my data looks fine to me. Please tell me if I'm not understanding.

Hmm, then maybe it really is a case of the transition region being short relative to the range of your data. Notice that the warning is just that: a warning. I do notice that the distribution of your x values is rather extreme -- you stated a range of 0--14 and a mean of 0.01. And after all, an odds ratio of 3000 per unit is only a tad over a doubling per 0.1 units.

Have a look at range(x[y==0]) and range(x[y==1]).
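
As a minimal sketch of that check (assuming l_yx is the two-column object with columns y and x that you pass to glm() below; the 3000 figure is just the rough odds ratio you reported):

  d <- as.data.frame(l_yx)
  range(d$x[d$y == 0])   # spread of x among the y = 0 cases
  range(d$x[d$y == 1])   # spread of x among the y = 1 cases
  # If the two ranges barely overlap, some fitted probabilities get pushed to
  # numerically 0 or 1 even without exact separation.
  3000^0.1               # about 2.23, i.e. roughly a doubling of the odds per 0.1 units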


> 
> I followed your instructions and here is a sample of the first 500 values : (info on 'd' is below that)
> 
> >  d <- as.data.frame(l_yx) 
> > x = with(d, y[order(x)])
> > x[1:500] # I have 1's and 0's dispersed throughout 
>   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> [101] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [201] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 0
> [301] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [401] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
> 
> # I get the warning still
> > l_df = as.data.frame(l_yx)
> >     l_logit = glm(y~x, data=l_df, family=binomial(link="logit"))
> 
> Warning message:
> glm.fit: fitted probabilities numerically 0 or 1 occurred 
> 
> # some info on 'd' above:
> 
> > d[1:500,1]
>   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [101] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [201] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [301] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [401] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > d[1:500,2]
>   [1] 3.023160e-03 7.932130e-02 0.000000e+00 1.779657e-02 1.608374e-01 0.000000e+00 5.577064e-02 7.753926e-03 4.018553e-03 4.760918e-02 2.080511e-02 1.642404e-01 3.703720e-03  8.901981e-02 1.260415e-03
>  [16] 2.202523e-02 3.750940e-02 4.441975e-04 9.351171e-03 8.374567e-03 0.000000e+00 8.440448e-02 5.081017e-01 2.538640e-05 1.806017e-02 2.954641e-04 1.434859e-03 6.964976e-04 0.000000e+00 1.202162e-02
>  [31] 3.420300e-03 4.276100e-02 1.457324e-02 4.140121e-03 1.349180e-04 1.525292e-03 4.817502e-02 9.515717e-03 2.232953e-02 1.227908e-01 3.293581e-02 1.454352e-02 1.176011e-03 6.274138e-02 2.879205e-02
>  [46] 6.900926e-03 1.414648e-04 3.446349e-02 8.807174e-03 3.549332e-02 2.828509e-03 2.935003e-02 7.162872e-03 5.650050e-03 1.221191e-02 0.000000e+00 2.126334e-02 2.052020e-02 7.542409e-02 2.586269e-04
>  [61] 5.258664e-02 1.213126e-02 1.493275e-02 8.152657e-03 1.774966e-02 2.433198e-02 1.371257e-02 6.732339e-02 9.747677e-03 8.180594e-03 1.882473e-01 1.682596e-03 1.410170e-02 8.429676e-03 3.520589e-01
>  [76] 1.865160e-02 2.389941e-02 3.301558e-02 4.896605e-03 6.826244e-03 9.151337e-02 1.722270e-02 1.764039e-01 3.554063e-02 7.338532e-03 9.577793e-03 3.632366e-03 1.422400e-02 3.020497e-02 1.797096e-02
>  [91] 2.577899e-03 1.338365e-01 8.202223e-03 1.088335e-02 2.137259e-02 6.829797e-03 3.104860e-04 8.927160e-03 5.626881e-02 1.197682e-02 6.140553e-02 2.729364e-02 1.377913e-02 3.018277e-03 1.188304e-02
> [106] 2.029268e-03 2.108815e-02 1.765503e-03 2.449253e-02 3.677046e-03 1.013463e-02 4.286642e-02 1.238931e-02 3.072090e-03 1.509613e-02 3.609885e-02 5.537268e-03 3.151614e-02 3.703148e-03 1.409401e-01
> [121] 1.473365e-02 9.160461e-03 1.035099e-01 3.005865e-02 1.332199e-02 6.936535e-03 2.433539e-02 4.935373e-03 4.822740e-03 1.597643e-03 3.805619e-03 1.092683e-02 1.760635e-01 5.565614e-03 7.739213e-04
> [136] 4.529198e-03 5.110359e-03 7.391258e-02 5.018207e-03 2.071417e-02 1.825787e-02 9.141405e-03 1.078386e-01 2.171470e-04 7.164583e-03 2.522077e-02 1.994428e-03 6.044653e-03 1.778078e-04 5.797706e-03
> [151] 1.526778e-02 1.595897e-02 1.995627e-01 1.946735e-03 6.711582e-02 2.722350e-02 3.127499e-02 1.074904e-01 2.477414e-03 5.015375e-02 3.417810e-02 2.046643e-02 1.644832e-02 5.220166e-03 1.217752e-02
> [166] 1.187352e-02 1.634016e-02 2.349568e-03 3.451811e-02 2.593540e-03 6.868595e-03 1.311236e-02 6.457757e-03 2.942391e-04 1.628352e-02 8.288831e-03 3.170856e-04 1.251331e+00 1.706954e-02 1.063723e-03
> [181] 1.374416e-02 2.140507e-02 2.817009e-02 2.272793e-02 4.365562e-02 6.089414e-03 2.498083e-02 1.360471e-02 1.884079e-02 1.448660e-02 2.341314e-02 8.167064e-03 4.109117e-02 2.660633e-02 7.711723e-03
> [196] 9.590278e-03 2.515490e-03 1.978033e-02 3.454990e-02 8.072748e-03 4.718885e-03 1.621131e-01 4.547743e-03 1.081195e-02 9.572051e-04 1.790391e-02 1.618026e-02 1.910230e-02 1.861914e-02 3.485475e-02
> [211] 2.844890e-03 1.866889e-02 1.378208e-02 2.451514e-02 2.535044e-03 3.921364e-04 1.557266e-03 3.315892e-03 1.752821e-03 6.786187e-03 1.360921e-02 9.550702e-03 8.114506e-03 5.068741e-03 1.729822e-02
> [226] 1.902033e-02 8.196564e-03 2.632880e-03 1.587969e-02 8.354079e-04 1.050023e-03 4.236195e-04 9.181120e-03 4.995919e-04 1.092234e-02 1.207544e-02 2.187243e-01 3.251349e-02 1.269134e-03 1.557751e-04
> [241] 1.232498e-02 2.654449e-02 1.049324e-03 8.442729e-03 6.331691e-03 1.715609e-02 1.017800e-03 9.230006e-03 1.331373e-02 5.596195e-02 1.296551e-03 5.272687e-03 2.805640e-02 4.790665e-02 2.043011e-02
> [256] 1.047226e-02 1.866499e-02 9.323001e-03 8.920536e-03 1.582911e-03 2.776238e-03 2.914762e-02 4.402356e-03 9.555274e-04 1.681966e-03 7.584319e-04 6.758914e-02 1.505431e-02 2.213308e-02 1.329330e-02
> [271] 7.284363e-03 2.687818e-02 2.997535e-03 7.470007e-03 2.070569e-03 3.441944e-02 1.717768e-02 4.523364e-02 1.003558e-02 1.365111e-02 1.906845e-02 1.676223e-02 3.506809e-04 9.164257e-02 9.008416e-03
> [286] 1.073903e-02 4.855937e-03 8.618043e-03 2.529247e-02 1.059375e-02 5.834253e-03 2.004309e-02 1.460387e-02 2.899190e-02 5.867984e-03 1.983956e-02 6.834339e-03 1.925821e-03 9.231870e-03 6.839616e-03
> [301] 1.029972e-02 2.009769e-02 9.458785e-03 1.183901e-02 8.911549e-03 1.264745e-02 2.995451e-03 7.657983e-04 5.315853e-03 1.325039e-02 1.044103e-02 2.307236e-02 2.780789e-02 1.735145e-02 9.053126e-03
> [316] 5.847638e-02 3.815715e-03 5.087690e-03 1.040513e-02 4.475672e-02 6.564791e-02 3.233571e-03 1.076193e-02 8.283819e-02 5.370256e-03 3.533256e-02 1.302812e-02 1.896783e-02 2.055282e-02 3.572239e-03
> [331] 5.867681e-03 5.864974e-04 9.715807e-03 1.665469e-02 5.082044e-02 3.547168e-03 3.069631e-03 1.274717e-02 1.858226e-03 3.104809e-04 1.247831e-02 2.073575e-03 3.544110e-04 7.240736e-03 8.452117e-05
> [346] 8.149151e-04 4.942461e-05 1.142303e-03 6.265512e-04 3.666717e-04 3.244669e-02 7.242018e-03 6.335951e-04 2.329072e-02 3.719716e-03 2.803425e-02 1.623981e-02 6.387102e-03 8.807679e-03 1.214914e-02
> [361] 6.699341e-03 1.148082e-02 1.329736e-02 1.537364e-03 2.004390e-02 1.562065e-02 1.655465e-02 9.960172e-02 2.174588e-02 1.209472e-02 2.328413e-02 2.012760e-04 1.422327e-02 2.194455e-03 2.307362e-02
> [376] 4.315764e-03 3.208576e-02 3.826598e-02 1.828001e-02 3.935978e-03 5.294211e-04 1.392423e-02 6.588394e-03 1.040147e-03 1.260787e-02 9.051757e-04 5.353215e-02 6.049058e-02 1.382630e-01 1.064124e-01
> [391] 3.380742e-03 1.798038e-02 1.557048e-01 1.217146e-02 4.140520e-02 4.707564e-02 2.786042e-02 8.836988e-03 5.542879e-03 1.862664e-02 8.858770e-03 1.026681e-03 1.692105e-02 8.849238e-03 7.143816e-03
> [406] 1.630118e-02 1.165920e-01 9.471496e-03 4.879998e-02 1.388216e-02 1.453267e-02 4.845224e-04 1.415190e-03 1.208627e-02 1.372348e-02 2.573131e-02 1.169595e-02 1.825447e-02 2.574299e-02 5.301360e-02
> [421] 6.961110e-03 7.781891e-03 1.013308e-03 3.160916e-03 1.090344e-02 1.530841e-02 9.398088e-04 9.143726e-04 1.286683e-02 2.006193e-02 1.774378e-02 5.681591e-02 9.584676e-03 7.957152e-02 4.485609e-03
> [436] 1.086684e-02 2.930273e-03 6.085481e-03 4.342320e-03 1.319999e-02 2.120402e-02 4.477545e-02 1.991814e-02 8.893947e-03 7.790133e-03 1.610199e-02 2.441280e-02 2.781231e-03 1.410080e-02 1.639912e-02
> [451] 1.797498e-02 1.185382e-02 2.775063e-02 3.797315e-02 1.428883e-02 1.272659e-02 2.390500e-03 7.503478e-03 8.965356e-03 2.139452e-02 2.028536e-02 6.916416e-02 1.615986e-02 4.837412e-02 1.561731e-02
> [466] 7.130332e-03 9.208406e-05 1.099934e-02 2.003469e-02 1.395857e-02 9.883482e-03 4.110852e-02 1.202052e-02 2.833039e-02 1.233236e-02 2.145801e-02 7.900161e-03 4.663819e-02 4.410819e-03 5.115056e-04
> [481] 9.100270e-04 4.013683e-03 1.227139e-02 3.304697e-03 2.919099e-03 6.112390e-03 1.922229e-02 1.208282e-03 1.164037e-02 2.166888e-02 4.381615e-02 5.318929e-03 7.226343e-03 2.732819e-02 2.385092e-04
> [496] 4.905250e-02 1.159876e-02 4.068228e-03 3.349013e-02 1.273468e-03
> 
> 
> Thanks for your help,
> 
> Ben
> 
> On Thu, Dec 1, 2011 at 11:55 AM, peter dalgaard <pdalgd at gmail.com> wrote:
> 
> On Dec 1, 2011, at 18:54 , Ben quant wrote:
> 
> > Sorry if this is a duplicate: this is a re-post because the PDFs mentioned
> > below did not go through.
> 
> Still not there. Sometimes it's because your mailer doesn't label them with an appropriate MIME type (e.g. it marks them as application/octet-stream, which is "arbitrary binary"). Anyways, see below.
> 
> [snip]
> >
> > With the above data I do:
> >>    l_logit = glm(y~x, data=as.data.frame(l_yx), family=binomial(link="logit"))
> > Warning message:
> > glm.fit: fitted probabilities numerically 0 or 1 occurred
> >
> > Why am I getting this warning when I have data points of varying values for
> > y=1 and y=0?  In other words, I don't think I have the linear separation
> > issue discussed in one of the links I provided.
> 
> I bet that you do... You can get the warning without actual separation (one of my own examples is the probability of menarche in a data set that includes infants and old-age pensioners), but not with a huge odds ratio as well. Take a look at
> 
> d <- as.data.frame(l_yx)
> with(d, y[order(x)])
> 
> if it comes out as all zeros followed by all ones or vice versa, then you have the problem.
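
(A small self-contained illustration of that pattern, with simulated data rather than yours, sketching how complete separation produces this warning:)

  set.seed(1)
  x <- c(runif(50, 0, 1), runif(50, 2, 3))  # x values for the 0s and the 1s do not overlap
  y <- rep(0:1, each = 50)
  y[order(x)]                               # all zeros followed by all ones
  fit <- glm(y ~ x, family = binomial)      # triggers the same warning (plus, typically, a non-convergence warning)
  exp(coef(fit))["x"]                       # the estimated odds ratio blows up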
> 
> 
> >
> > PS - Then I do this and I get an odds ratio of a crazy size:
> >>    l_sm = summary(l_logit) # coef pval is $coefficients[8], log odds is $coefficients[2]
> >>    l_exp_coef = exp(l_logit$coefficients)[2] # exponentiate the coefficients
> >>    l_exp_coef
> >       x
> > 3161.781
> >
> > So for one unit increase in the predictor variable I get 3160.781%
> > (3161.781 - 1 = 3160.781) increase in odds? That can't be correct either.
> > How do I correct for this issue? (I tried multiplying the predictor
> > variable by a constant and the odds ratio goes down, but the warning above
> > still persists, and shouldn't the odds ratio be independent of the scale
> > of the predictor variable?)
> 
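
Regarding the rescaling question in the PS: multiplying x by a constant c just divides the slope by c, so the per-unit odds ratio changes while the fitted probabilities (and hence the warning) stay the same. A minimal sketch, reusing l_df from your code above and an arbitrary constant of 100:

  fit1 <- glm(y ~ x, data = l_df, family = binomial)
  fit2 <- glm(y ~ I(100 * x), data = l_df, family = binomial)  # same model, rescaled predictor
  exp(coef(fit1))[2]  # odds ratio per 1 unit of x
  exp(coef(fit2))[2]  # odds ratio per unit of 100*x, i.e. the 100th root of the above
  # Both fits give (essentially) identical fitted probabilities, so the
  # "numerically 0 or 1" warning persists no matter how x is scaled.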
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


