[R] warning in binomial analysis
Ronaldo Reis Jr.
chrysopa at insecta.ufv.br
Mon Jan 20 17:48:02 CET 2003
Thomas Lumley wrote:
>
> When you fit logistic regression models to fairly sparse data you can
> often have a situation where for some combination of variables the
> response variable is either all 0 or all 1. In that case the maximum
> likelihood estimates for at least some of the coefficients will be
> infinite. That's what R is telling you.
>
> You should be able to tell which coefficients are infinite -- the
> coefficients and their standard errors will be large.
>
> When this happens the standard errors and the p-values reported by
> summary.glm() for those variables are useless.
>
> -thomas
Hi Thomas,
I am trying to understand this problem. It is very common in ecological data
analysed with binomial or Poisson errors.
> summary(abundmod)
       x1            y                x2               n
 Sp1    : 12   Min.   : 0.000   Min.   : 3.210   Min.   : 13.00
 Sp10   : 12   1st Qu.: 0.000   1st Qu.: 6.572   1st Qu.: 29.25
 Sp11   : 12   Median : 1.000   Median : 8.845   Median : 44.50
 Sp12   : 12   Mean   : 4.011   Mean   :19.417   Mean   : 92.25
 Sp13   : 12   3rd Qu.: 3.000   3rd Qu.:28.988   3rd Qu.:119.25
 Sp14   : 12   Max.   :92.000   Max.   :60.530   Max.   :338.00
 (Other):204
> m <- glm(y/n~x1*x2,family=binomial,weights=n,maxit=1000)
Warning message:
fitted probabilities numerically 0 or 1 occurred in: (if (is.empty.model(mt))
glm.fit.null else glm.fit)(x = X, y = Y,
I can tell which levels have infinite coefficients:
x1Sp22 18.9024041 44.4228068 0.426 0.670464
x1Sp5 22.0655076 42.1371974 0.524 0.600516
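The rule of thumb quoted above (large coefficients together with large standard errors) can be applied mechanically to the coefficient table. Here is a minimal sketch on hypothetical toy data. The data frame `d`, the names `fit` and `flagged`, and the cutoff of 10 are assumptions chosen for illustration and are not part of the original analysis.

```r
# Hypothetical toy data: group "b" has successes only at the smallest x,
# so its intercept and slope have no finite maximum likelihood estimate.
d <- data.frame(
  g = factor(rep(c("a", "b"), each = 4)),
  x = rep(c(1, 2, 3, 4), times = 2),
  y = c(2, 3, 5, 6, 2, 0, 0, 0),
  n = rep(10, 8)
)

# Fit the binomial model; the "fitted probabilities numerically 0 or 1"
# warning is expected here, so it is suppressed.
fit <- suppressWarnings(
  glm(cbind(y, n - y) ~ g * x, family = binomial, data = d)
)

# Coefficient table: Estimate, Std. Error, z value, Pr(>|z|)
ctab <- coef(summary(fit))

# Flag rows whose standard error exceeds an arbitrary cutoff (assumed: 10)
flagged <- ctab[ctab[, "Std. Error"] > 10, , drop = FALSE]
print(flagged)
```

The cutoff is only a screening device. Any flagged row still has to be checked against the raw counts.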
I looked at the dataset to understand why these two levels are "wrong".
Both have non-zero counts at only one value of x2.
But some other levels also have non-zero counts at only one value of x2. Look:
> tapply(y,list(x1,x2),c)
       3.21  4.05  5.56  6.91  7.97  8.56  9.13 16.13 25.58 39.21 46.16 60.53
Sp1       2     0     9     6     6     4     8    21    20    17    60    18
Sp10      4     1     3     1     0     2     2    19    13     7    12     5
Sp11      2     0     0     1     4     0     1     4     8     5    19     6
Sp12      0     1     1     1     5     0     0     1    13     3     7     6
Sp13      0     0     0     1     0     0     0     0     0     1     1     0
Sp14      0     0     0     2     1     0     1     1    13     0     3     3
Sp15      0     0     1     0     0     0     0     0     1     0     0     0
Sp16      2     4     3     6     8     2     8     0    14     4    21     5
Sp17      0     0     0     0     0     0     0     0     0     0     5     0
Sp18      5     4    12     1     9     3     5    36    40    27    92    52
Sp19      0     0     1     1     0     0     2     0     2     0     0     0
Sp2       2     0     1     0     0     0     0     3     1     0     2     1
Sp20      2     2     2     2     5     4     3     2    13    37    77    29
Sp21      0     0     0     0     0     0     1     0     0     0     0     0
Sp22      1     0     0     0     0     0     0     0     0     0     0     0
Sp23      0     0     0     1     0     0     0     0     0     0     0     0
Sp3       2     0     5     6     3     2     6    16    22     7    21     9
Sp4       0     1     0     1     0     0     0     1     3     0     6     1
Sp5       2     0     0     0     0     0     0     0     0     0     0     0
Sp6       0     0     0     0     3     1     2     6    12     0     7     0
Sp7       0     0     0     0     0     0     0     0     6     0     1     0
Sp8       0     0     0     1     0     0     0     1     3     2     3     2
Sp9       0     0     0     0     4     0     2     2     8     3     1     1
Sp17, Sp21 and Sp23 also have non-zero counts at only one value of x2. Why is
the problem only with Sp22 and Sp5?
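One way to compare the five single-occurrence levels is to list where along x2 each one has its non-zero count. The sketch below uses only counts copied from the table above. The names `nonzero_at` and `at_edge` are made up for illustration. Treating position at the edge of the x2 range as relevant is an assumption to check, not a confirmed explanation.

```r
# x2 values and y counts copied from the tapply() table, restricted to the
# five species with a non-zero count at a single x2 value.
x2 <- c(3.21, 4.05, 5.56, 6.91, 7.97, 8.56, 9.13,
        16.13, 25.58, 39.21, 46.16, 60.53)
y <- rbind(
  Sp17 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0),
  Sp21 = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0),
  Sp22 = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  Sp23 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  Sp5  = c(2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
)
colnames(y) <- x2

# For each species, the single x2 value at which y > 0
nonzero_at <- apply(y, 1, function(row) x2[row > 0])

# TRUE when that value is the smallest or largest x2 in the data
at_edge <- nonzero_at %in% range(x2)

print(data.frame(nonzero_at, at_edge))
```

In this subset only Sp22 and Sp5 have their single non-zero count at the smallest x2 (3.21). The other three have theirs at interior values.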
Is this a problem in my dataset? Do I need to remove these levels? What is the
correct approach? How can I resolve this?
Thanks
Ronaldo