[R] warning in binomial analysis

Ronaldo Reis Jr. chrysopa at insecta.ufv.br
Mon Jan 20 17:48:02 CET 2003


Thomas Lumley wrote:
>
> When you fit logistic regression models to fairly sparse data you can
> often have a situation where for some combination of variables the
> response variable is either all 0 or all 1.  In that case the maximum
> likelihood estimates for at least some of the coefficients will be
> infinite.  That's what R is telling you.
>
> You should be able to tell which coefficients are infinite -- the
> coefficients and their standard errors will be large.
>
> When this happens the standard errors and the p-values reported by
> summary.glm() for those variables are useless.
>
> 	-thomas

Hi Thomas,

I am trying to understand this problem. It is very common in ecological data
analysed with binomial or Poisson errors.

> summary(abundmod)
       x1            y                x2               n         
 Sp1    : 12   Min.   : 0.000   Min.   : 3.210   Min.   : 13.00  
 Sp10   : 12   1st Qu.: 0.000   1st Qu.: 6.572   1st Qu.: 29.25  
 Sp11   : 12   Median : 1.000   Median : 8.845   Median : 44.50  
 Sp12   : 12   Mean   : 4.011   Mean   :19.417   Mean   : 92.25  
 Sp13   : 12   3rd Qu.: 3.000   3rd Qu.:28.988   3rd Qu.:119.25  
 Sp14   : 12   Max.   :92.000   Max.   :60.530   Max.   :338.00  
 (Other):204                                                     

> m <- glm(y/n~x1*x2,family=binomial,weights=n,maxit=1000)
Warning message: 
fitted probabilities numerically 0 or 1 occurred in: (if (is.empty.model(mt)) 
glm.fit.null else glm.fit)(x = X, y = Y,  


I can tell which levels have infinite coefficients:

x1Sp22      18.9024041 44.4228068   0.426 0.670464    
x1Sp5       22.0655076 42.1371974   0.524 0.600516    
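
To see how coefficients like these can arise, here is a minimal sketch on
made-up toy data (not the abundance data; the names toy, g, x and msgs are
only for illustration). Group "b" has successes only at the smallest x, so
with a separate slope per group the fit pushes that slope toward minus
infinity:

```r
# Toy data, invented for illustration only.
# Group "b" has successes only at the smallest x.
toy <- data.frame(
  g = factor(rep(c("a", "b"), each = 4)),
  x = rep(c(1, 2, 3, 4), times = 2),
  y = c(3, 5, 4, 6,  2, 0, 0, 0),   # successes
  n = rep(10, 8)                     # trials
)

# Collect any warnings instead of printing them immediately.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(cbind(y, n - y) ~ g * x, family = binomial, data = toy),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)                          # warnings raised during fitting, if any
print(round(coef(summary(fit)), 3))  # note the size of the "gb" and "gb:x" rows
print(fitted(fit)[toy$g == "b"])     # fitted probabilities for group "b"
```

In this toy case the estimates and standard errors for group "b" should come
out very large, which is the pattern Thomas describes.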

I looked at the dataset to understand why these two levels are "wrong".

Both have a nonzero count at only one value of x2.

But some other levels also have a nonzero count at only one value of x2. Look:

> tapply(y,list(x1,x2),c)
     3.21 4.05 5.56 6.91 7.97 8.56 9.13 16.13 25.58 39.21 46.16 60.53
Sp1     2    0    9    6    6    4    8    21    20    17    60    18
Sp10    4    1    3    1    0    2    2    19    13     7    12     5
Sp11    2    0    0    1    4    0    1     4     8     5    19     6
Sp12    0    1    1    1    5    0    0     1    13     3     7     6
Sp13    0    0    0    1    0    0    0     0     0     1     1     0
Sp14    0    0    0    2    1    0    1     1    13     0     3     3
Sp15    0    0    1    0    0    0    0     0     1     0     0     0
Sp16    2    4    3    6    8    2    8     0    14     4    21     5
Sp17    0    0    0    0    0    0    0     0     0     0     5     0
Sp18    5    4   12    1    9    3    5    36    40    27    92    52
Sp19    0    0    1    1    0    0    2     0     2     0     0     0
Sp2     2    0    1    0    0    0    0     3     1     0     2     1
Sp20    2    2    2    2    5    4    3     2    13    37    77    29
Sp21    0    0    0    0    0    0    1     0     0     0     0     0
Sp22    1    0    0    0    0    0    0     0     0     0     0     0
Sp23    0    0    0    1    0    0    0     0     0     0     0     0
Sp3     2    0    5    6    3    2    6    16    22     7    21     9
Sp4     0    1    0    1    0    0    0     1     3     0     6     1
Sp5     2    0    0    0    0    0    0     0     0     0     0     0
Sp6     0    0    0    0    3    1    2     6    12     0     7     0
Sp7     0    0    0    0    0    0    0     0     6     0     1     0
Sp8     0    0    0    1    0    0    0     1     3     2     3     2
Sp9     0    0    0    0    4    0    2     2     8     3     1     1

Sp17, Sp21 and Sp23 also appear at only one value of x2. Why is the problem
only with Sp22 and Sp5?
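
One difference I can see in the table, offered as a guess rather than a
confirmed explanation: Sp22 and Sp5 have their only nonzero count at the
smallest x2 (3.21), while the others have theirs at interior values of x2.
The sketch below retypes those five rows from the table above and checks
this. The names counts, where_nonzero and at_edge are mine, not part of the
original analysis:

```r
# x2 values and y counts copied from the tapply() table above.
x2 <- c(3.21, 4.05, 5.56, 6.91, 7.97, 8.56, 9.13,
        16.13, 25.58, 39.21, 46.16, 60.53)
counts <- rbind(
  Sp5  = c(2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  Sp17 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0),
  Sp21 = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0),
  Sp22 = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  Sp23 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
)

# For each species, the x2 values at which y > 0.
where_nonzero <- lapply(
  setNames(rownames(counts), rownames(counts)),
  function(sp) x2[counts[sp, ] > 0]
)

# TRUE if every nonzero count sits at the smallest or at the largest x2.
at_edge <- sapply(
  where_nonzero,
  function(v) all(v == min(x2)) || all(v == max(x2))
)
print(at_edge)
```

If that is the reason, then with x1*x2 each species gets its own slope in x2,
and a species with successes only at one end of the x2 range can always be
fitted better by a steeper slope, while a species whose successes fall in the
middle of the range cannot.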

Is this a problem in my dataset? Do I need to remove these levels? What is the
correct interpretation? How can I resolve this?

Thanks
Ronaldo

-- 
Windows 98: the more beautiful the show, the greater the confusion
backstage...
--
|   //|\\   [*****************************][*******************]
|| ( õ õ )  [Ronaldo Reis Júnior          ][PentiumIII-600     ]
|     V     [ESALQ/USP-Entomologia, CP-09 ][HD: 30 + 10 Gb     ]
||  / l \   [13418-900 Piracicaba - SP    ][RAM: 128 Mb        ]
|  /(lin)\  [Fone: 19-429-4199 r.229      ][Video: SiS620-8Mb  ]
||/(linux)\ [chrysopa at insecta.ufv.br      ][Modem: Pctel-onboar]
|/ (linux) \[ICQ#: 5692561                ][Kernel: 2.4.18     ]
||  ( x )   [*****************************][*******************]
||| _/ \_Powered by Gnu/Debian Woody D+:) | Lxuser#: 205366



