[R] logistic regression model with non-integer weights

Ramón Casero Cañas 8-T at gmx.net
Sun Apr 9 18:12:30 CEST 2006


When fitting a logistic regression model using weights I get the
following warning

> data.model.w <- glm(ABN ~ TR, family=binomial(logit), weights=WEIGHT)
Warning message:
non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)

Details follow

***

I have a binary dependent variable of abnormality

ABN = T, F, T, T, F, F, F...

and a continuous predictor

TR = 1.962752 1.871123 1.893543 1.685001 2.121500, ...



As only 14% of the cases are abnormal (ABN==T), and there is a large
overlap between abnormal and normal cases, the logistic regression
found by glm fits the normal cases much better than the abnormal
cases. In particular, the fitted probability of abnormal is at most 0.4.

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.7607     0.7196   1.057   0.2905
TR2          -1.4853     0.4328  -3.432   0.0006 ***
---
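
For reference, a minimal sketch of how the two figures above (the
proportion of abnormal cases and the largest fitted probability) could
be checked. It uses simulated data in place of mine: the sample size,
the seed and the coefficients used to generate ABN are assumptions made
for illustration only, so the numbers it prints will not match the
table above.

```r
# Simulated stand-in data; NOT the data from this post.
set.seed(1)
n   <- 500                                   # assumed sample size
TR  <- rnorm(n, mean = 1.9, sd = 0.2)        # continuous predictor
# Hypothetical generating coefficients, chosen so that abnormal cases are rare
ABN <- runif(n) < plogis(0.76 - 1.49 * TR)   # logical response (TRUE = abnormal)

# Unweighted logistic regression, as in the call at the top of this post
fit <- glm(ABN ~ TR, family = binomial(logit))

mean(ABN)          # proportion of abnormal cases
max(fitted(fit))   # largest fitted probability of abnormal
```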

I would like to compensate for the fact that the a priori probability of
abnormal cases is so low. I have created a weight vector

> WEIGHT <- ABN
> WEIGHT[ ABN == TRUE ] <-  1 / na / 2
> WEIGHT[ ABN == FALSE ] <-  1 / nn / 2

so that all weights add up to 1, where ``na'' is the number of abnormal
cases, and ``nn'' is the number of normal cases. That is, each normal case
has less weight in the model fitting because there are so many of them.
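
The weighting scheme can be sketched on its own as follows. The labels
below are hypothetical and only serve to make the example
self-contained; ifelse() is used here instead of the indexed
assignments, which should give the same vector.

```r
# Hypothetical labels for illustration; NOT the data from this post.
ABN <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)

na <- sum(ABN)     # number of abnormal cases
nn <- sum(!ABN)    # number of normal cases

# Each class gets a total weight of 1/2, split equally among its cases.
WEIGHT <- ifelse(ABN, 1 / (2 * na), 1 / (2 * nn))

sum(WEIGHT)          # total weight over all cases
sum(WEIGHT[ABN])     # total weight of the abnormal class
sum(WEIGHT[!ABN])    # total weight of the normal class
```

Note that with these weights the product WEIGHT * ABN is a fraction
for every abnormal case, which may be what the warning about
non-integer successes refers to.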

But then I get the warning message at the beginning of this email, and I
suspect that I'm doing something wrong. Must weights be integers, or at
least greater than one?

Regards,

-- 
Ramón Casero Cañas

http://www.robots.ox.ac.uk/~rcasero/wiki
http://www.robots.ox.ac.uk/~rcasero/blog


