[R] logistic regression model with non-integer weights

Michael Dewey info at aghmed.fsnet.co.uk
Wed Apr 12 19:35:21 CEST 2006

At 17:12 09/04/06, Ramón Casero Cañas wrote:

I have not seen a reply to this so far, so apologies if I missed something.

>When fitting a logistic regression model using weights I get the
>following warning
> > data.model.w <- glm(ABN ~ TR, family=binomial(logit), weights=WEIGHT)
>Warning message:
>non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
>Details follow
>I have a binary dependent variable of abnormality
>ABN = T, F, T, T, F, F, F...
>and a continuous predictor
>TR = 1.962752 1.871123 1.893543 1.685001 2.121500, ...
>As the number of abnormal cases (ABN==T) is only 14%, and there is large
>overlapping between abnormal and normal cases, the logistic regression
>found by glm is always much closer to the normal cases than for the
>abnormal cases. In particular, the probability of abnormal is at most 0.4.
>             Estimate Std. Error z value Pr(>|z|)
>(Intercept)   0.7607     0.7196   1.057   0.2905
>TR2          -1.4853     0.4328  -3.432   0.0006 ***
>I would like to compensate for the fact that the a priori probability of
>abnormal cases is so low. I have created a weight vector

I am not sure what problem you really want to solve, but it seems that
a) abnormality is rare
b) the logistic regression predicts it to be rare.
If you want a prediction system, why not try different cut-offs (other than
0.5 on the probability scale) and perhaps plot sensitivity and specificity
to help choose a cut-off?
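
A minimal sketch of the cut-off idea, assuming simulated data in place of the real ABN and TR (the values, the seed, the grid of cut-offs and the names fit, p, cutoffs, sens, spec and res are all invented for illustration). It fits the unweighted model, then tabulates and plots sensitivity and specificity over a range of cut-offs:

```r
# Simulated stand-in for the poster's data: only the names ABN and TR come
# from the post, the values are made up.
set.seed(1)
n   <- 500
TR  <- rnorm(n, mean = 1.9, sd = 0.2)
ABN <- rbinom(n, size = 1, prob = plogis(0.76 - 1.49 * TR)) == 1

# Ordinary (unweighted) logistic regression and its fitted probabilities.
fit <- glm(ABN ~ TR, family = binomial(link = "logit"))
p   <- fitted(fit)

# Classify as abnormal when the fitted probability is at least the cut-off.
cutoffs <- seq(0.05, 0.50, by = 0.05)
sens <- sapply(cutoffs, function(k) mean(p[ABN]  >= k))  # abnormal cases flagged
spec <- sapply(cutoffs, function(k) mean(p[!ABN] <  k))  # normal cases not flagged

res <- data.frame(cutoff = cutoffs, sensitivity = sens, specificity = spec)
print(res)

# Plot both curves against the cut-off to help choose one.
matplot(cutoffs, cbind(sens, spec), type = "b", pch = 1:2, lty = 1:2, col = 1,
        xlab = "cut-off", ylab = "proportion")
legend("right", legend = c("sensitivity", "specificity"), pch = 1:2, lty = 1:2)
```

Lowering the cut-off can only raise sensitivity and lower specificity, so the choice comes down to which kind of error matters more in the application.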

> > WEIGHT[ ABN == TRUE ] <-  1 / na / 2
> > WEIGHT[ ABN == FALSE ] <-  1 / nn / 2
>so that all weights add up to 1, where ``na'' is the number of abnormal
>cases, and ``nn'' is the number of normal cases. That is, normal cases
>have less weight in the model fitting because there are so many.
>But then I get the warning message at the beginning of this email, and I
>suspect that I'm doing something wrong. Must weights be integers, or at
>least greater than one?
>Ramón Casero Cañas
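
For reference, a small sketch that reproduces the warning on simulated data (only the names ABN, TR, WEIGHT, na and nn come from the post; the values, the seed and the names total_weight, successes, non_integer and msg are invented). As far as I understand it, the binomial family treats weights * response as a number of successes and warns when that product is not close to an integer, which is what fractional weights like these produce:

```r
# Simulated stand-in data; the values are made up for illustration.
set.seed(2)
n   <- 400
TR  <- rnorm(n, mean = 1.9, sd = 0.2)
ABN <- rbinom(n, size = 1, prob = plogis(0.76 - 1.49 * TR)) == 1

# Weights built as in the post: each class contributes 1/2 in total.
na <- sum(ABN)
nn <- sum(!ABN)
WEIGHT <- numeric(n)
WEIGHT[ABN]  <- 1 / na / 2
WEIGHT[!ABN] <- 1 / nn / 2
total_weight <- sum(WEIGHT)

# weights * response is fractional for every abnormal case.
successes   <- WEIGHT * ABN
non_integer <- any(abs(successes - round(successes)) > 0.001)

# Capture the first warning raised by the weighted fit, if any.
msg <- tryCatch({
  glm(ABN ~ TR, family = binomial(logit), weights = WEIGHT)
  NA_character_
}, warning = function(w) conditionMessage(w))
print(msg)
```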

Michael Dewey
