[R] logistic regression model with non-integer weights
info at aghmed.fsnet.co.uk
Wed Apr 12 19:35:21 CEST 2006
At 17:12 09/04/06, Ramón Casero Cañas wrote:
I have not seen a reply to this so far; apologies if I missed something.
>When fitting a logistic regression model using weights I get the warning
> > data.model.w <- glm(ABN ~ TR, family=binomial(logit), weights=WEIGHT)
>non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
>I have a binary dependent variable of abnormality
>ABN = T, F, T, T, F, F, F...
>and a continuous predictor
>TR = 1.962752 1.871123 1.893543 1.685001 2.121500, ...
>As the number of abnormal cases (ABN==T) is only 14%, and there is a large
>overlap between abnormal and normal cases, the logistic regression
>found by glm is always much closer to the normal cases than to the
>abnormal cases. In particular, the probability of abnormal is at most 0.4.
>            Estimate Std. Error z value Pr(>|z|)
>(Intercept)   0.7607     0.7196   1.057   0.2905
>TR2          -1.4853     0.4328  -3.432   0.0006 ***
>I would like to compensate for the fact that the a priori probability of
>abnormal cases is so low. I have created a weight vector
I am not sure what problem you really want to solve, but it seems that
a) abnormality is rare
b) the logistic regression predicts it to be rare.
If you want a prediction system why not try different cut-offs (other than
0.5 on the probability scale) and perhaps plot sensitivity and specificity
to help to choose a cut-off?
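A minimal sketch of that idea, on simulated data rather than yours: the sample size, the seed and the coefficients used to generate ABN are my assumptions, chosen only so that abnormality is rare. It tabulates and plots sensitivity and specificity of the fitted probabilities over a grid of cut-offs.

```r
# Simulated stand-in data (NOT the poster's data): rare outcome, one predictor
set.seed(1)
n   <- 500
TR  <- rnorm(n, mean = 1.9, sd = 0.3)
ABN <- rbinom(n, size = 1, prob = plogis(0.76 - 1.49 * TR)) == 1

# Ordinary unweighted logistic regression
fit <- glm(ABN ~ TR, family = binomial(link = "logit"))
p   <- fitted(fit)   # estimated P(ABN == TRUE) for each case

# Sensitivity and specificity when "abnormal" is predicted for p >= cut-off
cutoffs <- seq(0.05, 0.95, by = 0.05)
sens <- sapply(cutoffs, function(k) mean(p[ABN] >= k))   # abnormal cases flagged
spec <- sapply(cutoffs, function(k) mean(p[!ABN] < k))   # normal cases not flagged

res <- data.frame(cutoff = cutoffs, sensitivity = sens, specificity = spec)
print(res)

# Plot both curves against the cut-off to help choose one
plot(cutoffs, sens, type = "b", ylim = c(0, 1),
     xlab = "cut-off", ylab = "proportion")
lines(cutoffs, spec, type = "b", lty = 2)
legend("right", legend = c("sensitivity", "specificity"), lty = 1:2)
```

With a rare outcome the useful cut-offs will usually be well below 0.5; where to put it depends on how you value missed abnormal cases against false alarms.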
> > WEIGHT <- ABN
> > WEIGHT[ ABN == TRUE ] <- 1 / na / 2
> > WEIGHT[ ABN == FALSE ] <- 1 / nn / 2
>so that all weights add up to 1, where ``na'' is the number of abnormal
>cases, and ``nn'' is the number of normal cases. That is, normal cases
>have less weight in the model fitting because there are so many.
>But then I get the warning message at the beginning of this email, and I
>suspect that I'm doing something wrong. Must weights be integers, or at
>least greater than one?
>Ramón Casero Cañas
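On the warning itself: the binomial family checks whether response times weight is a whole number, so fractional weights on the TRUE cases trigger it, but the fit still runs. A rough sketch on simulated data (the data, seed and sample size are my assumptions; the weights follow your construction). As far as I know family=quasibinomial does not make this check and gives the same coefficients, though the standard errors from either fit should be treated with caution when the weights are not counts.

```r
# Simulated stand-in data (NOT the poster's data)
set.seed(2)
n   <- 400
TR  <- rnorm(n, mean = 1.9, sd = 0.3)
ABN <- rbinom(n, size = 1, prob = plogis(0.76 - 1.49 * TR)) == 1

# Weights as in the original post: each class gets total weight 1/2
na <- sum(ABN)
nn <- sum(!ABN)
WEIGHT <- ifelse(ABN, 1 / na / 2, 1 / nn / 2)

# Fit with family = binomial, collecting any warning messages
msgs <- character(0)
fit.b <- withCallingHandlers(
  glm(ABN ~ TR, family = binomial(link = "logit"), weights = WEIGHT),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(msgs)          # the collected warning text

# Same model with quasibinomial: same point estimates
fit.q <- glm(ABN ~ TR, family = quasibinomial(link = "logit"), weights = WEIGHT)
print(cbind(binomial = coef(fit.b), quasibinomial = coef(fit.q)))
```

So the weights do not have to be integers or greater than one for glm to fit the model; the warning is telling you that they are not being read as counts of trials.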