[R-SIG-Finance] logistic regression: weights and unbalanced samples

Andre Guimaraes alsguimaraes at gmail.com
Wed Apr 27 00:17:39 CEST 2011


Greetings from Rio de Janeiro, Brazil.

I am looking for advice / references on binary logistic regression
with weighted least squares (using lrm & weights), in the following
context:

1) unbalanced sample (n0=10000, n1=700);
2) sampling weights used to rebalance the sample (w0=1, w1=14.29); and
3) after modelling, adjust the intercept in order to reflect the
expected % of 1’s in the population (e.g., circa 7%, as opposed to
50%).
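
For concreteness, the arithmetic behind points 2) and 3) can be
sketched as below. This is only an illustration in Python (standard
library only), not the lrm call itself: the function names and the
fitted intercept b0_fit = 0.0 are hypothetical, and the intercept shift
is the usual log-odds offset between the weighted sample rate and the
population rate, which is assumed here to be the adjustment meant in 3).

```python
import math


def balancing_weight(n0: int, n1: int) -> float:
    """Weight for class-1 rows (class-0 weight fixed at 1) that equalises
    the weighted totals of the two classes."""
    return n0 / n1


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def adjust_intercept(b0: float, sample_rate: float, population_rate: float) -> float:
    """Shift a fitted intercept from the (weighted) sample event rate
    to the event rate expected in the population."""
    return b0 - logit(sample_rate) + logit(population_rate)


def probability(b0: float, xb: float) -> float:
    """Logistic probability for intercept b0 and linear predictor xb."""
    return 1.0 / (1.0 + math.exp(-(b0 + xb)))


# Figures from the post: 10000 zeros, 700 ones, about 7% ones in the population.
n0, n1 = 10000, 700
w1 = balancing_weight(n0, n1)               # roughly 14.29
weighted_rate = w1 * n1 / (n0 + w1 * n1)    # share of 1's after weighting

b0_fit = 0.0  # hypothetical intercept from the weighted fit, for illustration only
b0_adj = adjust_intercept(b0_fit, weighted_rate, 0.07)

print(f"w1 = {w1:.2f}")
print(f"weighted share of 1's = {weighted_rate:.3f}")
print(f"adjusted intercept = {b0_adj:.4f}")
print(f"P(y=1 | xb=0) after adjustment = {probability(b0_adj, 0.0):.3f}")
```

With w1 = 10000/700 the weighted sample is 50% ones, so the shift
reduces to log(0.07/0.93), and a case with a zero linear predictor is
mapped back to a probability of about 7%.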

I have identified references that deal with the last point, but no
conclusive article or book dealing with this specific use of weights
in unbalanced samples.

The area under the ROC is about 0.70, and the estimated probabilities
are close to the frequencies of 1’s in different ranges, which looks
satisfactory. Hosmer & Lemeshow’s test is not significant, as
expected.

Can someone comment on the adopted strategy, or suggest some specific
bibliography that might address the issue of weights and unbalanced
samples in logistic regression?

Thanks in advance,

André Guimarães


