[R] Random Forest with highly imbalanced data
Kel
lamkelj at yahoo.com
Wed May 12 20:38:19 CEST 2004
Hi group,
I am trying to fit a random forest (RF) to approximately
250,000 cases. My objective is to determine the risk
factors for a person being readmitted to hospital
(response=1) or not (response=0). Only 10%, or 25,000
cases, were readmitted. I've heard about the
down-sampling and class-weight approaches and am
wondering whether R can do either. Even references to
articles would help.
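
To make the down-sampling idea concrete, here is a minimal
base R sketch of what I have in mind. The 0/1 vector
`response` is simulated with the same class counts as my
data, and the names `pos`, `neg`, `neg_down` and `balanced`
are made up for illustration. It only shows the resampling
step, not how any RF package would handle it internally.

```r
# Hypothetical stand-in for the real data: 250,000 cases,
# of which 25,000 (10%) have response = 1.
set.seed(1)
response <- rep(c(1, 0), times = c(25000, 225000))

# Row indices of the minority and majority classes.
pos <- which(response == 1)
neg <- which(response == 0)

# Down-sample the majority class, without replacement,
# to the size of the minority class.
neg_down <- sample(neg, length(pos))

# Indices of the balanced subset that a model would be fitted on.
balanced <- c(pos, neg_down)

# Class counts in the balanced subset.
print(table(response[balanced]))
```

The balanced subset keeps every readmitted case and an
equal-sized random draw of the others, so most of the
non-readmitted cases go unused in any single draw.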
From a statistical point of view, is there a rule of
thumb for the positive/negative response ratio beyond
which such an adjustment should be applied?
Thank you so much.
Regards,
Kelvin