[R] logistic regression model + Cross-Validation
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Sun Jan 21 15:54:00 CET 2007
nitin jindal wrote:
> Hi,
>
> I am trying to cross-validate a logistic regression model.
> I am using logistic regression model (lrm) of package Design.
>
> f <- lrm( cy ~ x1 + x2, x=TRUE, y=TRUE)
> val <- validate.lrm(f, method="cross", B=5)
val <- validate(f, ...) # the .lrm suffix is not needed: validate() is generic and dispatches to validate.lrm
>
> My class cy has values 0 and 1.
>
> "val" variable will give me indicators like slope and AUC. But, I also need
> the vector of predicted values of class variable "cy" for each record while
> cross-validation, so that I can manually look at the results. So, is there
> any way to get those probabilities assigned to each class.
>
> regards,
> Nitin
No, validate.lrm does not have that option. Manually looking at the
results will not be easy when you do enough cross-validations. A single
5-fold cross-validation does not provide accurate estimates. Either use
the bootstrap or repeat k-fold cross-validation between 20 and 50 times.
k is often 10 but the optimum value may not be 10. Code for averaging
repeated cross-validations is in
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
along with simulations of bootstrap vs. a few cross-validation methods
for binary logistic models.
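
If the per-record predictions are still wanted, they can be collected by hand. What follows is only a minimal sketch under stated assumptions, not the code from the PDF above: it uses base R glm() in place of lrm() so that it runs without Design installed, the data are simulated, and cv_predict, d, p and p_mean are hypothetical names chosen for illustration. Each repetition assigns records to folds at random, fits on k-1 folds, and stores the predicted probability of cy = 1 for the held-out records.

```r
# Sketch: out-of-fold predicted probabilities from repeated k-fold CV.
# Assumptions: glm() stands in for lrm(); the data are simulated.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$cy <- rbinom(n, 1, plogis(0.5 * d$x1 - 0.8 * d$x2))

cv_predict <- function(data, k = 10, reps = 20) {
  n <- nrow(data)
  # One column of out-of-fold predictions per repetition
  pred <- matrix(NA_real_, nrow = n, ncol = reps)
  for (r in seq_len(reps)) {
    # Random, roughly equal-sized fold assignment for this repetition
    fold <- sample(rep(seq_len(k), length.out = n))
    for (j in seq_len(k)) {
      test <- fold == j
      # Fit on the k-1 training folds only
      fit <- glm(cy ~ x1 + x2, family = binomial, data = data[!test, ])
      # Predicted P(cy = 1) for the held-out records
      pred[test, r] <- predict(fit, newdata = data[test, ],
                               type = "response")
    }
  }
  pred
}

p <- cv_predict(d, k = 10, reps = 20)
p_mean <- rowMeans(p)  # average out-of-fold probability per record
head(cbind(cy = d$cy, p_mean))
```

Each row of p holds one record's held-out probabilities across the repetitions, so rowMeans(p) gives a single averaged value per record to compare against cy.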
Frank
--
Frank E Harrell Jr, Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University