[R] logistic regression model + Cross-Validation
Weiwei Shi
helprhelp at gmail.com
Tue Jan 23 02:30:31 CET 2007
Why not use lda {MASS}? It has a CV=TRUE option, though that only does
leave-one-out ("loo") cross-validation.
Or use randomForest.
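For the lda route, here is a minimal sketch of what CV=TRUE gives back. The data are an assumption for illustration only: two species from the built-in iris data, recoded into a 0/1 class named cy to mirror the question. With CV=TRUE, lda returns the leave-one-out class and posterior probability for every record instead of a fitted model object.

```r
library(MASS)  # recommended package, ships with standard R installs

# illustrative data only: two iris species recoded as a 0/1 class
d <- subset(iris, Species != "setosa")
d$cy <- as.integer(d$Species == "virginica")

# CV = TRUE requests leave-one-out cross-validation
fit <- lda(cy ~ Sepal.Length + Sepal.Width, data = d, CV = TRUE)

# one row per record, one column of posterior probability per class
head(fit$posterior)

# leave-one-out predicted class against the observed class
table(observed = d$cy, predicted = fit$class)
```
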
If you have to use lrm, then the following code might help:
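n.fold <- 5     # 5-fold cv
n.sample <- 50  # assumed 50 samples; use nrow(YOURWHOLEDATAFRAME) in practice
# assign every record to one fold; shuffling a repeated 1..n.fold vector
# keeps the folds equal in size (sampling with replacement does not)
s <- sample(rep(1:n.fold, length.out=n.sample))
for (i in 1:n.fold){
  # create your training data and validation data for each fold
  trn <- YOURWHOLEDATAFRAME[s!=i,]
  val <- YOURWHOLEDATAFRAME[s==i,]
  # now do your own modeling using lrm:
  # todo: fit on trn, then predict on val
}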
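To make the "todo" concrete, here is a rough, self-contained sketch that collects the out-of-fold predicted probability for each record. Several things in it are assumptions rather than part of the original code: the data are simulated, base R's glm with family = binomial stands in for lrm so that no extra package is needed, and cv.pred is a hypothetical helper name. It also repeats the 5-fold split 20 times and averages, following the advice in the quoted thread below to repeat k-fold cross-validation.

```r
set.seed(1)

# simulated data standing in for YOURWHOLEDATAFRAME (illustration only)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$cy <- rbinom(n, 1, plogis(0.5 * dat$x1 - dat$x2))

# hypothetical helper: one k-fold pass, returning for each record the
# probability predicted by the model that did not see that record
cv.pred <- function(dat, n.fold = 5) {
  s <- sample(rep(1:n.fold, length.out = nrow(dat)))  # balanced folds
  pred <- rep(NA_real_, nrow(dat))
  for (i in 1:n.fold) {
    fit <- glm(cy ~ x1 + x2, family = binomial, data = dat[s != i, ])
    pred[s == i] <- predict(fit, newdata = dat[s == i, ], type = "response")
  }
  pred
}

pred.once <- cv.pred(dat)                # a single 5-fold pass
pred.rep  <- replicate(20, cv.pred(dat)) # n x 20 matrix, one column per repeat
pred.avg  <- rowMeans(pred.rep)          # per-record average over repeats

head(data.frame(cy = dat$cy, pred.once, pred.avg))
```

With Design loaded, the glm call could presumably be swapped for an lrm fit on the training rows, with the prediction step adapted to that package's predict method.
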
HTH,
weiwei
On 1/21/07, nitin jindal <nitin.jindal at gmail.com> wrote:
> If validate.lrm does not have this option, does any other function have it?
> I will certainly look into your advice on cross-validation. Thanks.
>
> nitin
>
> On 1/21/07, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
> >
> > nitin jindal wrote:
> > > Hi,
> > >
> > > I am trying to cross-validate a logistic regression model.
> > > I am using the lrm (logistic regression model) function from the
> > > Design package.
> > >
> > > f <- lrm( cy ~ x1 + x2, x=TRUE, y=TRUE)
> > > val <- validate.lrm(f, method="cross", B=5)
> >
> > val <- validate(f, ...) # .lrm not needed
> >
> > >
> > > My class cy has values 0 and 1.
> > >
> > > The "val" variable will give me indices like slope and AUC. But I also
> > > need the vector of predicted values of the class variable "cy" for each
> > > record during cross-validation, so that I can look at the results
> > > manually. So, is there any way to get the probabilities assigned to
> > > each class?
> > >
> > > regards,
> > > Nitin
> >
> > No, validate.lrm does not have that option. Manually looking at the
> > results will not be easy when you do enough cross-validations. A single
> > 5-fold cross-validation does not provide accurate estimates. Either use
> > the bootstrap or repeat k-fold cross-validation between 20 and 50 times.
> > k is often 10 but the optimum value may not be 10. Code for averaging
> > repeated cross-validations is in
> > http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
> > along with simulations of bootstrap vs. a few cross-validation methods
> > for binary logistic models.
> >
> > Frank
> > --
> > Frank E Harrell Jr Professor and Chair School of Medicine
> > Department of Biostatistics Vanderbilt University
> >
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
"Did you always know?"
"No, I did not. But I believed..."
---Matrix III