[R] validation logistic regression
Frank E Harrell Jr
f.harrell at Vanderbilt.Edu
Wed May 26 15:24:55 CEST 2010
Better would be 100 repeats of 10-fold cross-validation, or
bootstrapping, as implemented in the rms package.
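As a rough illustration of the repeated 10-fold idea, here is a minimal base-R sketch. It does not use rms (there the usual route would be a fit from `lrm()` passed to `validate()`). The data frame `d`, its columns `x` and `y`, the helper `cv_brier()`, and the choice of the Brier score as the accuracy measure are all assumptions made for the example, not part of the original question.

```r
# Simulated stand-in data (hypothetical): binary outcome y, one predictor x
set.seed(1)
n <- 123
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + d$x))

# Brier score (mean squared error of the predicted probabilities)
# from one pass of k-fold cross-validation
cv_brier <- function(data, k = 10) {
  # assign every row to one of k folds at random
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  pred <- numeric(nrow(data))
  for (i in seq_len(k)) {
    test <- fold == i
    fit <- glm(y ~ x, data = data[!test, ], family = binomial)
    pred[test] <- predict(fit, newdata = data[test, ], type = "response")
  }
  mean((data$y - pred)^2)
}

# Repeat the whole procedure 100 times, each with fresh fold assignments
brier <- replicate(100, cv_brier(d))
mean(brier)  # averaged cross-validated Brier score
```

Averaging over the 100 repeats reduces the dependence of the estimate on any one random split into folds.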
Frank
On 05/26/2010 08:21 AM, azam jaafari wrote:
>
> Hi
>
> Thank you for your reply.
>
> I'm new to R, so I'm slow.
>
> If I want to do leave-one-out cross-validation with these data (100), how do I tell R to omit the observations one at a time? Is validationsize=100?
>
> Thanks a lot
>
> Azam
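On the leave-one-out question: one way to omit the observations one at a time is a loop that drops row i with a negative index, refits, and predicts the dropped row. The validation set then has size 1 in each of n rounds rather than size 100. The sketch below is only an illustration on simulated data; `d`, `x`, `y` and `loo_pred` are made-up names.

```r
# Simulated stand-in data (hypothetical): 100 rows, binary y, predictor x
set.seed(1)
n <- 100
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + d$x))

loo_pred <- numeric(n)
for (i in seq_len(n)) {
  # fit on all rows except row i
  fit <- glm(y ~ x, data = d[-i, ], family = binomial)
  # predict the probability for the single left-out row
  loo_pred[i] <- predict(fit, newdata = d[i, , drop = FALSE],
                         type = "response")
}

# proportion of left-out rows classified correctly at a 0.5 limit
mean((loo_pred >= 0.5) == d$y)
```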
>
> --- On Wed, 5/26/10, Joris Meys<jorismeys at gmail.com> wrote:
>
>
> From: Joris Meys<jorismeys at gmail.com>
> Subject: Re: [R] validation logistic regression
> To: "azam jaafari"<azamjaafari at yahoo.com>
> Cc: r-help at r-project.org
> Date: Wednesday, May 26, 2010, 5:00 AM
>
>
> Hi,
>
> First of all, you shouldn't back-transform your predictions by hand; use the option type="response" instead:
>
> salichpred<-predict(salic.lr, newdata=profilevalidation,type="response")
>
> limit<- 0.5
> salichpredcat<- ifelse(salichpred<limit,0,1) # prediction of categories.
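To make the point about type="response" concrete, the following sketch checks that it returns the same numbers as the manual exp(x)/(1+exp(x)) back-transform of the default (log-odds) predictions. The data are simulated and the names `d`, `train`, `valid`, `p_link` and `p_resp` are hypothetical.

```r
# Simulated stand-in data (hypothetical), split 100 / 23 as in the thread
set.seed(1)
d <- data.frame(x = rnorm(123))
d$y <- rbinom(123, 1, plogis(-0.5 + d$x))
train <- d[1:100, ]
valid <- d[101:123, ]

fit <- glm(y ~ x, data = train, family = binomial)

p_link <- predict(fit, newdata = valid)                     # log-odds scale
p_resp <- predict(fit, newdata = valid, type = "response")  # probabilities

# manual back-transform versus type = "response":
# the difference is at rounding-error level
max(abs(exp(p_link) / (1 + exp(p_link)) - p_resp))
max(abs(plogis(p_link) - p_resp))  # plogis() is the inverse logit

# predicted categories at a 0.5 limit, cross-tabulated with the observations
pred_cat <- ifelse(p_resp < 0.5, 0, 1)
table(predicted = pred_cat, observed = valid$y)
```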
>
> Read up on sensitivity, specificity and ROC curves. By changing the limit, you can calculate the sensitivity and specificity at each cutoff and construct a ROC curve that will tell you how good your predictions are. It all depends on how much error you allow in the predictions.
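A small sketch of that calculation, using the 23 predicted probabilities and observed values posted in the original message (rounded here to four decimals). The helper `sens_spec()` and the grid of limits are choices made for the example; the AUC is computed directly from its pairwise definition rather than with a dedicated package.

```r
# Predicted probabilities and observed classes for the 23 validation rows
# in the original post, rounded to four decimals
p <- c(0.0409, 0.0945, 0.1187, 0.1297, 0.1355, 0.1376, 0.1973, 0.1993,
       0.2024, 0.2113, 0.2610, 0.2838, 0.3622, 0.3628, 0.4091, 0.4109,
       0.4240, 0.4282, 0.4485, 0.5384, 0.5573, 0.6039, 0.6363)
obs <- c(0, 0, 0, 0, 0, 0, 0, 1,
         0, 0, 0, 0, 1, 0, 0, 1,
         0, 0, 1, 1, 0, 1, 1)

# Sensitivity and specificity when "predicted 1" means p >= limit
sens_spec <- function(limit) {
  pred <- ifelse(p < limit, 0, 1)
  c(limit = limit,
    sensitivity = mean(pred[obs == 1] == 1),
    specificity = mean(pred[obs == 0] == 0))
}

# One row per limit; plotting sensitivity against 1 - specificity
# gives the ROC curve
roc <- as.data.frame(t(sapply(seq(0, 1, by = 0.1), sens_spec)))
roc

# Area under the ROC curve: share of (event, non-event) pairs in which the
# event has the higher predicted probability (ties count one half)
cmp <- outer(p[obs == 1], p[obs == 0], "-")
auc <- mean((cmp > 0) + 0.5 * (cmp == 0))
auc
```

At a limit of 0.5 this classifies 3 of the 7 observed events and 15 of the 16 observed non-events correctly. With only 23 validation rows these figures are very imprecise.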
>
> Cheers
> Joris
>
>
>
> On Wed, May 26, 2010 at 10:04 AM, azam jaafari<azamjaafari at yahoo.com> wrote:
>
> Hi
>
> I validated the predictions from a logistic regression as follows:
>
> validationsize<- 23
> set.seed(1)
> random<-runif(123)
> order(random)
> nrprofilesinsample<-sort(order(random)[1:100])
> profilesample<- data[nrprofilesinsample,]
> profilevalidation<- data[-nrprofilesinsample,]
> salich<-profilesample$SALIC.H.1
> salic.lr<-glm(salich~wetnessindex, profilesample, family=binomial('logit'))
> summary(salic.lr)
> salichpred<-predict(salic.lr, newdata=profilevalidation)
> expsalichpred<-exp(salichpred)
> salichprediction<-(expsalichpred/(1+expsalichpred))
>
> So,
> table(salichprediction, profilevalidation$SALIC.H.1)
>
> The result:
> salichprediction      0 1
> 0.0408806327422231    1 0
> 0.094509645033899     1 0
> 0.118665480273383     1 0
> 0.129685441514168     1 0
> 0.13545295569511      1 0
> 0.137580612201769     1 0
> 0.197265822234215     1 0
> 0.199278585548248     0 1
> 0.202436276322278     1 0
> 0.211278767985746     1 0
> 0.261036846823867     1 0
> 0.283792703256058     1 0
> 0.362229486187581     0 1
> 0.362795636267779     1 0
> 0.409067386115694     1 0
> 0.410860613509484     0 1
> 0.423960962956254     1 0
> 0.428164288793652     1 0
> 0.448509687866763     0 1
> 0.538401659478058     0 1
> 0.557282539294224     1 0
> 0.603881788227797     0 1
> 0.63633478460736      0 1
>
> So I have predicted values (salichprediction) between 0 and 1, and a binary observed variable that is 0 or 1. I want to compare the two, and I want to know whether this logistic regression model is adequate for prediction.
>
> Please help me.
>
> Thanks a lot
>
> Azam
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University