[R] validation logistic regression

Frank E Harrell Jr f.harrell at Vanderbilt.Edu
Wed May 26 14:23:31 CEST 2010


On 05/26/2010 07:00 AM, Joris Meys wrote:
> Hi,
>
> First of all, you shouldn't back-transform your predictions by hand; use the
> option type="response" instead:
>
> salichpred<-predict(salic.lr, newdata=profilevalidation,type="response")
>
> limit<- 0.5
> salichpredcat<- ifelse(salichpred<limit,0,1) # prediction of categories.
>
> Read up on sensitivity, specificity and ROC curves. By varying the limit,
> you can calculate sensitivity and specificity at each cutoff and construct a
> ROC curve that tells you how good your predictions are. It all depends on
> how much error you allow in the predictions.
>
> Cheers
> Joris
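
For reference, the cutoff sweep Joris describes can be sketched in base R
as below. The thread's data are not available, so the data frame, the
variable name salic and the coefficients used to simulate it are all
made-up stand-ins; only the 100/23 split and the predictor name
wetnessindex follow the original post. Treat it as an illustration of the
idea, not a recommended validation procedure.

```r
# Simulated stand-in for the poster's data (assumption: the real data are not shown)
set.seed(1)
n <- 123
wetnessindex <- rnorm(n)
salic <- rbinom(n, 1, plogis(-1 + 1.2 * wetnessindex))
d <- data.frame(salic, wetnessindex)

# 100 profiles for fitting, 23 held out, mirroring the split in the question
train <- d[1:100, ]
valid <- d[101:123, ]

fit <- glm(salic ~ wetnessindex, data = train, family = binomial("logit"))

# type = "response" returns predicted probabilities directly,
# i.e. plogis() applied to the linear predictor
p <- predict(fit, newdata = valid, type = "response")

# Sensitivity and specificity when predictions >= limit are classified as 1
sens_spec <- function(p, y, limit) {
  pred <- as.integer(p >= limit)
  c(limit = limit,
    sensitivity = sum(pred == 1 & y == 1) / sum(y == 1),
    specificity = sum(pred == 0 & y == 0) / sum(y == 0))
}

# Sweep the cutoff; each row is one point on the ROC curve
roc <- t(sapply(seq(0, 1, by = 0.1), sens_spec, p = p, y = valid$salic))
print(round(roc, 2))
```

With only 23 held-out profiles, each row of this table rests on a handful
of observations, so the estimates move a lot from one random split to
another.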

If you want to use split-sample validation, your validation sample is 
perhaps 100 times too small.

There are more direct ways to validate predictions than using 
sensitivity, specificity, and ROC, for example smooth calibration curves 
and various indexes of predictive accuracy.  These are implemented in 
the rms package.  See the validate.lrm and calibrate.lrm functions.
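
A minimal sketch of how those two functions are typically called follows.
The data are simulated stand-ins (the name salic and the simulation
coefficients are assumptions), and fitting to all 123 profiles with
bootstrap resampling instead of a 100/23 split, as well as B = 200, are
choices made for this sketch only.

```r
library(rms)

# Simulated stand-in for the poster's data (assumption: the real data are not shown)
set.seed(1)
n <- 123
wetnessindex <- rnorm(n)
salic <- rbinom(n, 1, plogis(-1 + 1.2 * wetnessindex))
d <- data.frame(salic, wetnessindex)

# x = TRUE, y = TRUE keep the design matrix and response in the fit,
# which validate() and calibrate() need in order to resample
fit <- lrm(salic ~ wetnessindex, data = d, x = TRUE, y = TRUE)

# Bootstrap validation: apparent, optimism and corrected values of
# Dxy, R2, calibration slope, Brier score (row "B") and other indexes
val <- validate(fit, method = "boot", B = 200)
print(val)

# Bootstrap overfitting-corrected smooth calibration curve
cal <- calibrate(fit, method = "boot", B = 200)
plot(cal)
```

The index.corrected column of the validate output holds the
optimism-corrected indexes, and the plot compares predicted
probabilities with the smoothed observed proportions.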

Frank

>
>
> On Wed, May 26, 2010 at 10:04 AM, azam jaafari <azamjaafari at yahoo.com> wrote:
>
>> Hi
>>
>> I validated the predictions from a logistic regression as follows:
>>
>> validationsize<- 23
>> set.seed(1)
>> random<-runif(123)
>> order(random)
>> nrprofilesinsample<-sort(order(random)[1:100])
>> profilesample<- data[nrprofilesinsample,]
>> profilevalidation<- data[-nrprofilesinsample,]
>> salich<-profilesample$SALIC.H.1
>> salic.lr<-glm(salich~wetnessindex, profilesample,
>> family=binomial('logit'))
>> summary(salic.lr)
>> salichpred<-predict(salic.lr, newdata=profilevalidation)
>> expsalichpred<-exp(salichpred)
>> salichprediction<-(expsalichpred/(1+expsalichpred))
>>
>> So,
>>   table(salichprediction, profilevalidation$SALIC.H.1)
>>
>> in result:
>> salichprediction      0 1
>>    0.0408806327422231 1 0
>>    0.094509645033899  1 0
>>    0.118665480273383  1 0
>>    0.129685441514168  1 0
>>    0.13545295569511   1 0
>>    0.137580612201769  1 0
>>    0.197265822234215  1 0
>>    0.199278585548248  0 1
>>    0.202436276322278  1 0
>>    0.211278767985746  1 0
>>    0.261036846823867  1 0
>>    0.283792703256058  1 0
>>    0.362229486187581  0 1
>>    0.362795636267779  1 0
>>    0.409067386115694  1 0
>>    0.410860613509484  0 1
>>    0.423960962956254  1 0
>>    0.428164288793652  1 0
>>    0.448509687866763  0 1
>>    0.538401659478058  0 1
>>    0.557282539294224  1 0
>>    0.603881788227797  0 1
>>    0.63633478460736   0 1
>>
>> So I have predicted probabilities (salichprediction) between 0 and 1 and a
>> binary observed variable (0 or 1). I want to compare the two and find out
>> whether this logistic regression model is adequate for prediction.
>>
>> Can you please help me?
>>
>> Thanks a lot
>>
>> Azam


-- 
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                      Department of Biostatistics   Vanderbilt University


