[R] validation logistic regression
Frank E Harrell Jr
f.harrell at Vanderbilt.Edu
Wed May 26 14:23:31 CEST 2010
On 05/26/2010 07:00 AM, Joris Meys wrote:
> Hi,
>
> First of all, you don't need to back-transform your prediction by hand; use
> the option type="response" instead:
>
> salichpred<-predict(salic.lr, newdata=profilevalidation,type="response")
>
> limit<- 0.5
> salichpredcat<- ifelse(salichpred<limit,0,1) # prediction of categories.
>
> Read up on sensitivity, specificity and ROC curves. By changing the limit,
> you can calculate sensitivity and specificity at each cutoff, and from those
> you can construct a ROC curve that tells you how good your predictions are.
> It all depends on how much error you allow in the predictions.
>
> Cheers
> Joris
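A minimal base-R sketch of the cutoff idea quoted above, run on simulated data. The data, the 400/100 split, and the helper sens_spec() are assumptions made for illustration; they are not the original poster's data or code.

```r
# Simulated stand-in data; the variable names only mimic those in the thread.
set.seed(1)
n <- 500
wetnessindex <- rnorm(n)
salic <- rbinom(n, 1, plogis(-1 + 1.2 * wetnessindex))
dat <- data.frame(salic, wetnessindex)

train <- dat[1:400, ]
valid <- dat[401:500, ]

fit <- glm(salic ~ wetnessindex, data = train, family = binomial("logit"))

# type = "response" returns probabilities directly, so no manual
# exp(x) / (1 + exp(x)) back-transformation is needed.
pred <- predict(fit, newdata = valid, type = "response")

# Hypothetical helper: sensitivity and specificity at a single cutoff.
sens_spec <- function(pred, obs, limit) {
  predcat <- ifelse(pred < limit, 0, 1)
  c(limit = limit,
    sensitivity = sum(predcat == 1 & obs == 1) / sum(obs == 1),
    specificity = sum(predcat == 0 & obs == 0) / sum(obs == 0))
}

# Sweep the cutoff; each row is one point on the ROC curve
# (sensitivity against 1 - specificity).
limits <- seq(0.1, 0.9, by = 0.1)
roc <- t(sapply(limits, function(l) sens_spec(pred, valid$salic, l)))
print(round(roc, 2))
```

Raising the limit can only lower sensitivity and raise specificity, which is the trade-off a ROC curve displays.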
If you want to use split-sample validation, your validation sample is
perhaps 100 times too small.
There are more direct ways to validate predictions than using
sensitivity, specificity, and ROC, for example smooth calibration curves
and various indexes of predictive accuracy. These are implemented in
the rms package. See the validate.lrm and calibrate.lrm functions.
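
A rough sketch of how those two functions might be called, assuming the rms package is installed. The data are simulated, and B = 200 is an arbitrary choice for illustration. Note that validate() and calibrate() resample with the bootstrap by default, so this is not split-sample validation.

```r
library(rms)

# Simulated stand-in data; not the original poster's data.
set.seed(1)
n <- 500
wetnessindex <- rnorm(n)
salic <- rbinom(n, 1, plogis(-1 + 1.2 * wetnessindex))
dat <- data.frame(salic, wetnessindex)

# x = TRUE, y = TRUE store the design matrix and response in the fit,
# which validate() and calibrate() need in order to resample.
fit <- lrm(salic ~ wetnessindex, data = dat, x = TRUE, y = TRUE)

# Optimism-corrected indexes of predictive accuracy (Dxy, R2, slope, ...).
val <- validate(fit, B = 200)
print(val)

# Overfitting-corrected smooth calibration curve.
cal <- calibrate(fit, B = 200)
plot(cal)
```

The index.corrected column of the validate() output gives the optimism-corrected estimates.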
Frank
>
>
> On Wed, May 26, 2010 at 10:04 AM, azam jaafari <azamjaafari at yahoo.com> wrote:
>
>> Hi
>>
>> I validated the predictions from a logistic regression as follows:
>>
>> validationsize<- 23
>> set.seed(1)
>> random<-runif(123)
>> order(random)
>> nrprofilesinsample<-sort(order(random)[1:100])
>> profilesample<- data[nrprofilesinsample,]
>> profilevalidation<- data[-nrprofilesinsample,]
>> salich<-profilesample$SALIC.H.1
>> salic.lr<-glm(salich~wetnessindex, profilesample,
>> family=binomial('logit'))
>> summary(salic.lr)
>> salichpred<-predict(salic.lr, newdata=profilevalidation)
>> expsalichpred<-exp(salichpred)
>> salichprediction<-(expsalichpred/(1+expsalichpred))
>>
>> So,
>> table(salichprediction, profilevalidation$SALIC.H.1)
>>
>> The result:
>> salichprediction 0 1
>> 0.0408806327422231 1 0
>> 0.094509645033899 1 0
>> 0.118665480273383 1 0
>> 0.129685441514168 1 0
>> 0.13545295569511 1 0
>> 0.137580612201769 1 0
>> 0.197265822234215 1 0
>> 0.199278585548248 0 1
>> 0.202436276322278 1 0
>> 0.211278767985746 1 0
>> 0.261036846823867 1 0
>> 0.283792703256058 1 0
>> 0.362229486187581 0 1
>> 0.362795636267779 1 0
>> 0.409067386115694 1 0
>> 0.410860613509484 0 1
>> 0.423960962956254 1 0
>> 0.428164288793652 1 0
>> 0.448509687866763 0 1
>> 0.538401659478058 0 1
>> 0.557282539294224 1 0
>> 0.603881788227797 0 1
>> 0.63633478460736 0 1
>>
>> So I have predicted probabilities (salichprediction) between 0 and 1 and a
>> binary observed variable that is 0 or 1. I want to compare the two and find
>> out whether this logistic regression model is adequate for prediction.
>>
>> Please help me.
>>
>> Thanks a lot
>>
>> Azam
>
>
--
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University