[R] Logistic Regression - Interpreting SENS (Sensitivity) and SPEC (Specificity)

Frank E Harrell Jr f.harrell at vanderbilt.edu
Mon Oct 13 18:26:26 CEST 2008


Dieter Menne wrote:
> Maithili Shiva <maithili_shiva <at> yahoo.com> writes:
> 
>> I have a main sample of 42500 clients and, based on their status as
>> defaulted / non-defaulted, I have generated the probability of default.
>> I have a hold-out sample of 5000 clients. I have calculated (1) the number
>> of correctly classified goods (Gg), (2) the number of correctly classified
>> bads (Bb), (3) the number of wrongly classified bads (Gb), and (4) the
>> number of wrongly classified goods (Bg).
> 
> The simple and wrong answer is to use these data directly to compute sensitivity
> (fraction of hits). This measure is useless, but I encounter it often in medical
> publications.

Exactly.  Using classification accuracy, sensitivity, or specificity means 
that you are not using the model's predicted probabilities in a 
reasonable or powerful way.  Credit scoring models need to demonstrate 
absolute calibration accuracy, i.e., predicted probabilities of default 
that agree with the default rates actually observed.
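
As a rough illustration of what checking calibration on a hold-out sample 
can look like, here is a minimal sketch in base R.  Everything in it is an 
assumption made for the example: the data are simulated, and the names x, 
train, hold, and lp are hypothetical.  Only the sample sizes (42500 and 
5000) come from the original post.

```r
## Illustrative sketch only: simulated data, not the poster's portfolio.
set.seed(1)

n_train <- 42500                      # main sample size (from the post)
n_hold  <- 5000                       # hold-out sample size (from the post)
n       <- n_train + n_hold

x       <- rnorm(n)                           # hypothetical risk predictor
default <- rbinom(n, 1, plogis(-2 + x))       # hypothetical default process
d       <- data.frame(x = x, default = default)
train   <- d[seq_len(n_train), ]
hold    <- d[-seq_len(n_train), ]

## Fit on the main sample, predict probabilities of default on the hold-out
fit <- glm(default ~ x, family = binomial, data = train)
p   <- predict(fit, newdata = hold, type = "response")

## Calibration-in-the-large: mean predicted risk vs. observed default rate
print(c(mean_predicted = mean(p), observed = mean(hold$default)))

## Calibration by tenths of predicted risk: within each group the mean
## predicted probability should be close to the observed default rate
grp   <- cut(p, breaks = quantile(p, probs = 0:10 / 10), include.lowest = TRUE)
calib <- data.frame(predicted = as.vector(tapply(p, grp, mean)),
                    observed  = as.vector(tapply(hold$default, grp, mean)),
                    n         = as.vector(table(grp)))
print(round(calib, 3))

## Calibration intercept and slope: regress the outcome on the logit of the
## predicted probability; values near 0 and 1 suggest good calibration
hold$lp <- qlogis(p)
cal_fit <- glm(default ~ lp, family = binomial, data = hold)
print(coef(cal_fit))

## Brier score: mean squared error of the predicted probabilities
brier <- mean((p - hold$default)^2)
print(brier)
```

None of these summaries requires choosing a cutoff, so the predicted 
probabilities are used as they stand rather than being reduced to a 
good/bad classification.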

Frank

> 
> You can get a more reasonable answer by using cross-validation. Check, for
> example, Frank Harrell's 
> 
> http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
> 
> Dieter
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


