[R] How to validate model?
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Tue Oct 7 22:03:27 CEST 2008
Pedro.Rodriguez at sungard.com wrote:
> Hi Frank,
>
> Thanks for your feedback! But I think we are talking about two different
> things.
>
> 1) Validation: The generalization performance of the classifier. See,
> for example, "Studies on the Validation of Internal Rating Systems" by
> BIS.
I didn't think the desire was for a classifier but instead was for a
risk predictor. If prediction is the goal, classification methods or
accuracy indexes based on classifications do not work very well.
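
To illustrate the point, here is a minimal sketch in base R on simulated data. Everything in it is an assumption made for illustration: the data, the names p_good and p_bad, and the choice of the Brier score (mean squared error of the predicted probabilities) as the comparison. It shows two sets of predicted risks that classify identically at a 0.5 cutoff, so accuracy cannot tell them apart, although one is far worse as a risk prediction.

```r
# Hypothetical illustration on simulated data (not from the thread).
set.seed(1)
n <- 5000
true_risk <- runif(n, 0.05, 0.95)   # simulated true probabilities
y <- rbinom(n, 1, true_risk)        # simulated 0/1 outcomes

p_good <- true_risk                              # well-calibrated predictions
p_bad  <- ifelse(true_risk > 0.5, 0.99, 0.01)    # same side of 0.5, overconfident

# Accuracy after dichotomizing at 0.5 versus the Brier score,
# which uses the predicted probabilities themselves.
accuracy <- function(p, y) mean((p > 0.5) == (y == 1))
brier    <- function(p, y) mean((p - y)^2)

acc <- c(good = accuracy(p_good, y), bad = accuracy(p_bad, y))
bs  <- c(good = brier(p_good, y),    bad = brier(p_bad, y))

print(acc)  # identical for both sets of predictions
print(bs)   # lower (better) for the calibrated predictions
```

Accuracy only sees which side of the cutoff each prediction falls on, so it is blind to how extreme or how well calibrated the predicted risks are.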
>
> 2) Calibration: Correct calibration of a PD rating system means that the
> calibrated PD estimates are accurate and conform to the observed default
> rates. See, for instance, "An Overview and Framework for PD Backtesting
> and Benchmarking" by Castermans et al.
I'm unclear on what you mean here. Correct calibration of a predictive
system means that the UNcalibrated estimates are accurate (i.e., they
don't need any calibration). (What is PD?)
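
As a rough sketch of what checking this on a hold-out sample could look like, the base R code below compares predicted probabilities with a loess-smoothed estimate of the observed event rate. The data are simulated and the names (make_data, dev, holdout, hd, max_gap) are hypothetical. It is a plain hold-out check, not a resampling-corrected curve, and it also reports a calibration intercept and slope on the logit scale, for which the ideal values are 0 and 1.

```r
# Simulated stand-ins for a development sample and a hold-out sample.
set.seed(42)
make_data <- function(n) {
  x <- rnorm(n)
  y <- rbinom(n, 1, plogis(-1 + 0.8 * x))
  data.frame(x = x, y = y)
}
dev     <- make_data(5000)
holdout <- make_data(5000)

# Fit on the development sample, predict risk on the hold-out sample.
fit <- glm(y ~ x, family = binomial, data = dev)
hd  <- data.frame(
  y = holdout$y,
  p = unname(predict(fit, newdata = holdout, type = "response"))
)

# Smooth calibration curve: loess of the 0/1 outcome on predicted risk,
# evaluated between the 5th and 95th percentiles of the predictions.
cal  <- loess(y ~ p, data = hd)
grid <- seq(as.numeric(quantile(hd$p, 0.05)),
            as.numeric(quantile(hd$p, 0.95)), length.out = 50)
obs  <- predict(cal, newdata = data.frame(p = grid))
max_gap <- max(abs(obs - grid))   # largest departure from the 45-degree line

# Calibration intercept and slope on the logit scale.
recal <- glm(y ~ qlogis(p), family = binomial, data = hd)
cal_coef <- unname(coef(recal))

if (interactive()) {
  plot(grid, obs, type = "l",
       xlab = "Predicted probability", ylab = "Observed proportion (loess)")
  abline(0, 1, lty = 2)           # line of perfect agreement
}
```

A curve that stays close to the dashed 45-degree line at all levels of predicted risk is what "the estimates need no calibration" means in practice.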
>
> Frank, you are referring to #1 and I am referring to #2.
>
> Nonetheless, I would never create a rating system if my model doesn't
> discriminate better than a coin toss.
For sure
Frank
>
> Regards,
>
> Pedro
>
> -----Original Message-----
> From: Frank E Harrell Jr [mailto:f.harrell at vanderbilt.edu]
> Sent: Tuesday, October 07, 2008 11:02 AM
> To: Rodriguez, Pedro
> Cc: maithili_shiva at yahoo.com; r-help at r-project.org
> Subject: Re: [R] How to validate model?
>
> Pedro.Rodriguez at sungard.com wrote:
>> Usually one validates scorecards with the ROC curve, Pietra Index, KS
>> test, etc. You may be interested in the WP 14 from BIS (www.bis.org).
>>
>> Regards,
>>
>> Pedro
>
> No, the validation should be done using an absolute reliability
> (calibration) curve. You need to verify that at all levels of predicted
> risk there is agreement with the true probability of failure. An ROC
> curve does not do that, and I doubt the others do. A
> resampling-corrected loess calibration curve is a good approach as
> implemented in the Design package's calibrate function.
>
> Frank
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Maithili Shiva
>> Sent: Tuesday, October 07, 2008 8:22 AM
>> To: r-help at r-project.org
>> Subject: [R] How to validate model?
>>
>> Hi!
>>
>> I am working on a scorecard model and have arrived at the regression
>> equation. I fitted it with logistic regression in R.
>>
>> My question is: how do I validate this model? I have a hold-out sample
>> of 5000 customers.
>>
>> Please guide me. The problem is that I have never used logistic
>> regression before, nor am I familiar with credit scoring models.
>>
>> Thanks in advance
>>
>> Maithili
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University