[R] How to validate model?

Pedro.Rodriguez at sungard.com
Wed Oct 8 02:14:50 CEST 2008


Hi,

Yes, in my humble opinion, it doesn't make sense to use the (2-class) ROC curve for a rating system. For example, if the classifier predicts 100% for all the defaulted exposures and 0% for the good clients, then even though we have a perfect classifier, we have a bad rating system (it has only two effective grades).

However, if we use the multi-class version of Hand and Till (2001), we can test how well the model discriminates between classes or rating grades.

Hand, David J. and Robert J. Till, "A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems", Machine Learning, Vol. 45, No. 2, (November 2001), pp. 171-186.

Regards,

Pedro 
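
For concreteness, here is a minimal R sketch of the Hand and Till (2001) measure, using the pROC package (an assumed package choice; its multiclass.roc() averages the AUC over all class pairs, which is the Hand-Till generalisation). The grades and scores below are simulated purely for illustration.

## Hand-Till multiclass AUC on simulated rating data
library(pROC)

set.seed(1)
n <- 300
grade <- factor(sample(c("A", "B", "C"), n, replace = TRUE))  # rating grades
score <- as.numeric(grade) + rnorm(n)   # a score separating grades imperfectly

m <- multiclass.roc(grade, score)       # average AUC over all pairs of grades
m$auc                                   # the Hand-Till multiclass AUC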


-----Original Message-----
From: Ajay ohri [mailto:ohri2007 at gmail.com]
Sent: Tue 10/7/2008 6:46 PM
To: Frank E Harrell Jr
Cc: Rodriguez, Pedro; r-help at r-project.org
Subject: Re: [R] How to validate model?
 
On "the purpose of validating indirect measures such as ROC curves":

Biggest purpose: it is useful in a marketing/sales meeting context ;)

Also, decile-specific performance is easy to explain and to monitor, for faster execution/remodeling (a sketch follows).
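
A minimal sketch of such a decile table in R; the data and model are simulated, and the cut into deciles is one common convention (decile 1 = highest predicted risk).

## Decile-wise predicted versus actual bad rates on simulated data
set.seed(2)
n <- 5000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-2 + x))       # simulated bad (1) / good (0) flag
fit <- glm(y ~ x, family = binomial)
p <- predict(fit, type = "response")

dec <- ceiling(rank(-p, ties.method = "first") / (n / 10))  # 1 = riskiest decile

d <- data.frame(p = p, y = y, dec = dec)
aggregate(cbind(predicted = p, actual = y) ~ dec, data = d, FUN = mean)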

Regards,

Ajay


On Wed, Oct 8, 2008 at 4:01 AM, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:


	Ajay ohri wrote:
	

		This is one approach:
		
		Run the model on the hold-out sample.
		
		Check and compare ROC curves between the build and validation datasets.
		
		Check for changes in parameter estimates (variable coefficients), p-values, and signs.
		
		Check for binning (response versus deciles of individual variables).
		
		Check concordance and the KS statistic.
		Decile-wise performance of the model (predicted versus actual, and the rank ordering of deciles) helps in explaining the model to a business audience, who generally have business-specific input that may require the scoring model to be tweaked.
		
		This assumes that multicollinearity, outliers, and missing values have already been treated, and that the hold-out sample checks for overfitting. You can always rebuild the model using a different random hold-out sample.
		
		A stable model should not change much.
		
		In actual implementation, try to build real-time triggers for percentage deviations between predicted and actual (a sketch of some of these checks in R follows this message).
		
		Regards,
		
		Ajay
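
A minimal sketch of some of the checks above (hold-out ROC comparison, coefficient stability, KS statistic) on simulated data; the ROCR package is an assumed choice, and the 70/30 split is arbitrary.

## Build/hold-out validation checks on simulated data
library(ROCR)

set.seed(3)
n  <- 10000
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-2 + 0.8 * x1 - 0.5 * x2))
d  <- data.frame(y, x1, x2)

idx   <- sample(n, 0.7 * n)                 # 70% build, 30% hold-out
build <- d[idx, ]
hold  <- d[-idx, ]

fit <- glm(y ~ x1 + x2, family = binomial, data = build)
p_build <- predict(fit, build, type = "response")
p_hold  <- predict(fit, hold,  type = "response")

## AUC on build versus hold-out; a large drop suggests overfitting
auc <- function(p, y) performance(prediction(p, y), "auc")@y.values[[1]]
c(build = auc(p_build, build$y), holdout = auc(p_hold, hold$y))

## Coefficient stability: refit on the hold-out, compare signs and sizes
cbind(build = coef(fit), holdout = coef(update(fit, data = hold)))

## KS statistic: maximum separation of score distributions, bads vs goods
ks.test(p_hold[hold$y == 1], p_hold[hold$y == 0])$statistic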
		


	I wouldn't recommend that approach, but legitimate differences of opinion exist on the subject.  In particular, I fail to see the purpose of validating indirect measures such as ROC curves.
	
	Frank
	
	


		www.decisionstats.com
		
		On Wed, Oct 8, 2008 at 1:33 AM, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:


		   Pedro.Rodriguez at sungard.com wrote:
		
		       Hi Frank,
		
		       Thanks for your feedback! But I think we are talking about two
		       different
		       things.
		
		       1) Validation: The generalization performance of the classifier.
		       See,
		       for example, "Studies on the Validation of Internal Rating
		       Systems" by
		       BIS.
		
		
		   I didn't think the desire was for a classifier but instead was for a
		   risk predictor.  If prediction is the goal, classification methods
		   or accuracy indexes based on classifications do not work very well.
		
		
		
		       2) Calibration: Correct calibration of a PD rating system means
		       that the
		       calibrated PD estimates are accurate and conform to the observed
		       default
		       rates. See, for instance, An Overview and Framework for
		       PD Backtesting and Benchmarking, by Castermans et al.
		
		
		   I'm unclear on what you mean here.  Correct calibration of a
		   predictive system means that the UNcalibrated estimates are accurate
		   (i.e., they don't need any calibration).  (What is PD?)
		
		
		
		       Frank, you are referring to #1 and I am referring to #2.
		       Nonetheless, I would never create a rating system if my model
		       didn't discriminate better than a coin toss.
		
		
		   For sure
		   Frank
		
		
		
		       Regards,
		
		       Pedro
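
On the PD backtesting point quoted above, a minimal sketch of a grade-by-grade backtest in R: compare each grade's calibrated PD with its observed default rate via a binomial test. All grades, PDs, and counts below are made up for illustration.

## Per-grade PD backtest on made-up numbers
grades   <- c("A", "B", "C")
pd       <- c(0.01, 0.05, 0.20)   # calibrated PD per grade
n_obl    <- c(1000, 800, 300)     # obligors per grade
defaults <- c(14, 35, 70)         # observed defaults per grade

for (i in seq_along(grades)) {
  bt <- binom.test(defaults[i], n_obl[i], p = pd[i])
  cat(sprintf("Grade %s: PD %.2f, observed %.3f, p = %.3f\n",
              grades[i], pd[i], defaults[i] / n_obl[i], bt$p.value))
}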
		
		
		
		
		
		
		       -----Original Message-----
		       From: Frank E Harrell Jr [mailto:f.harrell at vanderbilt.edu]
		       Sent: Tuesday, October 07, 2008 11:02 AM
		       To: Rodriguez, Pedro
		
		       Cc: maithili_shiva at yahoo.com; r-help at r-project.org
		       Subject: Re: [R] How to validate model?
		
		
		       Pedro.Rodriguez at sungard.com wrote:
		
		           Usually one validates scorecards with the ROC curve, Pietra
		           index, KS test, etc. You may be interested in Working Paper
		           14 from the BIS (www.bis.org).


		           Regards,
		
		           Pedro
		
		
		       No, the validation should be done using an absolute reliability
		       (calibration) curve.  You need to verify that at all levels of
		       predicted risk there is agreement with the true probability of
		       failure.  An ROC curve does not do that, and I doubt the others
		       do.  A resampling-corrected loess calibration curve is a good
		       approach, as implemented in the Design package's calibrate
		       function.
		
		       Frank
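
A minimal sketch of the resampling-corrected calibration curve described above, using lrm() and calibrate(). The rms package (the successor to Design, with the same interface) is assumed here; the data are simulated.

## Bootstrap-corrected calibration curve for a logistic model
library(rms)

set.seed(4)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + x))

fit <- lrm(y ~ x, x = TRUE, y = TRUE)  # keep data so calibrate() can resample
cal <- calibrate(fit, B = 200)         # 200 bootstrap repetitions
plot(cal)                              # apparent vs bias-corrected curve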
		
		           -----Original Message-----
		           From: r-help-bounces at r-project.org
		           [mailto:r-help-bounces at r-project.org]
		
		           On Behalf Of Maithili Shiva
		           Sent: Tuesday, October 07, 2008 8:22 AM
		
		           To: r-help at r-project.org
		           Subject: [R] How to validate model?
		
		           Hi!
		
		           I am working on a scorecard model and have arrived at the
		           regression equation. I used logistic regression in R.
		
		           My question is: how do I validate this model? I have a
		           hold-out sample of 5000 customers.
		
		           Please guide me. The problem is that I have never used
		           logistic regression before, nor am I used to credit
		           scoring models.
		
		           Thanks in advance
		
		           Maithili
		
		
		
		
		
		
		   --
		   Frank E Harrell Jr, Professor and Chair
		   Department of Biostatistics, School of Medicine, Vanderbilt University
		
		
		
		
		
		-- 
		Regards,
		
		Ajay Ohri
		http://tinyurl.com/liajayohri
		
		
		



	--
	Frank E Harrell Jr, Professor and Chair
	Department of Biostatistics, School of Medicine, Vanderbilt University
	




-- 
Regards,

Ajay Ohri
http://tinyurl.com/liajayohri


