[R] How to estimate whether overfitting?

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon May 10 03:13:03 CEST 2010


On Sun, May 9, 2010 at 11:53 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On May 9, 2010, at 9:20 AM, bbslover wrote:
>
>>
>> 1. Is there a criterion for detecting overfitting, e.g. based on R2 and Q2
>> on the training set together with R2 on the test set? For example, in my
>> data I get R2=0.94 on the training set and R2=0.70 on the test set. Is
>> that overfitting?
>> 2. Looking at this scatter plot, can one say the model is overfitting?
>>
>> 3. My result was obtained with an SVM. There are 156 samples in the
>> training set and 52 in the test set, with 96 predictors. Can an SVM be
>> used for prediction in this case, or are there too many predictors?
>>
>
> I think you need to buy a copy of Hastie, Tibshirani, and Friedman and do
> some self-study of chapters 7 and 12.

And you don't even have to buy it before you can start studying since
the PDF is available here:
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Having a hard cover is always handy, tho ..
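
To make the R2/Q2 comparison from the original question concrete, here is
a rough sketch (in Python, standard library only) of the three numbers one
would put side by side: R2 on the training set, cross-validated Q2 on the
training set, and R2 on the held-out test set. Everything in it is an
assumption for illustration: the data are synthetic, the one-predictor
least-squares line is only a stand-in for the SVM, and the helper names
(r_squared, fit_line, q_squared) are made up here. Only the 156/52 split
sizes come from the question. There is no universal cutoff; a training R2
far above both Q2 and the test R2 is the usual warning sign.

```python
import random

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Least-squares fit of y = a + b*x (hypothetical stand-in for any model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return lambda xs: [a + b * xi for xi in xs]

def q_squared(x, y, k=5):
    """Cross-validated R2: each point is predicted by a model fit without it."""
    n = len(x)
    preds = [None] * n
    for fold in range(k):
        held_out = set(range(fold, n, k))
        keep = [i for i in range(n) if i not in held_out]
        model = fit_line([x[i] for i in keep], [y[i] for i in keep])
        for i in held_out:
            preds[i] = model([x[i]])[0]
    return r_squared(y, preds)

# Synthetic data (assumption): linear signal plus Gaussian noise.
random.seed(0)

def make_data(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in xs]
    return xs, ys

x_train, y_train = make_data(156)  # split sizes taken from the question
x_test, y_test = make_data(52)

model = fit_line(x_train, y_train)
r2_train = r_squared(y_train, model(x_train))
q2_train = q_squared(x_train, y_train)
r2_test = r_squared(y_test, model(x_test))

print("R2 (train):", round(r2_train, 3))
print("Q2 (train, 5-fold CV):", round(q2_train, 3))
print("R2 (test):", round(r2_test, 3))
```

With a simple model like this one the three values should land close
together. With 96 predictors and 156 training samples, a flexible model can
push the training R2 up while Q2 and the test R2 lag behind, which is the
pattern the 0.94 vs. 0.70 figures hint at.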
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


