[R] Random Forest AUC
mxkuhn
mxkuhn at gmail.com
Sat Oct 23 15:39:06 CEST 2010
I think the issue is that you really can't use the training set to judge over-fitting (at least not without resampling).
For example, k-nearest neighbors is not known for over-fitting, yet a 1-NN model will always predict the training data perfectly, because each training point is its own nearest neighbor.
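A minimal sketch of the 1-NN point, written in plain Python (standard library only) rather than R so that it runs without any packages. The data, the seed, and the helper names `predict_1nn`, `accuracy`, and `make_data` are made up for illustration, and it reports accuracy rather than AUC to keep it short. The labels are coin flips, so there is nothing to learn, yet the training-set score is still perfect:

```python
import random


def predict_1nn(train_x, train_y, x):
    # Label of the closest training point (squared Euclidean distance).
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def accuracy(train_x, train_y, xs, ys):
    # Fraction of points in (xs, ys) that the 1-NN rule labels correctly.
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def make_data(n, rng):
    # Two pure-noise predictors and a coin-flip label (hypothetical data).
    xs = [(rng.random(), rng.random()) for _ in range(n)]
    ys = [rng.randint(0, 1) for _ in range(n)]
    return xs, ys


rng = random.Random(2010)  # arbitrary seed, only for repeatability
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)

# Scoring on the training set: every point finds itself at distance zero.
train_acc = accuracy(train_x, train_y, train_x, train_y)
# Scoring on data the model has not seen.
test_acc = accuracy(train_x, train_y, test_x, test_y)

print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {test_acc:.2f}")
```

The training accuracy comes out as 1 by construction, while the held-out accuracy should land near chance level. The same logic is why a training-set AUC of 1 says little by itself; a resampled or held-out estimate is the number to look at.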
Max
On Oct 23, 2010, at 9:05 AM, "Liaw, Andy" <andy_liaw at merck.com> wrote:
> What Breiman meant is that as the model gets more complex (i.e., as the
> number of trees tends to infinity) the generalization error (test set
> error) does not increase. This does not hold for boosting, for example;
> i.e., you can't "boost forever", which necessitates finding the
> optimal number of iterations. You don't need that with RF.
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of vioravis
>> Sent: Saturday, October 23, 2010 12:15 AM
>> To: r-help at r-project.org
>> Subject: Re: [R] Random Forest AUC
>>
>>
>> Thanks Max and Andy. If the Random Forest is always giving an AUC of 1,
>> isn't it over-fitting? If not, how do you differentiate this from
>> over-fitting? I believe Random Forests are claimed to never over-fit
>> (from the following link).
>>
>> http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#features
>>
>>
>> Ravishankar R
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>