[R-SIG-Finance] Random Forest Classifiers

Jeffrey Ryan jeffrey.ryan at lemnica.com
Sun Nov 27 04:02:07 CET 2011


This isn't related to finance.  Part of the reason for separate lists
is to keep the noise to a minimum, as well as to direct questions to
where answers may best be found.

Thanks,
Jeff

On Sat, Nov 26, 2011 at 8:42 PM, Chris Waggoner <chris.is.fun at gmail.com> wrote:
> Momop, I think that would undermine the robustness of RF. As I understand
> it, RF averages the predictions of its trees, and each leaf prediction is
> itself an average. Pruning variables like you're talking about would risk
> overfitting to your particular dataset rather than the data-generating
> process.
>
> On Sat, Nov 26, 2011 at 6:52 PM, Momop Momop <momop540 at yahoo.com> wrote:
>
>>
>> Apologies, the mail got sent before completion. Here's the full text:
>>
>> I am learning Random Forest and have a basic training question. For my
>> problem, I "derived" various classifiers (var0, var1, ..., var9). They are
>> independent, but the intrinsic values from which they are derived overlap.
>> I get the following data for my RF model. The question I have is: should I
>> eliminate the classifiers that haven't shown enough importance? (For
>> example, I could scale %IncMSE relatively and maybe just pick the top
>> 3 or 4.)
>>
>> -------------------------------
>>         %IncMSE     IncNodePurity
>> var0    10.84632    7.232559
>> var1    24.53021    7.976509
>> var2    26.5005     4.653162
>> var3    60.18863    21.882258
>> var4    11.97568    7.25413
>> var5    49.63468    16.968472
>> var6    19.55981    10.009517
>> var7    10.36669    13.136694
>> var8    14.16585    7.818673
>> var9    9.75812     7.178831
>> -------------------------------
>>
>> Essentially, what I am attempting to do is choose the best derived
>> classifiers by eliminating those from the above list that don't show a
>> noticeable relative impact on MSE. Any guidance or pointers would be much
>> appreciated. Thanks!
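
The relative scaling described in the question above can be sketched as
follows. This is only a minimal illustration, written in Python to keep it
dependency-free (the thread itself concerns R), with the %IncMSE values
copied from the table. The names inc_mse, scale_relative and top_k are
hypothetical, and dividing by the largest value is just one assumed reading
of "scale relatively". The sketch only ranks the variables; it does not
address whether dropping the rest is a good idea, which is the concern
raised in the reply.

```python
# %IncMSE values copied from the importance table in the question.
inc_mse = {
    "var0": 10.84632, "var1": 24.53021, "var2": 26.5005,
    "var3": 60.18863, "var4": 11.97568, "var5": 49.63468,
    "var6": 19.55981, "var7": 10.36669, "var8": 14.16585,
    "var9": 9.75812,
}


def scale_relative(scores):
    """Express each score as a fraction of the largest score.

    Assumption: this is what "scale relatively" means in the question.
    """
    largest = max(scores.values())
    return {name: value / largest for name, value in scores.items()}


def top_k(scores, k):
    """Return the k variable names with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


relative = scale_relative(inc_mse)
selected = top_k(relative, 4)

print(selected)  # ['var3', 'var5', 'var2', 'var1']
for name in selected:
    print(f"{name}: {relative[name]:.3f}")
```

With these numbers, var3 and var5 stand out, while var2 and var1 each score
less than half of var3, so a "top 3 or 4" cut-off is somewhat arbitrary.
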
>>
>>
>> ________________________________
>>
>> To: "r-sig-finance at r-project.org" <r-sig-finance at r-project.org>
>> Sent: Saturday, November 26, 2011 5:45 PM
>> Subject: [R-SIG-Finance] Random Forest Classifiers
>>
>> _______________________________________________
>> R-SIG-Finance at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R questions
>> should go.



-- 
Jeffrey Ryan
jeffrey.ryan at lemnica.com

www.lemnica.com
www.esotericR.com


