[R] Random Forests: Question about R^2

Dimitri Liakhovitski ld7631 at gmail.com
Mon Apr 13 22:35:07 CEST 2009


Andy,
thank you very much!
One clarification question:

If MSE = sum(residuals) / n, then in the formula (1 - mse / Var(y)),
shouldn't one square mse before dividing by the variance?
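
For reference, a minimal sketch of the calculation in question, written in
Python so that it needs no packages. It assumes the usual definition of MSE
as the mean of the squared residuals (the "mean squared residuals" wording in
the quoted reply), and it assumes Var(y) is taken with divisor n. The function
name pseudo_r_squared and the toy numbers are made up for illustration. Under
those assumptions MSE is already in squared units of y, the same units as
Var(y), so it would not be squared again.

```python
def pseudo_r_squared(y, y_pred):
    """Return 1 - mse / Var(y), with mse = mean of squared residuals.

    Assumption of this sketch: Var(y) uses divisor n, not n - 1.
    """
    n = len(y)
    residuals = [obs - pred for obs, pred in zip(y, y_pred)]
    mse = sum(r ** 2 for r in residuals) / n           # squared units of y
    y_mean = sum(y) / n
    var_y = sum((obs - y_mean) ** 2 for obs in y) / n  # squared units of y
    return 1 - mse / var_y                             # unit-free ratio


# Toy data (hypothetical): observed values and predictions.
y = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

# The unsquared residuals cancel here, so their plain mean says nothing
# about fit quality; the squared residuals do not cancel.
mean_raw_residual = sum(o - p for o, p in zip(y, y_pred)) / len(y)
r2 = pseudo_r_squared(y, y_pred)
```

With these numbers the unsquared residuals average to zero, while the mean of
the squared residuals is 0.125 and Var(y) is 1.25, giving a pseudo R^2 of
about 0.9.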

Dimitri


On Mon, Apr 13, 2009 at 10:52 AM, Liaw, Andy <andy_liaw at merck.com> wrote:
> MSE is the mean of the squared residuals.  For the training data, the OOB
> estimate is used (i.e., residual = data - OOB prediction, MSE =
> sum(residuals) / n, OOB prediction is the mean of predictions from all
> trees for which the case is OOB).  It is _not_ the average OOB MSE of
> trees in the forest.
>
> I hope there's no question about how the pseudo R^2 is computed on a
> test set?  If you understand how that's done, I assume the confusion is
> only how the OOB MSE is formed.
>
> Best,
> Andy
>
> From: Dimitri Liakhovitski
>>
>> Dear Random Forests gurus,
>>
>> I have a question about R^2 provided by randomForest (for regression).
>> I have not been able to find this information.
>>
>> In the help file for randomForest under "Value" it says:
>>
>> rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y).
>>
>> Could someone please explain in somewhat more detail how exactly R^2
>> is calculated?
>> Is "mse" mean squared error for prediction?
>> Is "mse" an average of mse's for all trees run on out-of-bag
>> holdout samples?
>> In other words - is this R^2 based on out-of-bag samples?
>>
>> Thank you very much for clarification!
>>
>> --
>> Dimitri Liakhovitski
>> MarketTools, Inc.
>> Dimitri.Liakhovitski at markettools.com
>>
>
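
The distinction drawn in the quoted reply, between the MSE of the aggregated
OOB predictions and the average of per-tree OOB MSEs, can be illustrated with
a small Python sketch. The squared-residual definition of MSE, the function
names, the toy predictions, the OOB masks, and the choice to skip cases that
are never OOB are all assumptions for illustration, not the randomForest
implementation.

```python
def aggregated_oob_mse(y, tree_preds, oob_mask):
    """MSE of the aggregated OOB predictions.

    tree_preds[t][i] is tree t's prediction for case i.
    oob_mask[t][i] is True when case i is out-of-bag for tree t.
    """
    squared = []
    for i, obs in enumerate(y):
        preds = [tree_preds[t][i]
                 for t in range(len(tree_preds)) if oob_mask[t][i]]
        if not preds:
            continue  # assumption: skip cases that are never OOB
        oob_pred = sum(preds) / len(preds)  # mean over trees where i is OOB
        squared.append((obs - oob_pred) ** 2)
    return sum(squared) / len(squared)


def mean_per_tree_oob_mse(y, tree_preds, oob_mask):
    """Average of each tree's own OOB MSE (the quantity the reply rules out)."""
    per_tree = []
    for preds, mask in zip(tree_preds, oob_mask):
        errs = [(obs - p) ** 2 for obs, p, m in zip(y, preds, mask) if m]
        if errs:
            per_tree.append(sum(errs) / len(errs))
    return sum(per_tree) / len(per_tree)


# Toy forest of three trees on three cases (hypothetical numbers).
y_obs = [1.0, 2.0, 3.0]
tree_preds = [[0.0, 2.0, 3.0],
              [2.0, 2.0, 3.0],
              [1.0, 1.0, 4.0]]
oob_mask = [[True, True, False],
            [True, False, True],
            [False, True, True]]

agg = aggregated_oob_mse(y_obs, tree_preds, oob_mask)
avg = mean_per_tree_oob_mse(y_obs, tree_preds, oob_mask)
```

On this toy data the aggregated OOB MSE is 0.5 / 3 while the average per-tree
OOB MSE is 2 / 3. Averaging predictions across trees before computing the
residual cancels some of the individual trees' errors, so the two quantities
generally differ.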



-- 
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com



