[R] Variance explained in regression trees?

Alexander J. Pries apries at ufl.edu
Wed Oct 12 17:01:45 CEST 2005


I apologize for what may be novice questions, but I am new to R and need a
bit of assistance. I am using R to build regression trees that explain how
various environmental predictors influence coastal dune loss resulting from
hurricane activity.

My first question: how do I interpret the complexity plots that the rpart
package produces? What do the X and Y axes represent (e.g., "X-val relative
error" and "cp")? My understanding is that "cp" acts as a complexity penalty
on a tree with many branches when a simpler one would be just as robust. How
can I use the plotted values and error bars to decide which tree size is
"optimal"?
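
For reference, a minimal sketch of how such a plot is produced and read,
assuming the rpart package. The built-in mtcars data, the response mpg, the
seed, and the control settings are stand-ins chosen only for illustration;
they are not from the dune analysis. In plotcp() the Y axis is the
cross-validated error relative to the error of the root node (a tree with no
splits), the X axis is cp (with the matching tree size along the top), and
the bars extend one standard error either side. One common convention,
sometimes called the one-standard-error rule, picks the smallest tree whose
cross-validated error is within one standard error of the minimum.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random; the seed is arbitrary

# mtcars and mpg are placeholders for the real data and response
fit <- rpart(mpg ~ ., data = mtcars, method = "anova",
             control = rpart.control(cp = 0.001, minsplit = 5))

printcp(fit)  # table of CP, nsplit, rel error, xerror, xstd
plotcp(fit)   # xerror against cp, with +/- 1 xstd error bars

# One-standard-error rule (a convention, not the only possible choice)
cpt <- fit$cptable
best <- which.min(cpt[, "xerror"])                    # row with lowest CV error
threshold <- cpt[best, "xerror"] + cpt[best, "xstd"]  # minimum plus one SE
pick <- min(which(cpt[, "xerror"] <= threshold))      # smallest tree under it
pruned <- prune(fit, cp = cpt[pick, "CP"])
```

The rows of the table run from the smallest tree to the largest, so the first
row that falls under the threshold is the simplest acceptable tree.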

My second question: other statistical packages that build regression trees
(I'm thinking specifically of DTREG) can produce a model summary that reports
the initial variance, the amount of variance explained by the tree, and the
unexplained variance. From this information, an estimated R-squared is
calculated that gives some indication of how well the tree "fits."

Can R produce information like this? If anyone has specifics on how I might
evaluate the fit of my regression trees, I would be grateful.
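
To make the quantities in question concrete, here is a minimal sketch of one
way they could be computed, assuming the rpart package and an "anova"
(regression) tree. The mtcars data and the response mpg are placeholders, and
the variable names are hypothetical. It computes the total, unexplained, and
explained sums of squares on the training data and an apparent R-squared from
them. For a regression tree, the "rel error" column of the cp table is the
unexplained fraction, so one minus its last entry should give the same number.

```r
library(rpart)

# mtcars and mpg are placeholders for the real data and response
fit <- rpart(mpg ~ ., data = mtcars, method = "anova")

y <- mtcars$mpg
ss_total     <- sum((y - mean(y))^2)       # initial variation around the mean
ss_resid     <- sum((y - predict(fit))^2)  # variation left within the leaves
ss_explained <- ss_total - ss_resid        # variation accounted for by the tree
r2 <- 1 - ss_resid / ss_total              # apparent (training-data) R-squared

# The same value read from the cp table: 1 - relative error of the final tree
cpt <- fit$cptable
r2_table <- 1 - cpt[nrow(cpt), "rel error"]

# rsq.rpart(fit) plots this quantity against the number of splits
```

This is a resubstitution figure and tends to be optimistic; replacing
"rel error" with "xerror" gives a cross-validated analogue.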

Thank you in advance for any helpful guidance!

Alex Pries

--------------------
Alexander Pries
Graduate Student
Wildlife Ecology and Conservation
University of Florida
P.O. Box 110430
Gainesville, FL 32605
apries at ufl.edu
http://plaza.ufl.edu/apries
(352) 246-9621



