[R] Variance explained in regression trees?
Alexander J. Pries
apries at ufl.edu
Wed Oct 12 17:01:45 CEST 2005
I apologize for what may be novice questions, but I am new to R and
need a bit of assistance. I am using R to create regression trees to explain
how various environmental predictors influence coastal dune loss as a result
of hurricane activity.
My first question: how do I interpret the complexity plots that the rpart
package produces? What do the X and Y axes represent (e.g., X-val relative
error and cp)? My understanding is that "cp" is a complexity penalty for
having a tree with many branches when a simpler one would be just as robust.
How can I use the plotted values and error bars to decide which tree size is
"optimal"?
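
For reference, a minimal sketch of how such a complexity plot is produced and
commonly read. It assumes the car.test.frame data set that ships with rpart as
a stand-in (the dune-loss data are not shown in this post), and the formula,
seed, and control settings are illustrative choices only. The "one standard
error" rule used below is one common convention, not the only way to pick a
tree size.

```r
library(rpart)

# Illustration only: car.test.frame stands in for the dune-loss data.
set.seed(1)  # cross-validation folds are random, so fix the seed
fit <- rpart(Mileage ~ Weight + Disp. + HP + Type,
             data = car.test.frame, method = "anova",
             control = rpart.control(cp = 0.001, xval = 10))

printcp(fit)  # table with columns CP, nsplit, rel error, xerror, xstd
plotcp(fit)   # y axis: cross-validated relative error (xerror), bars = +/- 1 SE
              # x axis: cp (bottom) and size of tree (top)
              # dotted line: minimum xerror plus one standard error

# One-standard-error rule: choose the simplest tree whose cross-validated
# error is within one standard error of the smallest cross-validated error.
tab       <- fit$cptable
best      <- which.min(tab[, "xerror"])
threshold <- tab[best, "xerror"] + tab[best, "xstd"]
chosen    <- min(which(tab[, "xerror"] <= threshold))

pruned <- prune(fit, cp = tab[chosen, "CP"])
```

Rows of the cp table are ordered from the smallest tree to the largest, so the
first row that falls under the threshold is the simplest acceptable tree.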
My second question: other statistical packages that build regression trees
(I'm thinking specifically of DTREG) can produce a model summary that reports
the initial variance, the amount of variance explained by the tree, and the
unexplained variance. From this information, an estimated R-squared is
calculated that gives some indication of how well the tree "fits."
Does R produce, or have the ability to produce, information like this? I
would appreciate any specifics on how I might evaluate the fit of my
regression trees.
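
As a hedged sketch of the summary described above, the quantities can be
computed by hand from an rpart fit. The data set (car.test.frame, shipped with
rpart) and the formula are stand-ins chosen for illustration, and the variable
names sst, sse, r2, r2_cp, and r2_xval are hypothetical. For an "anova" tree,
the "rel error" column of the cp table is the residual sum of squares divided
by the root-node sum of squares, so 1 minus that value is an apparent
R-squared.

```r
library(rpart)

# Illustration only: car.test.frame stands in for the dune-loss data.
set.seed(1)  # affects only the cross-validated columns of the cp table
fit <- rpart(Mileage ~ Weight + Disp. + HP + Type,
             data = car.test.frame, method = "anova")

y   <- car.test.frame$Mileage
sst <- sum((y - mean(y))^2)       # initial (total) sum of squares
sse <- sum((y - predict(fit))^2)  # sum of squares left unexplained by the tree
r2  <- 1 - sse / sst              # apparent R-squared on the training data

# The same number read off the last row of the cp table.
last    <- nrow(fit$cptable)
r2_cp   <- 1 - fit$cptable[last, "rel error"]

# Cross-validated analogue, usually lower and a more honest measure of fit.
r2_xval <- 1 - fit$cptable[last, "xerror"]

# rpart also provides rsq.rpart(), which plots both quantities against the
# number of splits.
rsq.rpart(fit)
```

The apparent R-squared never decreases as splits are added, so the
cross-validated value is the safer one to report when comparing tree sizes.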
Thank you in advance for any helpful guidance!
Alex Pries
--------------------
Alexander Pries
Graduate Student
Wildlife Ecology and Conservation
University of Florida
P.O. Box 110430
Gainesville, FL 32605
apries at ufl.edu
http://plaza.ufl.edu/apries
(352) 246-9621