[R] rpart

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Sep 26 10:56:53 CEST 2006


On Mon, 25 Sep 2006, henrigel at gmx.de wrote:

> Dear r-help-list:
>
> If I use the rpart method like
>
> cfit<-rpart(y~.,data=data,...),
>
> what kind of tree is stored in cfit?
> Is it right that this tree is not pruned at all, that it is the full tree?

It is an rpart object.  This contains both the tree and the instructions 
for pruning it at all values of cp: note that cp is also used in deciding 
how large a tree to grow.
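
As a minimal sketch of what that means in practice (assuming the kyphosis data bundled with rpart as a stand-in for the poster's data, and arbitrary cp values chosen only for illustration), the cp table of the fitted object can be inspected and the tree pruned back afterwards:

```r
library(rpart)

# Assumption: kyphosis (shipped with rpart) stands in for the poster's data.
set.seed(1)  # xerror and xstd come from random cross-validation folds
cfit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
              method = "class",
              control = rpart.control(cp = 0.001))  # smaller cp grows a larger tree

# The cp table describes the nested subtrees that pruning can produce.
cptab <- cfit$cptable
print(colnames(cptab))  # CP, nsplit, rel error, xerror, xstd

# Prune back to the subtree for a chosen cp (0.05 is an arbitrary example).
pruned <- prune(cfit, cp = 0.05)
```

printcp(cfit) prints the same table, and plotcp(cfit) plots xerror against cp.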

> If so, it's up to me to choose a subtree by using the printcp method.

Or the plotcp method.

> In the technical report from Atkinson and Therneau "An Introduction to 
> recursive partitioning using the rpart routines" from 2000, one can see 
> the following table on page 15:
>
>     CP     nsplit  rel error  xerror  xstd
> 1  0.105   0       1.00000    1.0000  0.108
> 2  0.056   3       0.68519    1.1852  0.111
> 3  0.028   4       0.62963    1.0556  0.109
> 4  0.574   6       0.57407    1.0556  0.109
> 5  0.100   7       0.55556    1.0556  0.109
>
> Some lines below it says "We see that the best tree has 5 terminal nodes 
> (4 splits)". Why is that, if the xerror is lowest for the tree consisting 
> only of the root?

There are *two* reports with that name: this seems to be from minitech.ps.
The choice is explained in the rest of that para (the 1-SE rule was used).
My guess is that the authors excluded the root as not being a tree, but 
only they can answer that.
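
The arithmetic behind that guess can be checked on the quoted table. The sketch below is one reading of the 1-SE rule (smallest tree whose xerror is within one xstd of the minimum xerror), not necessarily the authors' exact procedure; the helper name one_se is hypothetical, and only the nsplit, xerror and xstd columns are copied from the table:

```r
# Values copied from the quoted table (nsplit, xerror, xstd only).
tab <- data.frame(
  nsplit = c(0, 3, 4, 6, 7),
  xerror = c(1.0000, 1.1852, 1.0556, 1.0556, 1.0556),
  xstd   = c(0.108, 0.111, 0.109, 0.109, 0.109)
)

# Hypothetical helper: 1-SE rule as described in the lead-in.
# Rows are assumed to be ordered from smallest to largest tree.
one_se <- function(tab) {
  best <- which.min(tab$xerror)
  threshold <- tab$xerror[best] + tab$xstd[best]
  tab$nsplit[min(which(tab$xerror <= threshold))]
}

with_root    <- one_se(tab)                    # 0: the root itself qualifies
without_root <- one_se(tab[tab$nsplit > 0, ])  # 4: i.e. 5 terminal nodes
print(c(with_root, without_root))
```

With the root included the rule selects the root (threshold 1.000 + 0.108 = 1.108); with the root excluded it selects the 4-split tree (threshold 1.0556 + 0.109 = 1.1646), which matches the report's statement.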

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


