[R] Question on cross-validation in rpart

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Nov 2 15:45:23 CET 2006


On Thu, 2 Nov 2006, Brian Sanborn wrote:

> Hi R folks,
>
> I am using R version 2.2.1 for Unix. I am exploring the rpart function,
> in particular the rpart.control parameter. I have tried using different
> values for xval (0, 1, 10, 20) leaving other parameters constant but I
> receive the same tree after each run. Is the 10-fold cross-validation
> default still running every time? I would expect the trees to change at
> least a little when I change the number of folds in the cross-validation
> but this is not the case in my tests. Any advice would be greatly
> appreciated.

Why do you expect that?

1) The tree returned is not pruned, and the cross-validation affects other
information in the rpart object, namely the xerror and xstd columns of its
cptable component.

2) The cross-validation is used in choosing the cost-complexity parameter
cp, that is, the degree of pruning to be applied, e.g. by inspecting the
output of printcp or plotcp and then calling prune.
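
Both points can be checked directly. The following is a minimal sketch, not
taken from the original post: it assumes the kyphosis data set that ships
with rpart, and the object names (fit0, fit10, same_tree, best, pruned) are
made up for illustration.

```r
library(rpart)

# Assumption: kyphosis (bundled with rpart) stands in for the poster's data.
set.seed(1)  # the cross-validation folds are assigned at random

fit0  <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               control = rpart.control(xval = 0))
fit10 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               control = rpart.control(xval = 10))

# Point 1: the unpruned tree does not depend on xval.
same_tree <- isTRUE(all.equal(fit0$frame, fit10$frame))

# What does change is the cp table: xerror and xstd are only
# present when cross-validation was actually run.
colnames(fit0$cptable)
colnames(fit10$cptable)

# Point 2: inspect the cross-validated error, then pick cp and prune.
printcp(fit10)
best   <- which.min(fit10$cptable[, "xerror"])
pruned <- prune(fit10, cp = fit10$cptable[best, "CP"])
```

Taking the cp with the smallest xerror is only one possible rule; the
1-SE rule based on the xstd column is a common alternative.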

I think you need to study the documentation about using rpart, either its 
technical reports or MASS chapter 9.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


