[R] Difference between "tree" and "rpart"

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed May 4 18:04:02 CEST 2005

rpart does much more at C level, including pruning and cross-validation, so
it can be much faster.
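
As a rough illustration of the point about pruning and cross-validation, here is a minimal sketch using the kyphosis data that ships with rpart. The seed, the control settings, and the names fit, best_cp and pruned are only assumptions for the example, not a recommendation.

```r
library(rpart)

# Fix the seed so the cross-validation folds are reproducible.
set.seed(1)

# xval = 10 asks for 10-fold cross-validation as part of the fit itself;
# cp = 0.01 is the default complexity threshold for growing the tree.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(xval = 10, cp = 0.01))

# The cp table holds the cross-validated error (xerror) for each subtree.
printcp(fit)

# Prune back to the cp value with the smallest cross-validated error.
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
pruned <- prune(fit, cp = best_cp)
```

The cross-validation results come back with the fitted object, so choosing a subtree needs no separate cross-validation run.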

It is also user-extensible.

tree was actually written to track down bugs in the then S implementation, 
and so is much closer to the functionality in S.  It is not where I would 
have started from.  It is really only available for R to support MASS and 
PRNN (my books).

On Wed, 4 May 2005, Dr Carbon wrote:

> In the help for rpart it says, "This differs from the tree function
> mainly in its handling of surrogate variables." And it says that an
> rpart object is a superset of a tree object. Both cite Breiman et al.
> 1984. Both call external code which looks like martian poetry to me.
> I've seen posts in the archives where BDR, and other knowledgeable
> folks, have said that rpart() is to be preferred over tree().
> Is there a simple reason why? They use the same fundamental algorithm.
> Are there differences in processing time? Bells and whistles?

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
