[R] rpart

Fri Jun 4 09:59:52 CEST 2004

Hello everyone,

I'm a newbie to R and to CART so I hope my questions don't seem too stupid.

1.)
My first question concerns the rpart() function. Which criterion does rpart use
to choose the best split: entropy impurity, Bayes error (minimum error), or the
Gini index? Is there a way to make it use the entropy impurity?
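
To make the question concrete, here is a minimal sketch of the kind of call I
mean. It uses the kyphosis example data that ships with rpart as a stand-in for
my own data (that choice is just an assumption for illustration). As far as I
can tell from the help page, the parms argument accepts a split component for
classification trees, but I would be glad to have that confirmed:

```r
library(rpart)  # the kyphosis example data ships with the rpart package

# Classification tree with the default splitting criterion
fit_default <- rpart(Kyphosis ~ Age + Number + Start,
                     data = kyphosis, method = "class")

# Same model, explicitly requesting the entropy ("information") criterion
fit_info <- rpart(Kyphosis ~ Age + Number + Start,
                  data = kyphosis, method = "class",
                  parms = list(split = "information"))

# Complexity table for the entropy-based tree
printcp(fit_info)
```

If this is right, comparing the two fitted trees should show whether the choice
of criterion changes the splits on a given data set.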

The second and third questions concern the output of the printcp() function.
2.)
What exactly are the cps here? I assumed them to be the threshold complexity
parameters as in Breiman et al., 1998, Section 3.3. Are they the same as the
threshold levels of alpha? I have read somewhere that the cps here are the
threshold alphas divided by the root node error. Is that true?
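
If what I read is correct, then multiplying each cp by the root node error
should give back the alphas. The sketch below only expresses that assumption in
code; it again uses the kyphosis example data, and the name alpha_guess is my
own, not something rpart provides:

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")

# Root node error as reported by printcp(): misclassified cases / all cases
root_err <- fit$frame$dev[1] / fit$frame$n[1]

# cp column of the complexity table
cp <- fit$cptable[, "CP"]

# ASSUMPTION under test: alpha = cp * root node error
alpha_guess <- cp * root_err

cbind(cp = cp, alpha_guess = alpha_guess)
```

Whether alpha_guess really corresponds to the alphas in Breiman et al. is
exactly what I am unsure about.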

3.)
How is rel error computed?
I am supposed to evaluate the goodness of classification of the CART method.
Do you think rel error is a good measure for that?

I'd be very thankful if anyone could give me a hand with this. It is a project
for uni and I desperately need a good mark.

Thank you very much in advance,

Mareike