[R] rpart puzzle
Marc Feldesman
feldesmanm at pdx.edu
Thu Jul 12 23:04:42 CEST 2001
I amend my previous observation. After constructing a very careful
example, I find that rpart works exactly opposite to CART. In the following split:
x7 < 37 go left
x7 > 37 go right
If x7 = 37, the case appears to go right. In other words, the split appears
to be of the form:
x7 < 37
x7 >= 37,
which is precisely the opposite of the form that CART(tm) uses.
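
To make the difference concrete, here is a minimal sketch of the two tie-handling rules as described above. It is not rpart or CART code; the function names are made up for illustration, and the "inclusive" rule is simply the opposite form (x7 <= 37 / x7 > 37) that this post attributes to CART.

```python
# Hypothetical helpers illustrating the two conventions; neither is taken
# from the rpart or CART sources.

def goes_left_strict(value, cutpoint):
    """Rule observed for rpart in this post: left only if value < cutpoint."""
    return value < cutpoint


def goes_left_inclusive(value, cutpoint):
    """Rule this post attributes to CART: left if value <= cutpoint."""
    return value <= cutpoint


cutpoint = 37
for x7 in (36, 37, 38):
    strict = "left" if goes_left_strict(x7, cutpoint) else "right"
    inclusive = "left" if goes_left_inclusive(x7, cutpoint) else "right"
    print(f"x7={x7}: strict rule -> {strict}, inclusive rule -> {inclusive}")
```

The two rules agree everywhere except for a case sitting exactly on the cutpoint, which is the only situation where the choice of convention can change the classification.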
Again, I'm not sure what practical difference this makes, except that when a
case's value on the primary splitter falls in an (apparently) excluded part
of the domain, the case goes with the "no" answer to the question. (This is,
of course, obvious if typical 'short-circuit' evaluation is used: because the
value fails the first test (x7 < 37), it must go with the
alternative.) In CART, the case goes with the "yes" answer. I don't know
what tree does, since I don't use it.
In my test example, rpart's behavior results in a misclassification. Had
the test result gone the other way, the case would have been classified
correctly. Walking the tree demonstrates this quite easily. Also,
changing the value from 37 to 36.9999 produces the correct
classification. (Now I *do* realize that I'm working with floating point
numbers, and so "real" 37 may not truly equal "integer" 37, which may
account for *this* anomaly.)
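
The floating-point caveat can be sketched as follows. This is only an illustration, not a diagnosis of the actual data: it assumes IEEE double precision and a strict "value < cutpoint goes left" rule, and the helper name `side` is made up.

```python
# A double that prints as 37 at ordinary precision can still be strictly
# below 37, so a strict "<" test sends it left while an exact 37 goes right.

cutpoint = 37.0

# Doubles between 32 and 64 are spaced 2**-47 apart, so this is the
# largest double that is still strictly less than 37.
just_below = 37.0 - 2.0 ** -47


def side(value, cutpoint):
    """Hypothetical strict split: left if value < cutpoint, else right."""
    return "left" if value < cutpoint else "right"


for value in (36.9999, just_below, 37.0):
    # The first column shows the value at 6 significant digits, the
    # second shows which branch the strict rule picks.
    print(format(value, ".6g"), side(value, cutpoint))
```

If the stored value were a hair under 37 in this way, it would go left even though it displays as 37. The reverse cannot happen under a strict rule: a value that is exactly 37 always goes right.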
Did I have the misfortune to pull an "unknown" with a major primary
splitter occupying an ambiguous part of the domain, or is this a more
significant problem?