[R] Question about rpart decision trees (being used to predict customer churn)
Carlos J. Gil Bellosta
cgb at datanalytics.com
Sat Aug 1 21:24:06 CEST 2009
Hello,
If you do
my.tree <- rpart(cancel ~ experience)
and then you check
my.tree$frame
you will note that the complexity parameter there is 0.
Check ?rpart.object for a description of what this output means. But
essentially, you will not be able to split that leaf unless you set a
complexity parameter below that value, that is, never.
You may need to go into the internals of the function (and the C code)
in order to understand how this parameter is calculated. It looks like
an oddity to me, and it is worth trying to understand why.
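As a rough way to look at this, here is a minimal sketch that rebuilds
the data from the quoted message, prints the relevant columns of
my.tree$frame, and counts misclassifications by hand before and after a
split on experience. The names counts, root.errors, split.errors, loss
and fit.loss are mine, not from the original post. The last part
assumes that rpart reads the loss matrix with rows as the true class and
columns as the predicted class, in factor level order (no, yes), and the
cost of 10 is an arbitrary illustrative value.

```r
library(rpart)

# Data copied from the quoted question.
experience <- as.factor(c(rep("good", 90), rep("bad", 10)))
cancel <- as.factor(c(rep("no", 85), rep("yes", 5),
                      rep("no", 5), rep("yes", 5)))

# Default fit, then a look at the per-node summary.
my.tree <- rpart(cancel ~ experience)
print(my.tree$frame[, c("var", "n", "dev", "complexity")])

# Errors at the root: everyone gets the majority class.
counts <- table(experience, cancel)
root.errors <- sum(counts) - max(colSums(counts))

# Errors after splitting on experience: each child predicts its own
# majority class (the "bad" group is a 5/5 tie, so 5 errors either way).
split.errors <- sum(rowSums(counts) - apply(counts, 1, max))

cat("errors at root:     ", root.errors, "\n")
cat("errors after split: ", split.errors, "\n")

# Assumption: rows = true class, columns = predicted class, so the 10 in
# position [2, 1] is the cost of predicting "no" for a true "yes".
loss <- matrix(c(0, 10, 1, 0), nrow = 2)
fit.loss <- rpart(cancel ~ experience, parms = list(loss = loss))
print(fit.loss)
```

If the two error counts come out equal, the split buys nothing in terms
of misclassification, which would be consistent with the tree stopping
at the root. The loss matrix in the quoted message puts its large cost
in position [1, 2] rather than [2, 1], so under the assumption above it
penalizes the opposite kind of error.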
Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
P.S.: Note that there is a bug in your submitted code that requires some
hand fixing.
On Sun, 2009-07-26 at 11:37 -0700, Robert Smith wrote:
> Hi,
>
> I am using rpart decision trees to analyze customer churn. I am finding that
> the decision trees created are not effective because they are not able to
> recognize factors that influence churn. I have created an example situation
> below. What do I need to do for rpart to build a tree with the variable
> experience? My guess is that this would happen if rpart used the loss matrix
> while creating the tree.
>
> > experience <- as.factor(c(rep("good",90), rep("bad",10)))
> > cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5),
> rep("yes",5)))
> > table(experience, cancel)
>           cancel
> experience no yes
>       bad   5   5
>       good 85   5
> > rpart(cancel ~ experience)
> n= 100
>
> node), split, n, loss, yval, (yprob)
>       * denotes terminal node
>
> 1) root 100 10 no (0.9000000 0.1000000) *
>
> I tried the following commands with no success.
> rpart(cancel ~ experience, control=rpart.control(cp=.0001))
> rpart(cancel ~ experience, parms=list(split='information'))
> rpart(cancel ~ experience, parms=list(split='information'),
> control=rpart.control(cp=.0001))
> rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,10000,0), nrow=2,
> ncol=2)))
>
> Thanks a lot for your help.
>
> Best regards,
> Robert