[R] partykit ctree: minbucket and case weights
Henric Winell
nilsson.henric at gmail.com
Fri May 30 09:39:45 CEST 2014
Amber Dawn Nolder wrote 2014-05-28 23:16:
>
> Hello,
> I am an R novice, and I am using the "partykit" package to create
> regression trees. I used the following to generate the trees:
> ctree(y~x1+x2+x3+x4,data=my_data,control=ctree_control(testtype =
> "Bonferroni", mincriterion = 0.90, minsplit = 12, minbucket = 4,
> majority = TRUE))
> I thought that "minbucket" set the minimum value for the sum of weights
> in each terminal node, and that each case weight is 1, unless otherwise
> specified. In which case, the sum of case weights in a node should equal the
> number of cases (n) in that node. However, I sometimes obtain a tree with
> a terminal node that contains fewer than 4 cases.
I do agree that the tree below looks suspicious. You may have found a
bug.
But you didn't provide "commented, minimal, self-contained, reproducible
code", i.e., we're missing your 'my_data' object, and therefore we
cannot reproduce this easily. Can you please provide us with the output
from 'dput(my_data)'?
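Something along the following lines would give us what we need. This is only a sketch: the small data frame is a made-up stand-in for your real 'my_data' object, and its values are invented purely to show the mechanics.

```r
## Hypothetical stand-in for the real data (values are made up)
my_data <- data.frame(
  y  = c(0.9, 0.5, 0.3, 0.0),
  x1 = c(1.2, NA, 3.4, 2.2),
  x2 = c(40, 45, NA, 50)
)

## Plain-text representation of the object; paste this output into the e-mail
dput(my_data)

## R version, platform, and versions of the attached packages
sessionInfo()
```

Anyone on the list can then recreate the object by assigning the pasted 'dput()' output to a name, without needing the original file.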
> My data set has a total of 36 cases. The dependent and all independent
> variables are continuous data. Variables x1 and x2 contain missing (NA)
> values.
I tried a few other data sets and there the results seem to come out OK
(even after inducing NAs).
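A check of this kind can be sketched as follows. The built-in 'airquality' data and the choice of predictors are assumptions made for illustration; they are not the poster's data and not necessarily the data sets tried above. The control settings are copied from the call in the original post.

```r
library(partykit)

## Built-in data with NAs in a predictor (Solar.R); drop rows with a
## missing response so that only the predictors contain NAs
aq <- subset(airquality, !is.na(Ozone))

ct <- ctree(Ozone ~ Solar.R + Wind + Temp, data = aq,
            control = ctree_control(testtype = "Bonferroni",
                                    mincriterion = 0.90,
                                    minsplit = 12, minbucket = 4,
                                    majority = TRUE))

## Terminal node that each training observation ends up in
node <- predict(ct, type = "node")

## Cases per terminal node; with unit case weights this is the
## sum of weights in each node
n_per_node <- table(node)
n_per_node

## TRUE would indicate a terminal node smaller than minbucket
any(n_per_node < 4)
```

Running the same comparison on the tree fitted to 'my_data' would show directly which terminal nodes, if any, fall below 'minbucket'.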
> Could someone please explain why I am getting these results?
Probably. But you need to provide a reproducible example and the
details obtained by 'sessionInfo()'.
As per the posting guide, since this is a contributed package you should
first contact its maintainer (Torsten Hothorn, CC'd) and only post here
if you get no reply. Did you try contacting Torsten?
> Am I mistaken about the value of case weights or about the use of minbucket
> to restrict the size of a terminal node?
I don't think you're mistaken since '?ctree_control' says that
"minbucket: the minimum sum of weights in a terminal node."
Henric
> This is an example of the output:
> Model formula:
> y ~ x1 + x2 + x3 + x4
> Fitted party:
> [1] root
> | [2] x4 <= 30: 0.927 (n = 17, err = 1.1)
> | [3] x4 > 30
> | | [4] x2 <= 43: 0.472 (n = 8, err = 0.4)
> | | [5] x2 > 43
> | | | [6] x3 <= 0.4: 0.282 (n = 3, err = 0.0)
> | | | [7] x3 > 0.4: 0.020 (n = 8, err = 0.0)
> Number of inner nodes: 3
> Number of terminal nodes: 4
> Many thanks!
> Amber Nolder
> Graduate Student
> Indiana University of Pennsylvania