[R] missing values in party::ctree
Andrew Ziem
AZiem at us.ci.org
Thu Feb 17 20:23:19 CET 2011
After ctree builds a tree, how can I tell from the BinaryTree-class object which branch observations with missing values follow? In the example below, Bare.nuclei has 16 missing values and is used for the first split, but NA is not listed in either set of factor levels. (I have the same question for missing values in numeric [non-factor] variables, but I assume the answer is similar.)
> require(party)
> require(mlbench)
> data(BreastCancer)
> BreastCancer$Id <- NULL
> ct <- ctree(Class ~ . , data=BreastCancer, controls = ctree_control(maxdepth = 1))
> ct
Conditional inference tree with 2 terminal nodes
Response: Class
Inputs: Cl.thickness, Cell.size, Cell.shape, Marg.adhesion, Epith.c.size, Bare.nuclei, Bl.cromatin, Normal.nucleoli, Mitoses
Number of observations: 699
1) Bare.nuclei == {1, 2}; criterion = 1, statistic = 488.294
  2)*  weights = 448
1) Bare.nuclei == {3, 4, 5, 6, 7, 8, 9, 10}
  3)*  weights = 251
> sum(is.na(BreastCancer$Bare.nuclei))
[1] 16
> nodes(ct, 1)[[1]]$psplit
Bare.nuclei == {1, 2}
> nodes(ct, 1)[[1]]$ssplit
list()
Based on the counts below, the 16 missing values go to node 2 (432 + 16 = 448, which matches that node's weight), but I don't see this recorded anywhere in the object.
> sum(BreastCancer$Bare.nuclei %in% c(1,2,NA))
[1] 448
> sum(BreastCancer$Bare.nuclei %in% c(1,2))
[1] 432
> sum(BreastCancer$Bare.nuclei %in% c(3:10))
[1] 251
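The same check can be made without counting factor levels by hand, by cross-tabulating the fitted terminal node of each observation against whether Bare.nuclei is missing. This is only a minimal sketch: it assumes that party's where() method for BinaryTree objects returns the terminal node ID of each training observation, and the names node_id and na_rows are just illustrative. It shows where the NA rows ended up, not the rule that sent them there.

```r
library(party)    # ctree(), ctree_control(), where()
library(mlbench)  # BreastCancer data

data(BreastCancer)
BreastCancer$Id <- NULL
ct <- ctree(Class ~ ., data = BreastCancer,
            controls = ctree_control(maxdepth = 1))

# Terminal node ID assigned to each training observation
# (assumption: where() with no newdata returns the fitted node IDs)
node_id <- where(ct)

# Flag the rows where the split variable is missing
na_rows <- is.na(BreastCancer$Bare.nuclei)

# Cross-tabulate: rows are terminal nodes, columns are missing / not missing
print(table(node = node_id, missing = na_rows))

# Terminal nodes that received the NA observations
print(table(node_id[na_rows]))
```

If the counts above are right, all 16 NA rows should fall under node 2 in this table.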
Andrew