[R] Gini's Importance Value Variable = Inf
Melanie Vida
mvida at mitre.org
Wed Mar 23 21:58:35 CET 2005
Hi All,
In the script below, the importance measure in column 4 (i.e.,
MeanDecreaseGini) is reported as "Inf" for V7.
Running getTree showed that V7 had been selected at least
twice in one of the trees of the random forest, so the "Inf" value was
not the result of dividing the sum of the decreases by 0.
Any suggestions on what may be causing the Inf for V7 would be helpful.
Thanks in advance,
-Melanie
---------
library(randomForest)
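# stringsAsFactors = TRUE keeps V16 (the class label) a factor on R >= 4.0
credit <- read.csv(url("ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data"),
                   header = FALSE, na.strings = "?", stringsAsFactors = TRUE)
credit.rf <- randomForest(V16 ~ ., data = credit, importance = TRUE,
                          do.trace = 100, na.action = na.omit)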
imp <- round(importance(credit.rf), 2)
imp
        -     + MeanDecreaseAccuracy MeanDecreaseGini
V1   0.00  0.00                 0.00             0.00
V2   0.75  0.25                 0.55            19.92
V3   0.41  0.57                 0.46            22.13
V4   0.39  0.33                 0.33             4.93
V5   0.26  0.24                 0.21             0.60
V6   0.39  0.50                 0.40           -46.21
V7   0.91  0.59                 0.71              Inf
V8   1.35  1.35                 1.06            37.15
V9   0.00  0.00                 0.00             0.00
V10  0.00  0.00                 0.00             0.00
V11  1.65  1.59                 1.23            49.16
V12  0.00  0.00                 0.00             0.00
V13 -0.11 -0.10                -0.10             0.21
V14  0.82  0.57                 0.66            20.71
V15  1.36  1.02                 1.01            33.47
getTree(credit.rf, 1)
      left daughter right daughter split var split point status prediction
 [1,]             2              3        15    492.0000      1          0
 [2,]             4              5        11      2.5000      1          0
 [3,]             6              7         2     38.5000      1          0
 [4,]             8              9        14     83.0000      1          0
 [5,]            10             11         7    207.0000      1          0
 [6,]            12             13        11      0.5000      1          0
 [7,]             0              0         0      0.0000     -1          2
 [8,]            14             15         7    117.0000      1          0
 [9,]            16             17         8      3.0625      1          0
[10,]            18             19         3      0.2700      1          0
[11,]             0              0         0      0.0000     -1          2
[12,]            20             21        15   4753.0000      1          0
[13,]            22             23         2     37.0850      1          0
[14,]            24             25        14      8.5000      1          0
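
The two checks described in the message can be sketched in base R without refitting the forest. This is only an illustration: the vectors gini and split_var are hypothetical names, filled in by hand from the output posted above, and the rule "finite and non-negative" is an assumption about what a sum of Gini decreases would normally look like, not something taken from the randomForest documentation.

```r
# MeanDecreaseGini column, copied by hand from the importance output above.
gini <- c(V1 = 0.00, V2 = 19.92, V3 = 22.13, V4 = 4.93, V5 = 0.60,
          V6 = -46.21, V7 = Inf, V8 = 37.15, V9 = 0.00, V10 = 0.00,
          V11 = 49.16, V12 = 0.00, V13 = 0.21, V14 = 20.71, V15 = 33.47)

# Assumption: a sum of impurity decreases should be finite and non-negative,
# so flag every variable that violates either condition.
suspect <- names(gini)[!is.finite(gini) | gini < 0]
print(suspect)

# "split var" column of the 14 rows of tree 1 shown above
# (0 marks a terminal node).
split_var <- c(15, 11, 2, 14, 7, 11, 0, 7, 8, 3, 0, 15, 2, 14)

# Rows of the posted tree in which variable 7 is the split variable.
v7_nodes <- which(split_var == 7)
print(v7_nodes)
```

On the posted numbers this flags V6 as well as V7, since V6 has a negative MeanDecreaseGini, and it finds V7 as the split variable in rows 5 and 8 of tree 1. With the fitted object available, the same vectors could presumably be taken from importance(credit.rf)[, "MeanDecreaseGini"] and getTree(credit.rf, 1)[, "split var"] instead of being typed in.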