[Rd] ordered factors in tree package - bug? (PR#1025)
murphyk@cs.berkeley.edu
Sat, 14 Jul 2001 00:10:39 +0200 (MET DST)
I am new to R, and didn't know which list to send this to, since it is a
bug report about a package, not about core R...
I have built a regression tree from 4 predictors: 3 are unordered
(binary) factors, and the last is a date (an integer day), which I declare
to be an ordered factor. However, tree() treats the date as if it were
unordered, splitting its levels into non-consecutive subsets.
My code:
library(tree)  # tree() comes from the tree package
dat <- read.table("/home/cs/murphyk/R/Eugene/102.dat", header=TRUE)
dat$machine <- factor(dat$machine)
dat$TIM <- factor(dat$TIM)
dat$lid <- factor(dat$lid)
# also tried: dat$date <- factor(dat$date, ordered=TRUE)
dat$date <- ordered(dat$date)
tr <- tree(TRES ~ ., dat)
print(tr)
produces
 1) root 100 0.149300 0.2345
   2) TIM: 111 47 0.020650 0.2011
     4) date: 9,11 11 0.002673 0.1845 *
     5) date: 6,7,8,10,13,14,15,16,17,18,19 36 0.014060 0.2061 *
   3) TIM: 222 53 0.029490 0.2642
     6) date: 6,7,11,12,14,15,18,19 28 0.010670 0.2539
      12) lid: A 6 0.001350 0.2350 *
      13) lid: B 22 0.006582 0.2591 *
     7) date: 8,9,10,13,16,17 25 0.012620 0.2756
      14) machine: 101 14 0.005750 0.2850 *
      15) machine: 102 11 0.004055 0.2636 *
and yet
> dat$date
 [1] 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8 9 9 9 9 9 9 9 9 9 10
[26] 10 10 10 10 10 11 11 11 11 11 11 11 11 12 12 12 13 13 13 13 13 13 13 13 13
[51] 13 13 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16
[76] 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 19 19 19 19 19
Levels: 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < 14 < 15 < 16 < 17 < 18 < 19
> is.ordered(dat$date)
[1] TRUE
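To make "non-consecutive" concrete, here is a minimal sketch that checks whether the set of date levels at a node forms an unbroken run in the level order. The helper name is_contiguous is hypothetical (it is not part of the tree package), and the level sets are copied by hand from the printed tree above.

```r
# Levels of dat$date in their declared order (6 < 7 < ... < 19)
lev <- as.character(6:19)

# Hypothetical helper: TRUE if `subset` is an unbroken run of `lev`
is_contiguous <- function(subset, lev) {
  idx <- sort(match(subset, lev))  # positions of the subset within the ordering
  all(diff(idx) == 1)              # consecutive positions differ by exactly 1
}

node4 <- c("9", "11")                          # node 4 in the printed tree
node7 <- c("8", "9", "10", "13", "16", "17")   # node 7 in the printed tree

is_contiguous(node4, lev)               # FALSE: level 10 is skipped
is_contiguous(node7, lev)               # FALSE: several gaps
is_contiguous(c("6", "7", "8"), lev)    # TRUE: what an ordered split would look like
```

A split that respected the ordering would put a run of levels on each side of a single cut point, so every node's set would pass this check.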
Also, how do I deal with dates of the form dd/mm/yy, instead of just
integers? (In the above file, I used perl to extract the day, since I
knew the month and year were constant.)
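One possible approach to the dd/mm/yy question, as a sketch only: it assumes an R version that provides as.Date() (later than the 1.2.3 used here), and the input strings are made up for illustration rather than taken from 102.dat.

```r
# Hypothetical dd/mm/yy strings; not taken from 102.dat
raw <- c("06/07/01", "19/07/01", "11/07/01")

# Parse as day/month/two-digit-year
d <- as.Date(raw, format = "%d/%m/%y")

# Day of month as an integer, matching the perl-extracted column
day <- as.integer(format(d, "%d"))

# Alternatively, days elapsed since the earliest date, which stays
# meaningful even when the dates span several months
days_since <- as.numeric(d - min(d))

day         # 6 19 11
days_since  # 0 13 5
```

Either numeric form could then be used as a predictor directly, or wrapped in ordered() as in the code above.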
I have attached the text file 102.dat, so you can easily reproduce the
above bug. I am using R 1.2.3 on linux. For future reference, is it
considered bad form to send attachments?
Kevin
Attachment: 102.dat
date machine TIM lid TRES
6 101 222 B 0.24
6 102 222 B 0.26
6 102 111 B 0.20
6 102 222 B 0.24
6 101 111 A 0.20
7 102 111 A 0.20
7 101 111 B 0.25
7 102 222 B 0.27
7 101 111 A 0.18
7 102 111 A 0.18
7 101 222 B 0.28
7 102 222 B 0.24
8 102 111 B 0.21
8 101 222 B 0.25
8 101 222 B 0.29
9 101 222 B 0.28
9 101 111 A 0.17
9 102 222 A 0.27
9 101 111 A 0.21
9 102 111 A 0.16
9 101 111 A 0.21
9 102 111 A 0.18
9 101 111 A 0.18
9 102 111 A 0.18
10 102 222 B 0.30
10 101 222 B 0.29
10 102 222 A 0.25
10 101 222 B 0.29
10 102 222 B 0.23
10 102 111 A 0.20
11 102 222 B 0.24
11 101 111 A 0.17
11 102 111 A 0.18
11 102 222 A 0.26
11 102 111 A 0.20
11 102 222 B 0.28
11 101 111 A 0.19
11 102 222 A 0.23
12 101 222 B 0.26
12 101 222 B 0.24
12 102 222 B 0.26
13 101 222 B 0.27
13 102 222 B 0.28
13 102 222 B 0.26
13 101 111 A 0.20
13 101 111 A 0.20
13 102 222 B 0.27
13 102 111 A 0.19
13 102 222 A 0.24
13 101 111 A 0.24
13 101 222 B 0.30
13 102 222 B 0.25
14 101 111 B 0.21
14 102 222 B 0.25
14 102 222 B 0.28
14 101 222 B 0.23
14 102 111 A 0.21
14 101 111 A 0.19
14 101 222 B 0.26
15 102 222 A 0.25
15 102 111 A 0.21
15 101 222 B 0.28
15 102 222 A 0.22
15 101 111 A 0.23
15 102 222 A 0.23
15 101 222 B 0.27
15 102 222 B 0.23
15 101 222 B 0.27
15 102 222 B 0.28
16 101 222 A 0.28
16 101 222 B 0.27
16 101 111 A 0.20
16 101 222 B 0.34
16 101 222 B 0.27
16 102 111 A 0.24
16 102 111 A 0.21
16 102 111 B 0.17
17 102 111 A 0.20
17 101 111 A 0.20
17 102 111 A 0.17
17 101 222 A 0.29
17 102 222 B 0.27
17 101 222 B 0.27
17 102 111 B 0.22
17 102 222 B 0.28
17 101 111 A 0.20
17 101 111 A 0.20
17 101 111 B 0.20
17 101 222 B 0.30
17 101 111 A 0.22
18 101 222 B 0.28
18 101 111 B 0.26
18 101 111 A 0.18
18 102 111 B 0.21
18 101 222 A 0.22
19 101 111 B 0.21
19 102 111 A 0.21
19 101 222 B 0.26
19 101 111 A 0.22
19 102 111 A 0.20