[Rd] ordered factors in tree package - bug? (PR#1025)

murphyk@cs.berkeley.edu
Sat, 14 Jul 2001 00:10:39 +0200 (MET DST)



I am new to R, and didn't know which list to send this to, since it is a
bug report about a package, not about core R...

I have created a regression tree using 4 predictors: 3 are unordered
(binary) predictors, and the last is a date (integer), which I declare
to be an ordered factor. However, the tree treats the date as if it were
unordered, splitting its levels into non-consecutive subsets.

My code

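library(tree)  # provides tree(); must be loaded before the call below

dat <- read.table("/home/cs/murphyk/R/Eugene/102.dat", header=TRUE)
# the three binary predictors become unordered factors
dat$machine <- factor(dat$machine)
dat$TIM <- factor(dat$TIM)
dat$lid <- factor(dat$lid)
# declare date as an ordered factor (same as factor(dat$date, ordered=TRUE))
dat$date <- ordered(dat$date)
tr <- tree(TRES ~ ., dat)
tr  # print the fitted tree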

produces

 1) root 100 0.149300 0.2345  
   2) TIM: 111 47 0.020650 0.2011  
     4) date: 9,11 11 0.002673 0.1845 *
     5) date: 6,7,8,10,13,14,15,16,17,18,19 36 0.014060 0.2061 *
   3) TIM: 222 53 0.029490 0.2642  
     6) date: 6,7,11,12,14,15,18,19 28 0.010670 0.2539  
      12) lid: A 6 0.001350 0.2350 *
      13) lid: B 22 0.006582 0.2591 *
     7) date: 8,9,10,13,16,17 25 0.012620 0.2756  
      14) machine: 101 14 0.005750 0.2850 *
      15) machine: 102 11 0.004055 0.2636 *

and yet

> dat$date
  [1] 6  6  6  6  6  7  7  7  7  7  7  7  8  8  8  9  9  9  9  9  9  9  9  9  10
 [26] 10 10 10 10 10 11 11 11 11 11 11 11 11 12 12 12 13 13 13 13 13 13 13 13 13
 [51] 13 13 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16
 [76] 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 19 19 19 19 19
Levels:  6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < 14 < 15 < 16 < 17 < 18 < 19 
> is.ordered(dat$date)
[1] TRUE
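
The point above can be checked mechanically, and one possible workaround sketched, using only base R. This is a rough illustration, not a statement about how the tree package is meant to handle ordered factors: is_ordered_split() is a hypothetical helper, and the workaround assumes that treating the day as a plain number is acceptable, since numeric predictors are split on a threshold.

```r
# Hypothetical helper: a split respects the ordering only if every value
# on one side is smaller than every value on the other side.
is_ordered_split <- function(left, right) {
  max(left) < min(right) || max(right) < min(left)
}

# Date subsets reported at nodes 4 and 5 of the printed tree.
node4 <- c(9, 11)
node5 <- c(6, 7, 8, 10, 13, 14, 15, 16, 17, 18, 19)
split_4_5 <- is_ordered_split(node4, node5)

# Date subsets reported at nodes 6 and 7.
node6 <- c(6, 7, 11, 12, 14, 15, 18, 19)
node7 <- c(8, 9, 10, 13, 16, 17)
split_6_7 <- is_ordered_split(node6, node7)

# Possible workaround: turn the ordered factor back into a number.
# Toy stand-in for dat$date; the real values come from 102.dat.
date_ord <- ordered(c(6, 7, 8, 9, 10, 11, 19))

# as.numeric() on a factor returns the internal level codes (1, 2, ...),
# so go through as.character() to recover the original day numbers.
date_num <- as.numeric(as.character(date_ord))

# With the real data this would be (not run here):
#   dat$date <- as.numeric(as.character(dat$date))
#   tr <- tree(TRES ~ ., dat)
```

Both split_4_5 and split_6_7 come out FALSE, which is the behaviour being reported. With a numeric date, any split on it would have the form date < c and so could not mix early and late days on the same side.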



Also, how do I deal with dates of the form dd/mm/yy, instead of just
integers? (In the above file, I used Perl to extract the day, since I
knew the month and year were constant.)
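
One way this might be handled in recent versions of R is with as.Date() and a format string; that function may not be available in a release as old as R 1.2.3. The sketch below uses made-up dd/mm/yy strings (the real file only stores the day), and the names raw, d and day_index are illustrative only.

```r
# Hypothetical dd/mm/yy strings; these are not taken from 102.dat.
raw <- c("06/07/01", "14/07/01", "19/07/01")

# %d = day, %m = month, %y = two-digit year.
d <- as.Date(raw, format = "%d/%m/%y")

# Days elapsed since the earliest date, as a plain numeric vector
# that could be used as a numeric predictor.
day_index <- as.numeric(d - min(d))
```

A numeric day index keeps the natural ordering of the dates, so it could replace the integer day column even when the month or year varies.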

I have attached the text file 102.dat, so you can easily reproduce the
above bug. I am using R 1.2.3 on Linux. For future reference, is it
considered bad form to send attachments?


Kevin

Attachment: 102.dat

date machine TIM lid TRES
6 101 222 B 0.24
6 102 222 B 0.26
6 102 111 B 0.20
6 102 222 B 0.24
6 101 111 A 0.20
7 102 111 A 0.20
7 101 111 B 0.25
7 102 222 B 0.27
7 101 111 A 0.18
7 102 111 A 0.18
7 101 222 B 0.28
7 102 222 B 0.24
8 102 111 B 0.21
8 101 222 B 0.25
8 101 222 B 0.29
9 101 222 B 0.28
9 101 111 A 0.17
9 102 222 A 0.27
9 101 111 A 0.21
9 102 111 A 0.16
9 101 111 A 0.21
9 102 111 A 0.18
9 101 111 A 0.18
9 102 111 A 0.18
10 102 222 B 0.30
10 101 222 B 0.29
10 102 222 A 0.25
10 101 222 B 0.29
10 102 222 B 0.23
10 102 111 A 0.20
11 102 222 B 0.24
11 101 111 A 0.17
11 102 111 A 0.18
11 102 222 A 0.26
11 102 111 A 0.20
11 102 222 B 0.28
11 101 111 A 0.19
11 102 222 A 0.23
12 101 222 B 0.26
12 101 222 B 0.24
12 102 222 B 0.26
13 101 222 B 0.27
13 102 222 B 0.28
13 102 222 B 0.26
13 101 111 A 0.20
13 101 111 A 0.20
13 102 222 B 0.27
13 102 111 A 0.19
13 102 222 A 0.24
13 101 111 A 0.24
13 101 222 B 0.30
13 102 222 B 0.25
14 101 111 B 0.21
14 102 222 B 0.25
14 102 222 B 0.28
14 101 222 B 0.23
14 102 111 A 0.21
14 101 111 A 0.19
14 101 222 B 0.26
15 102 222 A 0.25
15 102 111 A 0.21
15 101 222 B 0.28
15 102 222 A 0.22
15 101 111 A 0.23
15 102 222 A 0.23
15 101 222 B 0.27
15 102 222 B 0.23
15 101 222 B 0.27
15 102 222 B 0.28
16 101 222 A 0.28
16 101 222 B 0.27
16 101 111 A 0.20
16 101 222 B 0.34
16 101 222 B 0.27
16 102 111 A 0.24
16 102 111 A 0.21
16 102 111 B 0.17
17 102 111 A 0.20
17 101 111 A 0.20
17 102 111 A 0.17
17 101 222 A 0.29
17 102 222 B 0.27
17 101 222 B 0.27
17 102 111 B 0.22
17 102 222 B 0.28
17 101 111 A 0.20
17 101 111 A 0.20
17 101 111 B 0.20
17 101 222 B 0.30
17 101 111 A 0.22
18 101 222 B 0.28
18 101 111 B 0.26
18 101 111 A 0.18
18 102 111 B 0.21
18 101 222 A 0.22
19 101 111 B 0.21
19 102 111 A 0.21
19 101 222 B 0.26
19 101 111 A 0.22
19 102 111 A 0.20


