[R] library(rpart) or library(tree)
    Ingo Holz 
    Ingo.Holz at uni-hohenheim.de
       
    Wed Dec 19 17:14:45 CET 2007
    
    
  
Hi,
 I have a problem with library(rpart) (and/or library(tree)).
 I use a data.frame with variables
"pnV22" (observation: 1, 0 or yes, no)
"JTemp" (mean temperature)
"SNied"  (summer rain)
 I used function "rpart" to build a model:
	library(rpart)
	attach(data.frame)
	result <- rpart(pnV22 ~ JTemp + SNied)
 I got the following tree:
  n=55518 (50 observations deleted due to missingness)
node), split, n, deviance, yval
      * denotes terminal node
 1) root 55518 668.744500 0.0121942400  
   2) punkte[["JTemp"]]< 10.35 51251  18.992960 0.0003707245 *
   3) punkte[["JTemp"]]>=10.35 4267 556.532000 0.1542067000  
     6) punkte[["SNied"]]>=450 3136 291.318600 0.1036352000 *
     7) punkte[["SNied"]]< 450 1131 234.954900 0.2944297000  
      14) punkte[["JTemp"]]>=10.55 723 113.502100 0.1950207000 *
      15) punkte[["JTemp"]]< 10.55 408 101.647100 0.4705882000  
        30) punkte[["JTemp"]]< 10.45 48   4.479167 0.1041667000 *
        31) punkte[["JTemp"]]>=10.45 360  89.863890 0.5194444000 *
 I constructed a simple new.data.frame:
     new.data.frame <- data.frame
     new.data.frame[,"JTemp"] <- 10.5
     new.data.frame[,"SNied"] <- 430
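
For reference, the two values above can be traced through the printed splits by hand. The sketch below is only a hand-coded copy of the tree output shown earlier, not a call into rpart; the function name trace_tree is made up for illustration.

```r
# Hand-coded copy of the splits printed in the tree above (illustrative only).
# trace_tree is a hypothetical helper, not part of rpart.
trace_tree <- function(JTemp, SNied) {
  if (JTemp < 10.35)  return(0.0003707245)  # node 2
  if (SNied >= 450)   return(0.1036352)     # node 6
  if (JTemp >= 10.55) return(0.1950207)     # node 14
  if (JTemp < 10.45)  return(0.1041667)     # node 30
  0.5194444                                 # node 31
}

# Values assigned to every row of the new data frame
expected <- trace_tree(JTemp = 10.5, SNied = 430)
expected
```

With JTemp = 10.5 and SNied = 430 the path is node 3, node 7, node 15, node 31, so every row would be expected to get the yval of node 31.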
Then I used predict() to predict values for "pnV22" in the following way:
    pred <- predict(result, data.frame)
    pred2 <- predict(result, new.data.frame)
The results are the same, which I checked by plotting the values of pred and pred2 and with
   table(pred == pred2), which is TRUE for all values.
Looking at the tree, I would expect pred2 to have the same high value for every element of the
vector. Did I make a mistake?
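
One possible explanation, offered as a sketch rather than a diagnosis: the split labels in the printed tree read punkte[["JTemp"]], which suggests the model may have been fitted with a formula that indexes an object named punkte directly instead of using bare column names. In that case predict() cannot match those terms to columns of newdata and evaluates them against the original object, so both calls return the same values. The example below uses made-up data; the data set and the names new_punkte, fit_bad and fit_ok are assumptions for illustration and are not taken from the post.

```r
library(rpart)

# Synthetic stand-in data (hypothetical; not the data set from the post)
set.seed(1)
n <- 2000
punkte <- data.frame(JTemp = runif(n, 8, 12),
                     SNied = runif(n, 300, 600))
p <- ifelse(punkte$JTemp > 10.35 & punkte$SNied < 450, 0.5, 0.02)
punkte$pnV22 <- rbinom(n, 1, p)

# Same rows, but every row carries the same predictor values
new_punkte <- punkte
new_punkte$JTemp <- 10.5
new_punkte$SNied <- 430

# Formula that indexes the object directly: the terms are named
# punkte[["JTemp"]] etc., which do not exist as columns in newdata
fit_bad <- rpart(punkte[["pnV22"]] ~ punkte[["JTemp"]] + punkte[["SNied"]])
pred_bad  <- predict(fit_bad, punkte)
pred_bad2 <- predict(fit_bad, new_punkte)
all(pred_bad == pred_bad2)   # newdata is effectively ignored

# Formula with bare column names plus data=: newdata columns are matched
fit_ok <- rpart(pnV22 ~ JTemp + SNied, data = punkte)
pred_ok2 <- predict(fit_ok, new_punkte)
length(unique(pred_ok2))     # number of distinct predicted values
```

If the second fit behaves as expected, every element of pred_ok2 is the yval of the single leaf that JTemp = 10.5 and SNied = 430 fall into.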
Thanks, Ingo
    
    