[R] library(rpart) or library(tree)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Dec 19 23:09:45 CET 2007
You appear to have fitted a regression tree, which does not seem to be
what your interpretation of 'pnV22' requires.
I have little idea what you actually did, but am confident that it is not
what you claim you did.
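
For illustration only, a minimal sketch of the distinction, on simulated data (the data frame 'd', its values and the names 'fit_reg' and 'fit_cls' are made up here, not taken from the question). With a numeric 0/1 response rpart defaults to a regression tree; with a factor response it defaults to a classification tree, which is probably closer to what a yes/no reading of 'pnV22' calls for.

```r
library(rpart)

set.seed(1)
# Simulated stand-in data; not the poster's data set.
d <- data.frame(JTemp = runif(500, 8, 12), SNied = runif(500, 300, 600))
d$pnV22 <- as.integer(d$JTemp > 10.4 & d$SNied < 450)

# Numeric response: rpart chooses method = "anova" (regression tree).
fit_reg <- rpart(pnV22 ~ JTemp + SNied, data = d)

# Factor response: rpart chooses method = "class" (classification tree).
d$pnV22f <- factor(d$pnV22, levels = c(0, 1), labels = c("no", "yes"))
fit_cls <- rpart(pnV22f ~ JTemp + SNied, data = d)

fit_reg$method  # "anova"
fit_cls$method  # "class"
```

The deviance and yval columns in the printed tree further down are what the "anova" method reports, which is why the output looks like a regression tree.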
Also, note fortune("dog"):
Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
-- Barry Rowlingson
R-help (October 2004)
On Wed, 19 Dec 2007, Ingo Holz wrote:
> Hi,
>
> I have a problem with library (rpart) (and/or library(tree)).
>
> I use a data.frame with variables
> "pnV22" (observation: 1, 0 or yes, no)
> "JTemp" (mean temperature)
> "SNied" (summer rain)
>
> I used function "rpart" to build a model:
>
> library(rpart)
> attach(data.frame)
> result <- rpart(pnV22 ~ JTemp + SNied)
>
> I got the following tree:
I don't believe that: how could rpart know about 'punkte'?
> n=55518 (50 observations deleted due to missingness)
>
> node), split, n, deviance, yval
> * denotes terminal node
>
> 1) root 55518 668.744500 0.0121942400
> 2) punkte[["JTemp"]]< 10.35 51251 18.992960 0.0003707245 *
> 3) punkte[["JTemp"]]>=10.35 4267 556.532000 0.1542067000
> 6) punkte[["SNied"]]>=450 3136 291.318600 0.1036352000 *
> 7) punkte[["SNied"]]< 450 1131 234.954900 0.2944297000
> 14) punkte[["JTemp"]]>=10.55 723 113.502100 0.1950207000 *
> 15) punkte[["JTemp"]]< 10.55 408 101.647100 0.4705882000
> 30) punkte[["JTemp"]]< 10.45 48 4.479167 0.1041667000 *
> 31) punkte[["JTemp"]]>=10.45 360 89.863890 0.5194444000 *
>
> I constructed a simple new.data.frame:
>
> new.data.frame <- data.frame
> new.data.frame[,"JTemp"] <- 10.5
> new.data.frame[,"SNied"] <- 430
>
> Then I used predict() to predict values for "pnV22" in the following way:
>
> pred <- predict(result, data.frame)
> pred2 <- predict(result, new.data.frame)
It is not finding the new values from the new data frame: they do not have
names like 'punkte[["JTemp"]]'.
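
A sketch of what presumably happened, and one way to avoid it, again on simulated data (the contents of 'punkte' are made up here, and 'neu', 'fit_bad' and 'fit_ok' are hypothetical names). If the formula refers to variables as punkte[["JTemp"]], predict() evaluates those expressions again and finds the original 'punkte', whatever 'newdata' contains. With plain column names and 'data =', the names are looked up in 'newdata'.

```r
library(rpart)

set.seed(1)
# Simulated stand-in for the poster's data frame; values are made up.
punkte <- data.frame(JTemp = runif(500, 8, 12), SNied = runif(500, 300, 600))
punkte$pnV22 <- as.integer(punkte$JTemp > 10.4 & punkte$SNied < 450)

# New data built as in the question: a copy with both predictors overwritten.
neu <- punkte
neu[, "JTemp"] <- 10.5
neu[, "SNied"] <- 430

# Variables named via punkte[[...]]: predict() re-evaluates these expressions,
# finds 'punkte' itself, and so ignores the columns of 'neu'.
fit_bad <- rpart(punkte[["pnV22"]] ~ punkte[["JTemp"]] + punkte[["SNied"]])
same_as_fitted <- all(predict(fit_bad, neu) == predict(fit_bad))

# Plain column names plus 'data =': predict() looks the names up in 'newdata'.
fit_ok <- rpart(pnV22 ~ JTemp + SNied, data = punkte)
pred_ok <- predict(fit_ok, newdata = neu)
one_value <- length(unique(pred_ok)) == 1

same_as_fitted  # TRUE: the new values were never used
one_value       # TRUE: every row of 'neu' falls in the same leaf
```

Passing 'data =' also makes attach() unnecessary.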
> The results are the same, which I checked by plotting the values of pred and pred2 and by
>
> table(pred ==pred2) which is true for all values.
>
> Looking at the tree I would expect that pred2 has the same high value for all elements of the
> vector. Did I make a mistake?
>
> Thanks, Ingo
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595