Hi, I have a problem with library(rpart) (and/or library(tree)).
I use a data frame with the following variables:

  "pnV22" (observation: 1 or 0, i.e. yes or no)
  "JTemp" (mean temperature)
  "SNied" (summer rain)

I used the function rpart() to build a model:

  library(rpart)
  attach(data.frame)
  result <- rpart(pnV22 ~ JTemp + SNied)

I got the following tree:

  n=55518 (50 observations deleted due to missingness)

  node), split, n, deviance, yval
        * denotes terminal node

   1) root 55518 668.744500 0.0121942400
     2) punkte[["JTemp"]]< 10.35 51251 18.992960 0.0003707245 *
     3) punkte[["JTemp"]]>=10.35 4267 556.532000 0.1542067000
       6) punkte[["SNied"]]>=450 3136 291.318600 0.1036352000 *
       7) punkte[["SNied"]]< 450 1131 234.954900 0.2944297000
        14) punkte[["JTemp"]]>=10.55 723 113.502100 0.1950207000 *
        15) punkte[["JTemp"]]< 10.55 408 101.647100 0.4705882000
          30) punkte[["JTemp"]]< 10.45 48 4.479167 0.1041667000 *
          31) punkte[["JTemp"]]>=10.45 360 89.863890 0.5194444000 *

I then constructed a simple new data frame as a copy of the original, with both predictors set to constant values:

  new.data.frame <- data.frame
  new.data.frame[,"JTemp"] <- 10.5
  new.data.frame[,"SNied"] <- 430

Then I used predict() to predict values for "pnV22" in the following way:

  pred <- predict(result, data.frame)
  pred2 <- predict(result, new.data.frame)

The results are the same. I checked this by plotting the values of pred and pred2, and with table(pred == pred2), which is TRUE for all values.

Looking at the tree, I would expect pred2 to have the same high value for every element of the vector: JTemp = 10.5 and SNied = 430 should end up in terminal node 31, with yval 0.519.

Did I make a mistake?

Thanks,
Ingo
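
A possible explanation, offered as a sketch under assumptions rather than a confirmed diagnosis: the splits in the printed tree are labelled punkte[["JTemp"]] and punkte[["SNied"]] rather than JTemp and SNied. That suggests the model was fitted with formula terms that index a data frame called punkte directly. In that case predict() would evaluate those same expressions again and never look at the columns of the new data frame, which would give identical predictions. The example below uses simulated data (the values and the names fit, fit_indexed and new_point are made up for illustration) to contrast the two ways of fitting.

```r
library(rpart)

set.seed(1)

# Simulated stand-in for the real data; the values are hypothetical.
punkte <- data.frame(
  JTemp = runif(2000, 8, 12),
  SNied = runif(2000, 300, 600)
)
punkte$pnV22 <- as.numeric(punkte$JTemp > 10.4 & punkte$SNied < 450)

# Fit with bare column names and data=, so that predict() can look the
# predictors up in whatever data frame is passed as newdata.
fit <- rpart(pnV22 ~ JTemp + SNied, data = punkte)

# A single new case with the values from the question.
new_point <- data.frame(JTemp = 10.5, SNied = 430)
pred_new <- predict(fit, newdata = new_point)

# For contrast: formula terms that index punkte directly. predict()
# re-evaluates punkte[["JTemp"]] and punkte[["SNied"]] and ignores the
# columns of new_point.
fit_indexed <- rpart(punkte[["pnV22"]] ~ punkte[["JTemp"]] + punkte[["SNied"]])
pred_indexed <- predict(fit_indexed, newdata = new_point)

length(pred_new)      # one prediction for the one new case
length(pred_indexed)  # one prediction per row of punkte
```

If this is indeed the cause, refitting with plain column names and data = punkte (and without attach()) should make predict() respond to the new values.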