[R] randomForest out of bag prediction

Bert Gunter bgunter.4567 at gmail.com
Sat Jan 12 19:16:21 CET 2019


Off topic.
But see here:
https://stats.stackexchange.com/questions/61405/random-forest-and-prediction
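
In short, as documented in ?predict.randomForest: when newdata is omitted,
predict() returns the out-of-bag predictions stored in the fitted object;
when newdata is supplied, every tree votes on every row, including the trees
that saw that row during training. A minimal sketch of the difference,
assuming the built-in iris data as a stand-in (the original poster's data is
not available here, and the object names rf, pred_oob and pred_in are
illustrative only):

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Stand-in for the original model: a small classification forest
rf <- randomForest(Species ~ ., data = iris, ntree = 50)

# No newdata: out-of-bag predictions (each row is predicted only by
# the trees whose bootstrap sample did not contain it)
pred_oob <- predict(rf)

# Training predictors passed as newdata: all trees vote, so this is an
# in-sample (resubstitution) prediction, not an out-of-bag one
pred_in <- predict(rf, iris[, names(iris) != "Species"])

# The OOB predictions are the ones stored in the fitted object
all(pred_oob == rf$predicted)

# Compare the two confusion matrices
table(pred_oob, iris$Species)
table(pred_in, iris$Species)
```

The second table will usually look far better than the first, because the
trees are grown deep enough to nearly memorise their training rows. The
out-of-bag table is the one to read as an honest error estimate.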

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Jan 12, 2019 at 9:56 AM Witold E Wolski <wewolski using gmail.com> wrote:

> Hello,
>
> I am just not sure what the predict.randomForest function is doing...
> I am confused.
>
> I would expect these two function calls to return the same
> predictions:
> ```{r}
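> library(randomForest)
> library(dplyr)  # provides %>% and select()
>
> diachp.rf <- randomForest(quality ~ ., data = data, ntree = 50, importance = TRUE)
>
> ypred_oob <- predict(diachp.rf)       # predictions without newdata
> dataX <- data %>% select(-quality)    # remove response
> ypred <- predict(diachp.rf, dataX)    # predictions with the predictors as newdata
>
> ypred_oob == ypred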
> ```
> I thought these were both out-of-bag predictions, but ypred and ypred_oob
> are actually very different.
>
> > table(ypred_oob , data$quality)
>
> ypred_oob    0    1
>         0 1324  346
>         1  493 2837
> > table(ypred , data$quality)
>
> ypred    0    1
>     0 1817    0
>     1    0 3183
>
> What I find even more disturbing is the 100% accuracy for ypred.
> Would you agree that this is rather unexpected?
>
> regards
> Witek
> --
> Witold Eryk Wolski
>
