[R] Validation / Training - test data
Sam
Sam_Smith at me.com
Wed Sep 29 10:25:58 CEST 2010
Dear List,
I have developed two models that I want to use to predict a response: one with a binary response and one with an ordinal response.
My original plan was to divide the data into test (300 entries) and training (1000 entries) sets and check the predictive power of the model by looking at the % of correct predictions. However, I have been told by a colleague that 1300 entries is far too few to partition the data set, and that I should instead use the whole data set, determine the power of the model with scores such as the c-value and Brier score, and use bootstrapping.
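For concreteness, this is roughly how I understand the two scores for the binary model. It is only a sketch on simulated data: the data frame `dat`, the variables `x1`, `x2` and `y`, and the helper `c_index` are made up for illustration and are not my real data or model.

```r
# Simulated stand-in data (hypothetical, not my real data set)
set.seed(1)
n <- 1300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$x1 - 0.7 * dat$x2))

# Binary-response model fitted to the whole data set
fit <- glm(y ~ x1 + x2, family = binomial, data = dat)
p <- predict(fit, type = "response")

# Brier score: mean squared difference between the predicted
# probability and the observed 0/1 outcome (lower is better)
brier <- mean((p - dat$y)^2)

# c-value (area under the ROC curve) from the rank-sum formula
c_index <- function(p, y) {
  r <- rank(p)                      # average ranks handle ties
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
cval <- c_index(p, dat$y)

print(c(brier = brier, c = cval))
```

Both numbers here are "apparent" values computed on the same data used to fit the model, so I assume they are optimistic.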
I understand how to bootstrap in R; however, I have never used it on predicted values.
My questions are:
1. How do I use the boot() command to test the power of my predictive model?
2. Is it possible to bootstrap the Brier score, or is this not necessary?
3. (This is a separate point I am struggling with; I thought I would include it here instead of posting again.) I have selected the best model by AIC from a set of candidate GLMMs. However, as the GLMM has no predict function, I refitted the best model without the random effects as a glm and used the predict function on that. Is this OK?
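To make questions 1 and 2 more concrete, here is my rough attempt at what I think a bootstrap of the Brier score with boot() might look like. I am assuming an optimism correction: refit the model on each resample, then compare its Brier score on the original data with its score on the resample. The simulated data and the helper names `brier` and `optimism` are hypothetical, and I am not sure this is the right approach.

```r
library(boot)

# Simulated stand-in data (hypothetical, not my real data set)
set.seed(2)
n <- 1300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + dat$x1 - 0.7 * dat$x2))

# Brier score of a fitted glm evaluated on a given data frame
brier <- function(fit, data) {
  p <- predict(fit, newdata = data, type = "response")
  mean((p - data$y)^2)
}

# Statistic for boot(): refit on the resample and return the optimism,
# i.e. Brier on the original data minus Brier on the resample itself
optimism <- function(data, idx) {
  bs <- data[idx, ]
  fit_b <- glm(y ~ x1 + x2, family = binomial, data = bs)
  brier(fit_b, data) - brier(fit_b, bs)
}

b <- boot(dat, optimism, R = 200)

# Apparent Brier score of the model fitted to all the data,
# then corrected by the average bootstrap optimism
fit <- glm(y ~ x1 + x2, family = binomial, data = dat)
apparent <- brier(fit, dat)
corrected <- apparent + mean(b$t)

print(c(apparent = apparent, corrected = corrected))
```

Since a lower Brier score is better, I add the average optimism to the apparent score; for the c-value I assume it would be subtracted instead.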
Thanks
Sam