[R] How to write a loop in R to select a multiple regression model and validate it?

beginner paxkn at nottingham.ac.uk
Wed Jun 5 01:45:31 CEST 2013


I would like to run a loop in R. I have never done this before, so I would be
very grateful for your help!

1. I have a sample set: 25 objects. I would like to draw 1 object from it
and use it as a test set for my future external validation. The remaining 24
objects I would like to use as a training set (to select a model). I would
like to repeat this process until all 25 objects are used as a test set. 
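
The splitting described in point 1 is a leave-one-out loop. A minimal sketch in base R follows; the data frame `dat` and its column names are simulated purely for illustration and are not part of the original data.

```r
# Hypothetical data: 25 objects, a response Y and three predictors (simulated).
set.seed(1)
dat <- data.frame(Y = rnorm(25), X1 = rnorm(25), X2 = rnorm(25), X3 = rnorm(25))

n <- nrow(dat)
train_sizes <- integer(n)  # training-set size seen in each iteration

for (i in seq_len(n)) {
  test     <- dat[i, , drop = FALSE]   # the single held-out object
  training <- dat[-i, , drop = FALSE]  # the remaining 24 objects

  # ... model selection and validation on `training` would go here ...

  train_sizes[i] <- nrow(training)
}
```

Each object is used as the test set exactly once, and `drop = FALSE` keeps `test` a one-row data frame so it can later be passed to `predict()`.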

2. For each of the training sets I would like to run the following code:


library(leaps)
# Fit each selection method on the training set (best model of each size)
forward <- regsubsets(Y ~ ., data = training, method = "forward", nbest = 1)
backward <- regsubsets(Y ~ ., data = training, method = "backward", nbest = 1)
stepwise <- regsubsets(Y ~ ., data = training, method = "seqrep", nbest = 1)
exhaustive <- regsubsets(Y ~ ., data = training, method = "exhaustive", nbest = 1)
summary(forward)
summary(backward)
summary(stepwise)
summary(exhaustive)

I would like R to select the best model (the one with the highest adjusted
R2) for each of the selection methods, so that there are 4 final best models
(e.g. the best model selected with forward selection, the best model
selected with backward selection and so on).
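
One possible way to pick the highest adjusted R2 automatically is to read the `adjr2` and `which` components of `summary()` for a `regsubsets` fit. The sketch below assumes the leaps package is installed and that all predictors are numeric (with factors, the column names in `which` are dummy-variable names and would need extra handling). The helper `best_formula` and the simulated `training` data are hypothetical, introduced here only for illustration.

```r
library(leaps)

# Hypothetical training data (simulated); replace with your own `training`.
set.seed(1)
training <- data.frame(matrix(rnorm(24 * 5), ncol = 5))
names(training) <- c("Y", "X1", "X2", "X3", "X4")

# Return the formula of the subset with the highest adjusted R^2.
best_formula <- function(fit, response = "Y") {
  s    <- summary(fit)
  best <- which.max(s$adjr2)                  # row with the highest adjusted R^2
  vars <- colnames(s$which)[s$which[best, ]]  # terms included in that row
  vars <- setdiff(vars, "(Intercept)")        # keep predictors only
  reformulate(vars, response = response)
}

methods <- c("forward", "backward", "seqrep", "exhaustive")
best <- lapply(methods, function(m) {
  fit <- regsubsets(Y ~ ., data = training, method = m, nbest = 1)
  best_formula(fit)
})
names(best) <- methods
best
```

The result is a named list of four formulas, one per selection method, which can be passed on to the cross-validation step.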

 
Afterwards I would like to perform internal cross validation of all 4
selected models and choose 1 out of 4 which has the lowest average mean
squared error (MSE). I used to do it using the code below:

library(DAAG)
# One result object per candidate model, so earlier results are not overwritten
val.daag1 <- CVlm(df=training, m=1, form.lm=formula(Y ~ X1+X2+X3))
val.daag2 <- CVlm(df=training, m=1, form.lm=formula(Y ~ X1+X2+X4))
val.daag3 <- CVlm(df=training, m=1, form.lm=formula(Y ~ X3+X4+X5))
val.daag4 <- CVlm(df=training, m=1, form.lm=formula(Y ~ X4+X5+X7))

For the best selected model (the one with the lowest MSE) I would like to perform an
external validation on the 1 object set aside at the beginning of the
study (see point 1).
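
A rough sketch of this last step: compare the candidate models by cross-validated MSE on the training set, then predict the held-out object with the winner. To keep the example self-contained, a hand-written k-fold cross-validation in base R is swapped in for `CVlm` here. The function `cv_mse`, the choice of 3 folds, the simulated data and the candidate formulas are all assumptions made for illustration.

```r
# Hypothetical data (simulated): 25 objects; row 1 is held out as the test set.
set.seed(1)
dat <- data.frame(matrix(rnorm(25 * 6), ncol = 6))
names(dat) <- c("Y", "X1", "X2", "X3", "X4", "X5")
test     <- dat[1, , drop = FALSE]
training <- dat[-1, , drop = FALSE]

# Random fold labels, shared by all candidates so the comparison is like for like.
folds <- sample(rep(seq_len(3), length.out = nrow(training)))

# Mean squared prediction error of a linear model under k-fold cross-validation.
cv_mse <- function(formula, data, folds) {
  response <- all.vars(formula)[1]
  sq_err   <- numeric(nrow(data))
  for (f in unique(folds)) {
    held <- folds == f
    fit  <- lm(formula, data = data[!held, , drop = FALSE])
    pred <- predict(fit, newdata = data[held, , drop = FALSE])
    sq_err[held] <- (data[held, response] - pred)^2
  }
  mean(sq_err)
}

# Candidate models, e.g. the four formulas chosen by the selection methods.
candidates <- list(Y ~ X1 + X2 + X3, Y ~ X1 + X2 + X4,
                   Y ~ X3 + X4 + X5, Y ~ X1 + X4 + X5)

mse    <- vapply(candidates, cv_mse, numeric(1), data = training, folds = folds)
winner <- candidates[[which.min(mse)]]  # candidate with the lowest CV MSE

# External validation: refit on all 24 training objects, predict the held-out one.
final     <- lm(winner, data = training)
ext_error <- (test$Y - predict(final, newdata = test))^2
```

Inside the leave-one-out loop, `ext_error` would be stored for each of the 25 held-out objects and averaged at the end to give an external estimate of prediction error.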

3. Then loop again, using a different training and test set each time.


I hope that you could help me with this. 

If you have any suggestions how to select the best model and perform
validation more efficiently, I would be happy to hear about that.

Thank you!


