[R] Logistic Regression - Variable Selection Methods With Prediction
RAJ
dheerajathreya at gmail.com
Wed Oct 26 01:54:17 CEST 2011
Hello,
I am pretty new to R; I have always used SAS and SAS products. My
target variable is binary ('Y' and 'N') and I have about 14 predictor
variables. My goal is to compare different variable selection methods
such as forward, backward, and all possible subsets. I am using the
misclassification rate to pick the winning method.
This is what I have as of now:
Reg <- glm (Graduation ~., DFtrain,family=binomial(link="logit"))
step <- extractAIC(Reg, direction="forward")
pred <- predict(Reg, DFtest,type="response")
mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
This program runs, but I need to check that I am doing this right.
Also, I am getting the same misclassification rate for all the
different methods.
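A possible explanation for the identical rates, offered as a sketch
rather than a definitive diagnosis: extractAIC() only reports the
equivalent degrees of freedom and the AIC of a fitted model, and the
direction argument is silently absorbed by its "..." parameter, so no
selection takes place. In addition, predict() is called on Reg, the full
model, every time. The example below uses step() from the stats package
instead and predicts from the selected model. The simulated data frame,
the column names x1 to x5, and the helper misclass() are assumptions made
up for illustration; they are not from the original data.

```r
set.seed(1)
n <- 400

# Hypothetical stand-in data: x1 and x2 drive the outcome, x3 to x5 are noise
DF <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                 x4 = rnorm(n), x5 = rnorm(n))
p <- plogis(1.5 * DF$x1 - 1.0 * DF$x2)
DF$Graduation <- factor(ifelse(runif(n) < p, "Y", "N"), levels = c("N", "Y"))

# Simple train/test split: first 300 rows for training, the rest for testing
DFtrain <- DF[1:300, ]
DFtest  <- DF[301:n, ]

# Full model (all predictors) and null model (intercept only)
full <- glm(Graduation ~ ., data = DFtrain, family = binomial(link = "logit"))
null <- glm(Graduation ~ 1, data = DFtrain, family = binomial(link = "logit"))

# Forward selection starts from the null model and may add terms up to the full model
fwd <- step(null, scope = list(lower = ~1, upper = formula(full)),
            direction = "forward", trace = 0)

# Backward elimination starts from the full model and drops terms
bwd <- step(full, direction = "backward", trace = 0)

# Hypothetical helper: misclassification rate at a 0.5 cutoff
misclass <- function(model, newdata) {
  pred <- predict(model, newdata, type = "response")
  mean((pred > 0.5) != (newdata$Graduation == "Y"))
}

# Predict from each selected model, not from the full model each time
rates <- c(full     = misclass(full, DFtest),
           forward  = misclass(fwd,  DFtest),
           backward = misclass(bwd,  DFtest))
print(rates)
```

The rates can still coincide if the methods happen to select the same
variables, but each one now comes from its own fitted model.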
I also tried to use
Reg <- leaps(Graduation ~., DFtrain)
pred <- predict(Reg, DFtest,type="response")
mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
#print(summary(mis))
which doesn't work,
and
Reg <- regsubsets(Graduation ~., DFtrain)
pred <- predict(Reg, DFtest,type="response")
mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
#print(summary(mis))
regsubsets runs, but the 'predict' function does not work with it. Is
there any other way to do predictions when using regsubsets?
Any help is appreciated.
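One workaround, sketched under assumptions rather than given as the
standard method: use regsubsets() from the leaps package only to choose
the variables, then refit a logistic regression with glm() on the chosen
subset and call predict() on that refit. Note that regsubsets() fits
least-squares models, so the response is recoded to 0/1 here, and the
subsets are ranked on a linear-regression criterion (BIC in this sketch).
The simulated data and the names x1 to x5, lsdata, and chosen are
hypothetical. With factor predictors, the column names reported by
regsubsets() are dummy-variable names and would need to be mapped back to
the original variables.

```r
library(leaps)  # provides regsubsets(); install.packages("leaps") if missing

set.seed(1)
n <- 400

# Hypothetical stand-in data: x1 and x2 drive the outcome, x3 to x5 are noise
DF <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                 x4 = rnorm(n), x5 = rnorm(n))
p <- plogis(1.5 * DF$x1 - 1.0 * DF$x2)
DF$Graduation <- factor(ifelse(runif(n) < p, "Y", "N"), levels = c("N", "Y"))
DFtrain <- DF[1:300, ]
DFtest  <- DF[301:n, ]

# Build a least-squares data set: predictors plus a numeric 0/1 response
predictors <- setdiff(names(DFtrain), "Graduation")
lsdata <- DFtrain[predictors]
lsdata$y01 <- as.numeric(DFtrain$Graduation == "Y")

# Exhaustive search over subsets of every size up to all predictors
subs <- regsubsets(y01 ~ ., data = lsdata, nvmax = length(predictors))
ss <- summary(subs)

# Pick the subset size with the lowest BIC and read off its variable names
best <- which.min(ss$bic)
chosen <- names(which(ss$which[best, -1]))  # -1 drops the intercept column

# Refit as a logistic regression on the chosen variables, then predict
refit <- glm(reformulate(chosen, response = "Graduation"),
             data = DFtrain, family = binomial(link = "logit"))
pred <- predict(refit, DFtest, type = "response")
mis <- mean((pred > 0.5) != (DFtest$Graduation == "Y"))
print(chosen)
print(mis)
```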
Thanks,