[R] Cross-validation for logistic regression with lasso2
francogrex
francogrex at mail.com
Fri May 18 13:44:35 CEST 2007
Hello, I am trying to shrink the coefficients of a logistic regression for a
sparse dataset. I am using the lasso (the lasso2 package) and trying to
determine the shrinkage factor by cross-validation. Could some of the
experts here please tell me whether I am doing this correctly? Below are my
dataset and the functions I use.
w=
a b c d e P A
0 0 0 0 0 1 879
1 0 0 0 0 1 3
0 1 0 0 0 7 7
0 0 1 0 0 230 2
0 0 0 1 0 450 7
0 0 0 0 1 4
#The GLM output shows that the coefficients c and d are larger than 10:
resp=cbind(w$P,w$A)
summary(glm(resp~a+b+c+d+e,data=w,family=binomial))
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -6.779      1.001  -6.775 1.24e-11 ***
a              5.680      1.528   3.718 0.000201 ***
b              6.779      1.134   5.976 2.29e-09 ***
c             11.524      1.227   9.392  < 2e-16 ***
d             10.942      1.071  10.220  < 2e-16 ***
e              3.688      1.124   3.282 0.001031 **
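A side note on where these estimates come from: each of a to e is switched on in exactly one row, so the model has as many parameters as rows (it is saturated), and the fitted coefficients are just differences of empirical log-odds. The sketch below checks that by hand in base R. It uses only the five rows that are complete in the table above; the row for e is cut off in this post and is deliberately left out. The vector names (P, A, log_odds, and so on) are mine, chosen for illustration.

```r
# Counts copied from the complete rows of the table: baseline, a, b, c, d.
# The row for e is truncated in the post, so it is not used here.
P <- c(base = 1,   a = 1, b = 7, c = 230, d = 450)
A <- c(base = 879, a = 3, b = 7, c = 2,   d = 7)

# Empirical log-odds of P versus A in each row
log_odds <- log(P / A)

# Saturated model: intercept = baseline log-odds,
# each slope = that row's log-odds minus the baseline log-odds
intercept <- log_odds[["base"]]
slopes    <- log_odds[c("a", "b", "c", "d")] - intercept

# Wald standard errors: square root of the summed reciprocal counts
base_var     <- 1 / P[["base"]] + 1 / A[["base"]]
se_intercept <- sqrt(base_var)
se_slopes    <- sqrt(1 / P[-1] + 1 / A[-1] + base_var)

print(round(c("(Intercept)" = intercept, slopes), 3))
print(round(c("(Intercept)" = se_intercept, se_slopes), 3))
```

The rounded values should agree with the Estimate and Std. Error columns for the intercept and for a to d. The large estimates for c and d therefore reflect the very small A counts in those rows (2 and 7) relative to the baseline row.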
#so I wrote the loop below, using the lasso2 package, to find the best
#shrinkage factor (bound) by generalized cross-validation (gcv):
library(lasso2)
for (i in seq(1,40,1)) {
  # fit the L1-constrained logistic model with bound i, then print its gcv
  glmba=gl1ce(resp~a+b+c+d+e, data = w, family = binomial(), bound=i)
  ecco=round(gcv(glmba,type="Tibshirani",gen.inverse.diag =1e11),digits=3)
  print(ecco)
}
#and it gives me the lowest gcv at bound = 21.
#then I determine the shrunken coefficients:
> gl1ce(resp ~ a + b + c + d + e, data = w, family = binomial(), bound = 21)
Coefficients:
(Intercept)           a           b           c           d           e
  -4.749816    2.776215    4.342661    8.956583    8.661593    1.264660
Family:
Family: binomial
Link function: logit
The absolute L1 bound was : 21
The Lagrangian for the bound is : 1.843283
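To see how much the bound of 21 actually shrinks things, here is a small side-by-side of the two sets of slope coefficients, typed in from the glm and gl1ce outputs printed in this post (nothing is refitted). The object names are only illustrative.

```r
# Slope estimates copied from the glm summary (unshrunk)
unshrunk <- c(a = 5.680, b = 6.779, c = 11.524, d = 10.942, e = 3.688)
# Slope estimates copied from the gl1ce fit with bound = 21 (shrunk)
shrunk   <- c(a = 2.776215, b = 4.342661, c = 8.956583,
              d = 8.661593, e = 1.264660)

comparison <- data.frame(
  glm       = unshrunk,
  gl1ce     = shrunk,
  reduction = unshrunk - shrunk,          # absolute shrinkage per coefficient
  ratio     = round(shrunk / unshrunk, 3) # shrunk value as a fraction of the glm value
)
print(comparison)

# L1 norms of the slopes (intercept excluded) on the original scale.
# These need not equal the bound if gl1ce applies the bound on a
# standardized scale; that is an assumption to check in the lasso2 docs.
l1_glm   <- sum(abs(unshrunk))
l1_gl1ce <- sum(abs(shrunk))
print(c(glm = l1_glm, gl1ce = l1_gl1ce))
```

Every slope moves toward zero by a similar absolute amount, so the smaller coefficients (a and e) lose the largest share of their size.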
Thanks