[R] glmnet with binary logistic regression

fongchun fongchunchan at gmail.com
Sat Jul 23 01:51:32 CEST 2011

Hi all,

I am using the glmnet R package to run LASSO with binary logistic
regression.  I have over 290 samples with outcome data (0 for alive, 1 for
dead) and over 230 predictor variables.  I am currently using LASSO to
reduce the number of predictor variables.

I am using the cv.glmnet function to do 10-fold cross validation on a
sequence of lambda values which I let glmnet determine.  I then take the
optimal lambda value (lambda.1se) and use it to predict on an
independent cohort.

What I am finding is that this optimal lambda value fluctuates every time
I run glmnet with LASSO.  It deviates enough that each time I generate an
ROC curve for my validation cohort, I get somewhat different AUC values.
Does anyone know why there is such a fluctuation in the generation of an
optimal lambda?  I am thinking it might be due to the 10-fold cross
validation step: the training set may not be split in a way that gives
each fold enough alive and dead cases?  Thoughts?
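(For anyone hitting the same issue: cv.glmnet draws new random folds on
every call, which is what makes lambda.1se change between runs.  Passing a
fixed foldid makes the result reproducible, and building the fold labels
separately within each class keeps the alive/dead ratio similar in every
fold.  A minimal sketch, assuming x is the numeric predictor matrix and y
is the 0/1 outcome vector described above:)

```r
library(glmnet)

set.seed(42)                    # make the fold assignment reproducible
n <- length(y)
nfolds <- 10

# Stratified fold labels: assign folds within each outcome class so
# every fold has roughly the same proportion of alive and dead cases.
foldid <- integer(n)
for (cl in unique(y)) {
  idx <- which(y == cl)
  foldid[idx] <- sample(rep(seq_len(nfolds), length.out = length(idx)))
}

# alpha = 1 is the LASSO penalty; foldid fixes the folds, so
# lambda.1se is now identical across reruns.
cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1, foldid = foldid)
cvfit$lambda.1se
```

(A complementary check is to rerun cv.glmnet with several different seeds
and look at the spread of lambda.1se: if it is wide, the remaining
variability comes from the data itself rather than the fold split.)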

View this message in context: http://r.789695.n4.nabble.com/glmnet-with-binary-logistic-regression-tp3688126p3688126.html
