[R] glmnet

Wed Aug 10 09:00:59 CEST 2011

Hi Andra.

I wonder how you came to be using LASSO without knowing what lambda
is. I'd advise you to read up on it. The help page (?glmnet) lists
several paper references, but for a gentler introduction you can read
The Elements of Statistical Learning:
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

In a nutshell, though: lambda is the tuning parameter that sets the weight
given to the penalty on the size of the coefficients. The bigger lambda is,
the more 'pressure' there is on the coefficients to be small (or better yet:
to become exactly zero, so that the variable drops out of the model).
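
A minimal base-R sketch of that 'pressure', under a simplifying assumption:
with a single standardized predictor (or orthonormal columns), the lasso
estimate is the unpenalized estimate soft-thresholded by lambda. The function
name soft_threshold and the numbers are made up for illustration; this is not
how glmnet itself is called.

```r
# Soft-thresholding: move each coefficient towards zero by lambda,
# and set it to exactly zero once it lies within lambda of zero.
soft_threshold <- function(beta, lambda) {
  sign(beta) * pmax(abs(beta) - lambda, 0)
}

beta_unpenalized <- c(3, -1.5, 0.4)    # hypothetical unpenalized estimates

soft_threshold(beta_unpenalized, 0)    # 3.0 -1.5  0.4  (no penalty)
soft_threshold(beta_unpenalized, 0.5)  # 2.5 -1.0  0.0  (the smallest one disappears)
soft_threshold(beta_unpenalized, 2)    # 1.0  0.0  0.0  (only the largest survives)
```
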
The way you use LASSO is: you fit the model over a reasonable set of lambda
values (glmnet does this for you: the 100 values you see are its default
sequence), calculate some measure of success for each lambda value (e.g.
misclassification rate, AUC, ...), generally by cross-validation (as
provided by cv.glmnet: read its help).
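
As a rough sketch of those two steps, assuming a two-class outcome and
simulated data (the dimensions, the seed and the 'true' coefficients below are
invented for illustration; substitute your own x and y, and the family that
matches your outcome):

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 400                       # many more columns than rows
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("V", seq_len(p))
# Hypothetical truth: only the first three columns carry signal.
eta <- 2 * x[, 1] - 2 * x[, 2] + 1.5 * x[, 3]
y <- factor(rbinom(n, 1, plogis(eta)))

# Step 1: fit the whole lasso path (alpha = 1 is the default).
fit <- glmnet(x, y, family = "binomial")
length(fit$lambda)   # at most 100 (the default nlambda); the path may stop early
head(fit$lambda)     # the sequence runs from large to small lambda

# Step 2: cross-validate a measure of success for each lambda, here the AUC.
cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "auc", nfolds = 5)
head(data.frame(lambda = cvfit$lambda, cv_auc = cvfit$cvm, se = cvfit$cvsd))
```

plot(cvfit) draws the same information as a curve with error bars.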

Having this measure of success (say the AUC) for each lambda in your
set allows you to pick the best one (lambda.min) or, to avoid chance
peaks, a more conservative and parsimonious one (lambda.1se: the largest
lambda whose performance is within one standard error of the best),
after which you can rerun your lasso with the selected lambda on the full
dataset to find the variables in your model.
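
A sketch of this step, again on simulated two-class data (all data and the
object names cvfit, b and selected are illustrative assumptions). Note that
cv.glmnet already keeps a fit on the full dataset, so coef() and predict() on
its result can be used directly with the chosen lambda. The last lines also
show one way to compare predicted with observed values:

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 400
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("V", seq_len(p))
eta <- 2 * x[, 1] - 2 * x[, 2] + 1.5 * x[, 3]   # hypothetical truth
y <- factor(rbinom(n, 1, plogis(eta)))

cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "auc", nfolds = 5)

cvfit$lambda.min   # lambda with the best cross-validated AUC
cvfit$lambda.1se   # larger (more parsimonious) lambda within one SE of the best

# Coefficients at the chosen lambda; most of them are exactly zero.
b <- as.matrix(coef(cvfit, s = "lambda.1se"))
selected <- setdiff(rownames(b)[b[, 1] != 0], "(Intercept)")
selected

# Predicted classes against observed classes (on the training data,
# so this is an optimistic picture).
pred <- as.vector(predict(cvfit, newx = x, s = "lambda.1se", type = "class"))
table(predicted = pred, observed = y)
```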

Finally, because the penalty shrinks the lasso coefficients towards zero
(a downward bias), you could run an ordinary, unpenalized glm with only the
variables selected in the previous step.
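
One possible way to do that refit, sketched on simulated two-class data (the
data, the object names and the choice of lambda.1se are assumptions for
illustration only):

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 400
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("V", seq_len(p))
eta <- 2 * x[, 1] - 2 * x[, 2] + 1.5 * x[, 3]   # hypothetical truth
y <- factor(rbinom(n, 1, plogis(eta)))

cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "auc", nfolds = 5)

# Variables with a non-zero lasso coefficient at the chosen lambda.
b <- as.matrix(coef(cvfit, s = "lambda.1se"))
selected <- setdiff(rownames(b)[b[, 1] != 0], "(Intercept)")
if (length(selected) == 0) stop("no variables selected at this lambda")

# Ordinary (unpenalized) logistic regression on the selected columns only.
dat <- data.frame(y = y, x[, selected, drop = FALSE])
refit <- glm(y ~ ., data = dat, family = binomial)

# Shrunken lasso coefficients next to the refitted ones.
cbind(lasso = b[c("(Intercept)", selected), 1], glm = coef(refit))
```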

Good luck!

Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Andra Isan
> Sent: Wednesday 10 August 2011 5:59
> To: r-help at r-project.org
> Subject: [R] glmnet
> 
> Hi All,
> I have been trying to use the glmnet package to do LASSO linear regression.
> My x data is a matrix n_row by n_col and y is a vector of size n_row
> corresponding to the vector data. The number of n_col is much
> larger than the number of n_row. I do the following:
> fits = glmnet(x, y, family="multinomial")
> I have been following this
> article: http://cran.r-project.org/web/packages/glmnet/glmnet.pdf page
> 8, but there are some unclear parts that I don't understand. The lambda
> variable only returns 100 and I don't know exactly what lambda
> represents. So, basically I would like to know how to get the
> coefficient weights and what exactly lambda is. How can I see the
> difference between predicted values and observed values?
> If there is some sample code that helps me to understand how to use these,
> that would be great.
> Thanks a lot,
> Andra