[R] glmpath in R

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Apr 7 16:48:25 CEST 2010


Comments inline:

> Thanks very much for your reply. My main objective in building the model is to
> determine the relative strength of the variables in predicting my
> presence/absence data. It's really an exploratory method, I'm interested in
> whether the associations that have been observed out in the field come out in
> the model. I'm also using rpart to build a classification tree to get a sense of
> any interactions.
> I was planning to use cross-validation to identify a value of lambda that gives
> minimum mean cv error and the largest value of lambda such that error is within
> 1 SE of the minimum.


> I'm not entirely sure how to proceed in building the full
> model using this value of lambda. At this point do I simply use predict.glmpath
> (or predict.glmnet) setting the value of "s" to lambda and return the
> coefficients?

That's what I was suggesting. When I said "full model" I meant that
I'd fit the model on all of my data using the lambda I determined
during the CV (i.e., no more holding out data).
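
Something along these lines is roughly what I have in mind. It's only a
sketch: it uses glmnet rather than glmpath, the data are simulated, and
names like `x`, `y`, `cv_fit` and `full_fit` are placeholders of mine, not
anything from your analysis.

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 10
x <- matrix(rnorm(n * p), n, p)
# Simulated presence/absence response (assumption: only the first two
# columns carry signal); substitute your own predictors and response.
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - 1.0 * x[, 2]))

# Cross-validate over the lambda path
cv_fit <- cv.glmnet(x, y, family = "binomial", nfolds = 10)

lambda_min <- cv_fit$lambda.min  # lambda with minimum mean CV error
lambda_1se <- cv_fit$lambda.1se  # largest lambda within 1 SE of that minimum

# Fit on all of the data, then read off coefficients at the chosen lambda
full_fit <- glmnet(x, y, family = "binomial")
coef_1se <- coef(full_fit, s = lambda_1se)
print(coef_1se)
```

Variables whose coefficient is exactly zero at that lambda have been
dropped from the model; the nonzero ones are the ones the penalty kept.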

> I plan to validate the chosen coefficient estimates through a
> bootstrap analysis.


> Beyond conducting this "smoke test", I'm wondering how I should assess the
> resulting model. Can I assess the fit and predictive accuracy of a glmnet object?

I'm not sure I follow the question. You would assess its predictive
accuracy the same way you presumably did during the CV, i.e., by calling
`predict(model, ... s=YOUR_LAMBDA)`.

For the estimate, you could use some data that was completely held out
during your CV, so that your model has never seen it before, or you
could average the accuracy across the folds.
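
As a rough illustration of both options, again with glmnet on simulated
data (the 200/100 split, the misclassification measure, and all variable
names are assumptions of mine for the example):

```r
library(glmnet)

set.seed(2)
n <- 300; p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - 1.0 * x[, 2]))

# Set aside data the model never sees during CV
test_idx <- sample(n, 100)
x_train <- x[-test_idx, ]; y_train <- y[-test_idx]
x_test  <- x[test_idx, ];  y_test  <- y[test_idx]

# CV on the training portion only, scored by misclassification rate
cv_fit <- cv.glmnet(x_train, y_train, family = "binomial",
                    type.measure = "class")

# Option 1: error on the completely held-out data at the chosen lambda
pred_class <- predict(cv_fit, newx = x_test, s = "lambda.1se", type = "class")
test_error <- mean(as.numeric(pred_class) != y_test)

# Option 2: mean error across the CV folds at the same lambda
cv_error <- cv_fit$cvm[cv_fit$lambda == cv_fit$lambda.1se]

cat("held-out error:", test_error, "\n")
cat("mean CV error: ", cv_error, "\n")
```

The two numbers will generally differ somewhat, and both are estimates
that depend on how the data happened to be split.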

I mean, you never get an absolute answer to how accurate your model is, right?

> Thanks again for your help. I am also planning on discussing my work with a
> professor in statistics. I appreciate the insight though as I attempt to wrap my
> head around these methods.

Great. Please let us know what you come up with ... perhaps others
will find it helpful, too.


Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
