[R] Behaviour of dfmax in glmnet
@bh|@hek@gho@e@82 @end|ng |rom gm@||@com
Wed Feb 27 23:56:11 CET 2019
I am new to <i>glmnet</i>, so I do not yet fully understand what the various
parameters do. I am trying to build a multinomial classifier that restricts
the number of features used in the model. From reading the docs and some
answers on this forum, I understand <i>dfmax</i> is the way to do it. I played
around with it a bit; I have a couple of questions and would appreciate some
help.
For a particular dataset, I want to restrict the number of features to 3;
the original data has 126 features. Here's what I run:
fit<-glmnet(data.matrix(X), data.matrix(y), family='multinomial', dfmax=3)
This is the value of <i>d</i> (inserting a screenshot since the table gets
disturbed by the formatting):
My questions about the output:
 1. I see multiple values of <i>lambda</i> in there; it looks like glmnet tries
to fit lambdas that get the number of terms close to dfmax=3. So it's less
like the LARS algorithm (in the sense that we don't move stagewise by adding
variables) and more about getting the right lambdas for regularization that
lead to the intended dfmax. Is this right?
 2. I'm guessing alpha plays a role in how close we can get to dfmax. At
alpha=1 we're doing lasso, so it's easier to get close to dfmax, compared to
alpha=0, where we're doing ridge. Is this understanding correct?
 3. A "neighborhood" of dfmax is the best we can do, it'd seem. Or am I
missing a parameter that gets me to the model with the exact dfmax? (FYI:
alpha=1 doesn't seem to get me to the precise number of nonzero terms
either, at least on this dataset.)
 4. What does pmax do?
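
For anyone who wants to look at the same kind of output as text rather than
a screenshot, here is a minimal sketch on simulated data. The data, the seed,
and the names score, n_active, path and s_pick are made up for illustration;
only the 126 features, family='multinomial' and dfmax=3 come from the call
above. It lists, for each lambda on the path, the df that glmnet reports next
to a direct count of features with a nonzero coefficient in any class, and
then picks the least-regularized lambda that still uses at most 3 features.

```r
library(glmnet)

set.seed(1)
n <- 300
p <- 126                               # same number of features as in the question
X <- matrix(rnorm(n * p), n, p)        # simulated predictors, NOT the real data

# Hypothetical 3-class response driven by the first three columns plus noise
score <- X[, 1:3] + matrix(rnorm(n * 3, sd = 0.5), n, 3)
y <- factor(max.col(score))

fit <- glmnet(X, y, family = "multinomial", alpha = 1, dfmax = 3)

# fit$beta is a list with one (features x lambdas) sparse matrix per class.
# A feature counts as active at a given lambda if it is nonzero in any class.
active <- Reduce(`|`, lapply(fit$beta, function(b) as.matrix(b) != 0))
n_active <- colSums(active)

# One row per lambda: df as reported by glmnet, and the direct count
path <- data.frame(lambda = fit$lambda, df = fit$df, n_active = n_active)
print(path)

# Smallest lambda on the path whose model uses at most 3 features
s_pick <- min(fit$lambda[n_active <= 3])
coef(fit, s = s_pick)
```

The path is computed over a grid of decreasing lambda values, so the feature
count can jump past 3 between two neighbouring lambdas; that is presumably why
an exact match is not guaranteed. Rerunning the sketch with a different alpha
or a larger nlambda is one way to see how the achievable counts change.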
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 54147 bytes
Desc: not available