[R-pkgs] New glmnet package on CRAN

Trevor Hastie hastie at stanford.edu
Mon Jun 2 20:08:16 CEST 2008


glmnet is a package that fits the regularization path for linear,
two-class logistic, and multi-class logistic (multinomial) regression
models with "elastic net" regularization (a tunable mixture of L1 and
L2 penalties). glmnet uses pathwise coordinate descent, and is very
fast.
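
To illustrate what "pathwise coordinate descent" means, here is a
minimal sketch in plain Python. It is not the package's code. It
assumes a linear model without intercept, a squared-error loss scaled
by 1/(2N), and the penalty lam*((1-a)*||beta||_2^2 + a*||beta||_1);
the function names and the toy data are made up for the illustration.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return 0.0 if |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def coordinate_descent(X, y, lam, a, beta, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent for one value of lam, updating beta in place.

    Assumed objective (an illustration, not taken from the package):
        (1/(2N)) * sum_i (y_i - x_i . beta)^2
        + lam * ((1 - a) * sum_j beta_j^2 + a * sum_j |beta_j|)
    """
    n, p = len(X), len(X[0])
    # Residuals for the starting coefficients (the warm start).
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            # Correlation of column j with the partial residual
            # (the residual with variable j's own contribution added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            denom = z + 2.0 * lam * (1.0 - a)
            new = soft_threshold(rho, lam * a) / denom if denom > 0 else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def elastic_net_path(X, y, lambdas, a):
    """Solve for a decreasing sequence of lam values, warm-starting each fit."""
    beta = [0.0] * len(X[0])
    path = []
    for lam in lambdas:
        beta = coordinate_descent(X, y, lam, a, beta)
        path.append(list(beta))
    return path


# Toy data with two orthogonal predictors (hypothetical values).
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 1.0, -1.0]
path = elastic_net_path(X, y, lambdas=[1.0, 0.75, 0.25], a=1.0)
```

Each solution along the path is used as the starting point for the
next, smaller value of lam, so each fit typically needs only a few
sweeps over the variables.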

Some of the features of glmnet:

* by default it computes the path at 100 uniformly spaced (on the log 
scale) values of the regularization parameter
* glmnet appears to be faster than any of the packages that are freely 
available, in some cases by two orders of magnitude.
* recognizes and exploits sparse input matrices (as in the Matrix
package). Coefficient matrices are output in sparse matrix
representation.
* penalty is (1-a)*||beta||_2^2 + a*||beta||_1, where a is between 0
and 1; a=1 is the lasso penalty, a=0 is the ridge penalty.
   For many correlated predictors, a=.95 or thereabouts improves the
performance of the lasso.
* convenient predict, plot, print, and coef methods
* variable-wise penalty modulation allows each variable to be penalized
by a scalable amount; if that amount is zero, the variable always
enters the model
* glmnet uses a symmetric parametrization for multinomial, with 
constraints enforced by the penalization.
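
Two of the features listed above, the log-spaced grid of regularization
values and the elastic net penalty with per-variable scaling, can be
sketched in a few lines of plain Python. This is only an illustration:
the function names, the grid endpoints, and the assumption that a
per-variable factor multiplies that variable's whole penalty term are
not taken from the package.

```python
import math


def elastic_net_penalty(beta, a, penalty_factor=None):
    """Penalty (1-a)*||beta||_2^2 + a*||beta||_1 with optional per-variable scaling.

    penalty_factor is a hypothetical list of non-negative weights, one per
    coefficient; a weight of 0 leaves that coefficient unpenalized.
    """
    if penalty_factor is None:
        penalty_factor = [1.0] * len(beta)
    return sum(w * ((1.0 - a) * b * b + a * abs(b))
               for w, b in zip(penalty_factor, beta))


def log_spaced_lambdas(lam_max, lam_min, n=100):
    """n values from lam_max down to lam_min, evenly spaced on the log scale.

    The endpoints are supplied by the caller here; how they should be
    chosen is not covered by this sketch.
    """
    step = (math.log(lam_min) - math.log(lam_max)) / (n - 1)
    return [math.exp(math.log(lam_max) + k * step) for k in range(n)]


beta = [3.0, -4.0]
l1_only = elastic_net_penalty(beta, a=1.0)      # only the L1 term remains
l2_only = elastic_net_penalty(beta, a=0.0)      # only the squared L2 term remains
mostly_l1 = elastic_net_penalty(beta, a=0.95)   # mixture close to the L1 end
free_first = elastic_net_penalty(beta, a=1.0, penalty_factor=[0.0, 1.0])

grid = log_spaced_lambdas(lam_max=1.0, lam_min=0.001, n=100)
```

Because the grid is uniform on the log scale, the ratio between
consecutive values is constant, so the grid is denser in absolute
terms near the small end.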

Other families, such as Poisson, might appear in later versions of glmnet.

Examples of glmnet speed trials:
 
Newsgroup data: N=11,000, p=4 million, two-class logistic, 100 values
along lasso path. Time = 2 mins
14-class cancer data: N=144, p=16K, 14-class multinomial, 100 values
along lasso path. Time = 30 secs

Authors: Jerome Friedman, Trevor Hastie, Rob Tibshirani.

See our paper http://www-stat.stanford.edu/~hastie/Papers/glmnet.pdf
for implementation details and comparisons with other related software.

-- 
--------------------------------------------------------------------
  Trevor Hastie                                  hastie at stanford.edu
  Professor & Chair, Department of Statistics, Stanford University
  Phone: (650) 725-2231 (Statistics)	         Fax: (650) 725-8977
	 (650) 498-5233 (Biostatistics)		 Fax: (650) 725-6951
  URL: http://www-stat.stanford.edu/~hastie
  address: room 104, Department of Statistics, Sequoia Hall
	          390 Serra Mall, Stanford University, CA 94305-4065
