[R] Bootstrap Methods for Confidence Intervals -- glmnet

Thu May 12 16:49:52 CEST 2016

Dear All,
Please have a look at the code at the end of the email.
It is just an example of regression based on glmnet with some
artificial data.
My question is how I can evaluate the uncertainty of the prediction
yhat.

It looks like there are some reasons for not providing a standard
error estimate, see e.g.

http://stackoverflow.com/questions/12937331/how-to-get-statistical-summary-information-from-glmnet-model
and
https://www.reddit.com/r/statistics/comments/1vg8k0/standard_errors_in_glmnet/

However, from what I read in this thesis

https://air.unimi.it/retrieve/handle/2434/153099/133417/phd_unimi_R07738.pdf

(see sections 3.2 and 3.3)

and in the quoted papers

http://www.stat.cmu.edu/~fienberg/Statistics36-756/Efron1979.pdf
and
http://www.ams.org/journals/proc/2010-138-12/S0002-9939-2010-10474-4/S0002-9939-2010-10474-4.pdf

there are some bootstrap methods that are quite general and applicable
well beyond the case of glmnet.
Is there anything already implemented to help me out? Is anybody aware
of this?
Cheers

Lorenzo

#########################################################################
#########################################################################
#########################################################################
#########################################################################
#########################################################################

library(glmnet)

# Generate data
set.seed(19875)  # Set seed for reproducibility
n <- 1000  # Number of observations
p <- 5000  # Number of predictors included in model
real_p <- 15  # Number of true predictors
x <- matrix(rnorm(n*p), nrow=n, ncol=p)
y <- apply(x[,1:real_p], 1, sum) + rnorm(n)

# Split data into train (2/3) and test (1/3) sets
train_rows <- sample(1:n, .66*n)
x.train <- x[train_rows, ]
x.test <- x[-train_rows, ]

y.train <- y[train_rows]
y.test <- y[-train_rows]

fit.elnet <- glmnet(x.train, y.train, family="gaussian", alpha=.5)

yhat <- predict(fit.elnet, s=fit.elnet$lambda, newx=x.test)