[R] A question on glmnet analysis

khosoda at med.kobe-u.ac.jp khosoda at med.kobe-u.ac.jp
Mon Mar 28 14:58:13 CEST 2011


(11/03/27 22:49), KH wrote:
> (11/03/25 22:40), Nick Sabbe wrote:
>
>> 2. Which model, I mean lasso or elastic net, should be selected? and
>> why? Both models chose the same variables but different coefficient values.
>> You may want to read 'the elements of statistical learning' to find some
>> info on the advantages of ridge/lasso/elnet compared. Lasso should work fine
>> in this relatively low-dimensional setting, although it depends on the
>> correlation structure of your covariates.

I should have used vif() from the car package, since this is a logistic model.

library(car)
test3 <- glm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 +
                 x11 + x12 + x13 + x14 + x15,
             family = "binomial", data = MyData)
vif(test3)
      x1       x2       x3       x4       x5       x6       x7       x8
1.339349 1.477299 1.292232 1.309631 1.375251 1.192694 1.763012 2.358474
      x9      x10      x11      x12      x13      x14      x15
1.755591 1.281404 1.229909 1.353517 1.304637 1.486188 1.428996

Anyway, the largest VIF is about 2.36 (x8), so multicollinearity is unlikely to be a problem.
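
For reference, here is a minimal sketch of what a VIF measures, using base R only. It runs on simulated data, not on MyData, and the names sim_x and vif_by_hand are made up for illustration. It uses the classical linear-model definition, 1 / (1 - R_j^2). As far as I understand, vif() in car works from the fitted coefficient covariance matrix for a glm, so its numbers need not match this definition exactly for a logistic model.

```r
# Simulated stand-in for the real covariates (MyData is not reproduced here).
set.seed(1)
n <- 200
sim_x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim_x$x3 <- 0.8 * sim_x$x1 + 0.6 * rnorm(n)  # x3 is deliberately correlated with x1

# Hypothetical helper: classical VIF_j = 1 / (1 - R_j^2), where R_j^2 comes
# from regressing covariate j on all of the other covariates.
vif_by_hand <- function(X) {
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    fit <- lm(reformulate(others, response = v), data = X)
    1 / (1 - summary(fit)$r.squared)
  })
}

vifs <- vif_by_hand(sim_x)
print(round(vifs, 3))
```

In this toy example x1 and x3 should show clearly larger values than x2, which is independent of both. The same numbers can be read off the diagonal of solve(cor(sim_x)).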

KH

> I also checked the correlation structure of my covariates.
>
> test<- lm(y ~ x15std)
> library(DAAG)
> vif(test)
>  x15std1  x15std2  x15std3  x15std4  x15std5  x15std6  x15std7  x15std8
>   1.2299   1.2880   1.1011   1.1559   1.3033   1.0774   1.5369   1.9604
>  x15std9 x15std10 x15std11 x15std12 x15std13 x15std14 x15std15
>   1.4664   1.1754   1.1396   1.2683   1.1685   1.1667   1.5534
>
> The variance inflation factors are all less than 5, suggesting that
> multicollinearity is unlikely to be a problem.
>
> Therefore, should the lasso model be selected?
>
> Thanks a lot in advance,
>
> KH


