[R] glmmLasso with interactions errors

Ben Bolker bbolker at gmail.com
Fri Jul 15 16:23:51 CEST 2016


Cade, Brian <cadeb <at> usgs.gov> writes:

> 
> It has never been obvious to me that the lasso approach can handle
> interactions among predictor variables well at all. I'd be curious to
> see what others think and what you learn.
> 
> Brian
> 

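  For what it's worth, I think the lasso *does* handle interactions
reasonably well (although I forget where I read that). There is also a
newer "hierarchical lasso" that tries to deal with marginality
concerns more carefully, i.e. it avoids keeping an interaction in the
model when the corresponding main effects have been dropped.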

  A related question was asked on StackOverflow:

http://stackoverflow.com/questions/37910042/glmmlasso-warning-messages/37922918#37922918

My answer (in the comments) there was:

  My guess is that you're going to have to build your own model
  matrix/dummy variables; I think that as.factor() in formulas is
  treated specially by glmmLasso, so including the interaction term
  will probably just confuse it. (It would be worth trying
  as.factor(Novelty:ROI). I doubt it'll work, but if it does it would
  be the easiest way forward.)
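
To make the "build your own model matrix" suggestion a bit more
concrete, here is a minimal sketch using base R's model.matrix(). The
data frame below is made up (only the column names are borrowed from
the original post), and the glmmLasso() call at the end is left as a
comment because I have not tested it against the real data; treat it
as an assumption about how the pieces would fit together, not a
verified fix.

```r
# Toy data standing in for the poster's data set (all values invented)
set.seed(1)
dat <- data.frame(
  Subject  = factor(rep(1:6, each = 4)),
  Novelty  = factor(rep(c("novel", "familiar"), times = 12)),
  ROI      = factor(rep(c("A", "A", "B", "B"), times = 6)),
  STAIt    = rep(rnorm(6, mean = 40, sd = 8), each = 4),
  Activity = rnorm(24)
)

# Expand main effects and interactions into plain numeric columns;
# drop the intercept column, since the fitting function adds its own
X <- model.matrix(~ Novelty * ROI + STAIt + Novelty:STAIt, data = dat)
X <- X[, -1, drop = FALSE]

# Interaction columns are named like "Noveltynovel:ROIB"; make the
# names syntactically valid so they can be used in a formula
colnames(X) <- make.names(colnames(X))

# Combine the response and grouping factor with the dummy columns
dat2 <- data.frame(dat[, c("Activity", "Subject")], X)

# Formula with one term per hand-built column
form <- reformulate(colnames(X), response = "Activity")
print(form)

# Untested assumption: the expanded data could then be passed on, e.g.
# library(glmmLasso)
# fit <- glmmLasso(fix = form, rnd = list(Subject = ~1),
#                  data = dat2, lambda = 10)
```

One consequence of this approach is that every dummy column is
penalized on its own, rather than as part of a factor-level group, so
the results need not match what as.factor() inside the formula would
give.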


> 
> On Wed, Jul 13, 2016 at 2:20 PM, Walker Pedersen <wsp <at> uwm.edu> wrote:

[snip]

> >
> > An abbreviated version of my dataset is here:
> >
> > https://drive.google.com/open?id=0B_LliPDGUoZbVVFQS2VOV3hGN3c
> >

[snip snip]

> > Before glmmLasso I am running:
> >
> > KNov$Subject <- factor(KNov$Subject)
> >
> > to ensure the subject ID is not treated as a continuous variable.
> >
> > If I run:
> >
> > glm1 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI), list(Subject=~1), data = KNov,
> > lambda=10)
> > summary(glm1)
> >
> > I don't get any warning messages, but the output contains b estimates
> > only, no SE or p-values.
> >
> > If I try to include a 3-way interaction, such as:
> >
> > glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm2)
> >
> > I get the warnings:
> >
> > Warning messages:
> > 1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
> >   data length is not a multiple of split variable
> > 2: In lambda_vec * sqrt(block2) :
> >   longer object length is not a multiple of shorter object length
> >
> > And again, I do get parameter estimates, and no SE or p-values.
> >
> > If I include my continuous variable in any interaction, such as:
> >
> > glm3 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI) + as.factor(Novelty):STAIt,
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm3)
> >
> > I get the error message:
> >
> > Error in rep(control$index[i], length.fac) : invalid 'times' argument
> >
> > and no output.
> >
> > If anyone has any input as to (1) why I am not getting SE or p-values
> > in my outputs, (2) the meaning of the warnings I get when I include a
> > 3-way interaction and, if they are something to worry about, how to
> > fix them, and (3) how to fix the error message I get when I include my
> > continuous variable in an interaction, I would be very appreciative.


 [snip snip snip]


