[R] glmmLasso with interactions errors
Ben Bolker
bbolker at gmail.com
Fri Jul 15 16:23:51 CEST 2016
Cade, Brian <cadeb <at> usgs.gov> writes:
>
> It has never been obvious to me that the lasso approach can handle
> interactions among predictor variables well at all.
> I'd be curious to see
> what others think and what you learn.
>
> Brian
>
For what it's worth I think lasso *does* handle interactions
reasonably (although I forget where I read that) -- there is a
newer "hierarchical lasso" that tries to deal with marginality
concerns more carefully.
A related question was asked on StackOverflow:
http://stackoverflow.com/questions/37910042/glmmlasso-warning-messages/37922918#37922918
My answer (in comments) there was:
my guess is that you're going to have to build your own model
matrix/dummy variables; I think that as.factor() in formulas is
treated specially, so including the interaction term will probably
just confuse it. (It would be worth trying as.factor(Novelty:ROI) - I
doubt it'll work but if it does it would be the easiest way forward.)
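
To make the "build your own model matrix" suggestion concrete, here is a
minimal sketch. It uses simulated data with the same column names as the
question (the values and factor levels are made up), expands the
interactions into explicit dummy columns with model.matrix(), and then
passes those columns to glmmLasso() as ordinary numeric predictors. I have
not checked this against the original data, so treat the glmmLasso() call
as an assumption about how the pieces fit together rather than a verified
fix.

```r
# Toy data standing in for the poster's KNov; all values are simulated.
set.seed(1)
KNov <- expand.grid(Subject = factor(1:10),
                    Novelty = factor(c("novel", "familiar")),
                    Valence = factor(c("neg", "neu")),
                    ROI     = factor(c("a", "b", "c")))
KNov$STAIt    <- rnorm(10)[as.integer(KNov$Subject)]  # one trait score per subject
KNov$Activity <- rnorm(nrow(KNov))

# Expand main effects and interactions (including factor-by-continuous)
# into explicit dummy columns, then drop the intercept column.
X <- model.matrix(~ Novelty * Valence * ROI + STAIt + Novelty:STAIt,
                  data = KNov)
X <- X[, colnames(X) != "(Intercept)", drop = FALSE]

# Interaction columns are named like "Noveltynovel:ROIb"; the colon is not
# allowed in a plain variable name, so convert to syntactic names.
colnames(X) <- make.names(colnames(X))

# New data frame: response, grouping factor, and the dummy columns.
dd <- data.frame(Activity = KNov$Activity, Subject = KNov$Subject, X)

# Fixed-effects formula with one plain term per dummy column.
fix_formula <- reformulate(colnames(X), response = "Activity")

# Fit only if glmmLasso is installed; the arguments mirror the question.
if (requireNamespace("glmmLasso", quietly = TRUE)) {
  fit <- glmmLasso::glmmLasso(fix_formula, rnd = list(Subject = ~1),
                              data = dd, lambda = 10)
  print(summary(fit))
}
```

One consequence of this approach: each dummy column is penalized as a
separate predictor, so the levels of a multi-level factor such as ROI can
be shrunk to zero individually rather than as a group.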
>
> On Wed, Jul 13, 2016 at 2:20 PM, Walker Pedersen <wsp <at> uwm.edu> wrote:
[snip]
> >
> > An abbreviated version of my dataset is here:
> >
> > https://drive.google.com/open?id=0B_LliPDGUoZbVVFQS2VOV3hGN3c
> >
[snip snip]
> > Before glmmLasso I am running:
> >
> > KNov$Subject <- factor(KNov$Subject)
> >
> > to ensure the subject ID is not treated as a continuous variable.
> >
> > If I run:
> >
> > glm1 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI), list(Subject=~1), data = KNov,
> > lambda=10)
> > summary(glm1)
> >
> > I don't get any warning messages, but the output contains b estimates
> > only, no SE or p-values.
> >
> > If I try to include a 3-way interaction, such as:
> >
> > glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm2)
> >
> > I get the warnings:
> >
> > Warning messages:
> > 1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
> > data length is not a multiple of split variable
> > 2: In lambda_vec * sqrt(block2) :
> > longer object length is not a multiple of shorter object length
> >
> > And again, I do get parameter estimates, and no SE or p-values.
> >
> > If I include my continuous variable in any interaction, such as:
> >
> > glm3 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI) + as.factor(Novelty):STAIt,
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm3)
> >
> > I get the error message:
> >
> > Error in rep(control$index[i], length.fac) : invalid 'times' argument
> >
> > and no output.
> >
> > If anyone has any input as to (1) why I am not getting SE or p-values
> > in my outputs, (2) the meaning of the warnings I get when I include a
> > 3-way interaction and, if they are something to worry about, how to fix
> > them, and (3) how to fix the error message I get when I include my
> > continuous variable in an interaction, I would be very appreciative.
[snip snip snip]