[R-sig-ME] cross validation of glmmLasso

Mollie Brooks mo|||eebrook@ @end|ng |rom gm@||@com
Mon Dec 12 13:09:40 CET 2022


Following up with a reproducible example…
Both of the models below can be fit with glmmLasso, but only the one with a single random effect (RE), lm1, can be cross-validated with lmmen.

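library(glmmLasso)
data("soccer")   # example data shipped with glmmLasso
library(lmmen)   # provides the cv.glmmLasso() used below

# Center and scale the selected covariate columns
soccer[,c(4,5,9:16)]<-scale(soccer[,c(4,5,9:16)],center=TRUE,scale=TRUE)
soccer<-data.frame(soccer)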

lm1 <- glmmLasso(points ~ transfer.spendings + ave.unfair.score 
                 + ball.possession + tackles 
                 + ave.attend + sold.out, rnd = list(team=~1), 
                 lambda=50, data = soccer)

summary(lm1)

lm2 <- glmmLasso(points ~ transfer.spendings + ave.unfair.score 
                 + ball.possession + tackles 
                 + ave.attend + sold.out, rnd = list(team=~1, pos=~1), 
                 lambda=50, data = soccer)

summary(lm2)


cv.lm1 <- cv.glmmLasso(form.fixed= points ~ transfer.spendings + ave.unfair.score 
                    + ball.possession + tackles 
                    + ave.attend + sold.out, 
                    form.rnd = list(team=~1), 
                    lambda=seq(5,250, by=5), dat = soccer)

cv.lm2 <- cv.glmmLasso(form.fixed= points ~ transfer.spendings + ave.unfair.score 
                    + ball.possession + tackles 
                    + ave.attend + sold.out, 
                    form.rnd = list(team=~1, pos=~1), 
                    lambda=seq(5,250, by=5), dat = soccer)

The error is shown below. I was getting a similar error with cv.glmmLasso::cv.glmmLasso until I made the change described in the pull request from my earlier email. In that package, the error appeared to be due to the code assuming that q_start was a scalar (as in the case of a single RE).

> cv.lm2=cv.glmmLasso(form.fixed= points ~ transfer.spendings + ave.unfair.score 
+                     + ball.possession + tackles 
+                     + ave.attend + sold.out, 
+                     form.rnd = list(team=~1, pos=~1), 
+                     lambda=seq(5,250, by=5), dat = soccer)
Error in diag(diag(q_start), sum(s)) : 
  'nrow' or 'ncol' cannot be specified when 'x' is a matrix
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> traceback()
8: stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix")
7: diag(diag(q_start), sum(s))
6: est.glmmLasso.RE(fix = fix, rnd = rnd, data = data, lambda = lambda, 
       family = family, final.re = final.re, switch.NR = switch.NR, 
       control = control)
5: est.glmmLasso(fix, rnd, data = data, lambda = lambda, family = family, 
       switch.NR = switch.NR, final.re = final.re, control = control)
4: glmmLasso::glmmLasso(fix = as.formula(form.fixed), rnd = form.rnd, 
       data = dat, lambda = lambda[opt], switch.NR = FALSE, final.re = TRUE, 
       control = list(start = Delta.start[opt, ], q_start = Q.start.base))
3: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
2: suppressWarnings({
       final <- glmmLasso::glmmLasso(fix = as.formula(form.fixed), 
           rnd = form.rnd, data = dat, lambda = lambda[opt], switch.NR = FALSE, 
           final.re = TRUE, control = list(start = Delta.start[opt, 
               ], q_start = Q.start.base))
       final
   })
1: cv.glmmLasso(form.fixed = points ~ transfer.spendings + ave.unfair.score + 
       ball.possession + tackles + ave.attend + sold.out, form.rnd = list(team = ~1, 
       pos = ~1), lambda = seq(5, 250, by = 5), dat = soccer)
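
For reference, the failing call in frame 7 can be reproduced in base R without either package. The sketch below is only an illustration under assumptions: q_scalar, q_matrix and n_re are made-up stand-ins for q_start and sum(s), and the values are arbitrary. It shows that diag() applied to a scalar already returns a matrix, so the outer diag() call with a size argument fails, whereas a matrix-valued start (one variance per random effect) goes through.

```r
# Base R only. All names here are hypothetical stand-ins, not glmmLasso internals.
n_re <- 2  # plays the role of sum(s) for two random intercepts

# Scalar start value, as would suit a single random effect.
# diag(q_scalar) is already a matrix, so passing a size to the outer diag() errors.
q_scalar <- 0.1
res_scalar <- tryCatch(
  diag(diag(q_scalar), n_re),
  error = function(e) conditionMessage(e)
)
print(res_scalar)

# Matrix start value with one variance per random effect.
# diag(q_matrix) is a plain vector, so the outer diag() builds an n_re x n_re matrix.
q_matrix <- diag(0.1, n_re)
res_matrix <- diag(diag(q_matrix), n_re)
print(res_matrix)
```

This matches the scalar-q_start explanation above; what lmmen actually passes as Q.start.base is not shown in the traceback.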


Cheers,
Mollie

> On 12 Dec 2022, at 11.50, Mollie Brooks <mollieebrooks using gmail.com> wrote:
> 
> I’m interested in doing cross validation on GLMMs fit with LASSO. I found two functions for doing this: lmmen::cv.glmmLasso and cv.glmmLasso::cv.glmmLasso. After a small amount of digging, it looks like lmmen has been used more, since it was on CRAN in the past and it shows up in a thesis and a working paper on Google Scholar.
> 
> Both packages only work with a single random effect and I need two REs for my data set. I managed to fix that problem in the cv.glmmLasso package (https://github.com/thepira/cv.glmmLasso/pull/18). Making lmmen work with multiple random effects is a little harder.
> 
> Another problem with lmmen is that the example from the help file isn’t working for me, so I’m not sure I should put time into making it work with multiple random effects.
> > tmp=cv.glmmLasso(initialize_example(seed=1))
> Error in rep(0, d.size) : invalid 'times' argument
> In addition: Warning message:
> In cv.glmmLasso(initialize_example(seed = 1)) : NAs introduced by coercion 
> 
> I’m wondering if anyone else has already been down this rabbit hole and can offer advice. Is there another package (or random code lying around) for doing cross validation on GLMMs with LASSO that is more thoroughly tested and currently in use?
> 
> Cheers,
> Mollie
> 
> 

