[R-sig-ME] multilevel model for split plot design

Thu Aug 25 19:20:41 CEST 2022

Hello,

I am trying to figure out the correct specification of a multilevel 
model for a split plot design experiment (using dummy data). Here is a 
description of a simplified version of the design (code below): Each 
plot (i) has 2, 4, 8 or 16 plant species. In every plot, a subplot has 
remained as is (c), the other half received the treatment (t). Our 
hypothesis is that the effect of the treatment (z) on the response (y) 
in a plot will differ depending on species richness (x). I would 
therefore specify the treatment effect to vary by plot. When I do this 
using glmmTMB (or brms) the model fits without an issue. However lme4 
gives an error:

"number of observations (=160) <= number of random effects (=160) for 
term (1 + z | i); the random-effects parameters and the residual 
variance (or scale parameter) are probably unidentifiable"

To see if I understand that error, I tried changing the design to 3 
control and 3 treatment subplots per plot. (Which is not really an 
option for the experiment.)

Sure enough, lme4 fits the model without errors (but I do get a singular 
fit warning).

So my question is, is the random effect specification appropriate for 
the first version of the data? And if yes/no what should I make of the 
different behaviour of the two packages? (I have also tried regressing 
the c-t difference directly on richness, with results consistent with 
the mixed model; which makes me think that the model specification is fine?)

Thank you,

Angelos Amyntas

Here is a reprex:

library(tidyverse)
library(glmmTMB)
library(lme4)

set.seed(321)

d=data.frame(x = rep(c(2,4,8,16), each = 40),
              y = NA,
              z = rep(c("c","t"), 80),
              i = as.factor(rep(1:80, each = 2)))

d$y[d$z=="c"] = 5 + .5*d$x[d$z=="c"] + rnorm(80)
d$y[d$z=="t"] = 4 + .25*d$x[d$z=="t"] + rnorm(80)

d$x=(d$x - mean(d$x))/sd(d$x)

m=glmmTMB(y ~ 1 + x + z + z:x + (1 + z|i), data = d)
summary(m)

m=lmer(y~1 + x + z + z:x + (1 + z|i), data = d)

# modeling the difference directly
d. = d %>%
   group_by(i) %>%
   summarise(x=first(x),
             diff = y[z=="t"] - y[z=="c"])
m=glmmTMB(diff ~ 1 + x, data = d.)
summary(m)

# trying 3 replicates in each subplot
d=data.frame(x = rep(c(2,4,8,16), each = 40*3),
              y = NA,
              z = rep(rep(c("c","t"),each = 3), 80),
              z_i = rep(c("c1","c2","c3","t1","t2","t3"), 80),
              i = as.factor(rep(1:80, each = 2*3)))

d$y[d$z=="c"] = 5 + .5*d$x[d$z=="c"] + rnorm(240)
d$y[d$z=="t"] = 4 + .25*d$x[d$z=="t"] + rnorm(240)

d$x=(d$x - mean(d$x))/sd(d$x)

m=lmer(y ~ 1 + x + z + z:x + (1 + z|i), data = d)
summary(m)

	[[alternative HTML version deleted]]