[R-sig-ME] Modelling random effects for only part of the observations (in lme4)

Thu Jan 15 17:34:17 CET 2015

Dear list members,

I have a seemingly easy problem that has turned out to be more difficult in practice. Is it possible in lme4 to model random effects for only *some* of the observations?

Here's my problem (somewhat simplified). Individuals are randomised to either treatment or no treatment. The treatment consists of group therapy, where the individuals are (randomly) assigned to groups. It is reasonable to expect some sort of group/cluster effect - e.g. a therapist effect and/or a within-group interaction effect for the individuals - and this effect can be modelled as a random effect. So far so good.

However, for the individuals randomised to no treatment, there are no groups, and thus no group effects. So basically (I think!) I can use the linear model (in mathematical notation)

  y_ij = intercept + b*x_ij + eps_ij

for the untreated individuals, and

  y_ij = intercept + b*x_ij + treatment + B_i + eps_ij

for the treated individuals, where i indexes the groups, j indexes the individuals, x_ij is one or more baseline covariates and B_i are the random effects. (i is of course not really defined for the control individuals, so you can assume that all j indices are different for different individuals, and replace ij with j and i with i(j), if that makes the notation easier to understand.)

Or, in lme4/lm syntax:

  y ~ x + treat_factor + (1|group) # Treated individuals
  y ~ x + treat_factor             # Untreated individuals

where treat_factor is a two-level factor (control/treatment).

The two *mathematical* linear predictor formulas are easy to combine into one:

  y_ij = intercept + b*x_ij + arm_ij*treatment + arm_ij*B_i + eps_ij

where arm_ij (indicating treatment/control arm) is 1 if the individual (i,j) was randomised to the treatment arm and 0 if he/she was randomised to the control arm.
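To make the combined model concrete, here is a minimal base-R sketch that simulates data from it. All sample sizes, parameter values and variable names (n_groups, sd_group, dat, etc.) are arbitrary assumptions chosen for illustration, not my real data, and the group effects B_i are assumed to be normal with mean zero.

```r
# Simulate data from the combined model
#   y = intercept + b*x + arm*treatment + arm*B_i + eps
# All numbers below are arbitrary assumptions for illustration.
set.seed(1)

n_groups   <- 10   # therapy groups in the treatment arm
group_size <- 8    # individuals per therapy group
n_control  <- 80   # individuals in the control arm

intercept <- 1.0
b         <- 0.5   # slope for the baseline covariate
treatment <- 0.8   # fixed treatment effect
sd_group  <- 0.6   # SD of the group effects B_i
sd_eps    <- 1.0   # residual SD

n_treated <- n_groups * group_size
n         <- n_treated + n_control

# arm = 1 for treated individuals, 0 for controls
arm <- c(rep(1, n_treated), rep(0, n_control))

# Therapy group for treated individuals; NA for controls, who have no group
group <- c(rep(seq_len(n_groups), each = group_size), rep(NA, n_control))

x <- rnorm(n)
B <- rnorm(n_groups, sd = sd_group)   # one random effect per therapy group

# Group effect per individual; set to 0 where no group is defined
B_obs <- B[group]
B_obs[is.na(B_obs)] <- 0

y <- intercept + b * x + arm * treatment + arm * B_obs + rnorm(n, sd = sd_eps)

dat <- data.frame(
  y, x, arm,
  treat_factor = factor(arm, levels = 0:1, labels = c("control", "treatment")),
  group = factor(group)
)
str(dat)
```

Note that group is NA for every control individual, which is exactly the situation I do not know how to express in an lme4 formula.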

But how do I write this in lme4 syntax?

I have thought about letting each individual in the control arm be its own cluster/group. But this doesn't seem realistic. Why would the variance between groups (in the treatment arm) be similar to the variance between individuals in the control arm? I believe such a model would bias the estimated treatment effect.
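For reference, this is roughly how the singleton-cluster coding could be constructed in base R. The toy data and the names id and group_single are hypothetical and only meant to show the coding, not to recommend it.

```r
# Hypothetical toy data: two therapy groups of three, plus four controls
arm   <- c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0)
group <- c("g1", "g1", "g1", "g2", "g2", "g2", NA, NA, NA, NA)
id    <- seq_along(arm)

# Give every control individual a unique cluster label of its own
group_single <- factor(ifelse(arm == 1, group, paste0("ctrl_", id)))

# Cluster sizes: the control "clusters" are all singletons
table(group_single)

# The model I am sceptical about would then be written as
#   y ~ x + treat_factor + (1 | group_single)
# which forces a single between-cluster variance on both arms.
```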

Is it even possible to fit these types of models? Or are there other R packages that can be used instead? (Note that for my actual data set I have a logistic, not a linear, model, but I doubt this makes things *easier*.)

Any help would be appreciated.

-- 
Karl Ove Hufthammer