[R-sig-ME] A strange design; impossible to model using mixed models?

Sun Jun 6 15:16:01 CEST 2010

On Sat, Jun 5, 2010 at 8:22 PM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
> Hi folks,
>
> This is likely a case of an experimentalist (not me!) failing to
> consult a statistician before applying an experimental design. I have
> data where I'm interested in 3 theoretical variables derived from 2
> manipulated variables, all measured within-Ss:
>
> #set up the design
> a = expand.grid(
>        s = 1:100 #subjects
>        , x = 1:4 #one predictor variable
>        , y = 1:3 #second predictor variable
> )
> a$z = rnorm(nrow(a)) #fake data
>
> #recode predictor variables
> a$v1 = ifelse(a$x==1,-.5,ifelse(a$x==2,.5,NA)) #based on x
> a$v2 = ifelse(a$x==3,-.5,ifelse(a$x==4,.5,NA)) #based on x
> a$v3 = ifelse(a$y==1,-.5,ifelse(a$y==2,.5,NA)) #based on y
>
>
> Now, obviously with such formulation, it's impossible to compute an
> interaction between v1 and v2 because they're non-orthogonal, but it
> is possible to compute the interactions between v1 & v3, and between
> v2 and v3:
>
> fit1 = lmer(
>        z ~ v1 * v3 + ( 1 | s )
>        , data = a
> )
>
> fit2 = lmer(
>        z ~ v2 * v3 + ( 1 | s )
>        , data = a
> )
>
> But I feel like it should be possible to do both above models
> simultaneously (gaining accuracy in the estimation of the main effect
> of v3), as:
>
> fit = lmer(
>        z ~
>                v1 +
>                v2 +
>                v3 +
>                v1:v3 +
>                v2:v3 +
>                ( 1 | s )
>        , data = a
> )
>
> However, I get the error:
> Error in function (fr, FL, start, REML, verbose)  :
>  Number of levels of a grouping factor for the random effects
> must be less than the number of observations
>
> Is it simply impossible to model the three variables simultaneously?

The way you have specified those variables, yes.
> str(a)
'data.frame':	1200 obs. of  7 variables:
 $ s : int  1 2 3 4 5 6 7 8 9 10 ...
 $ x : int  1 1 1 1 1 1 1 1 1 1 ...
 $ y : int  1 1 1 1 1 1 1 1 1 1 ...
 $ z : num  1.629 1.684 0.115 0.723 -0.415 ...
 $ v1: num  -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 ...
 $ v2: num  NA NA NA NA NA NA NA NA NA NA ...
 $ v3: num  -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 ...
 - attr(*, "out.attrs")=List of 2
  ..$ dim     : Named int  100 4 3
  .. ..- attr(*, "names")= chr  "s" "x" "y"
  ..$ dimnames:List of 3
  .. ..$ s: chr  "s=  1" "s=  2" "s=  3" "s=  4" ...
  .. ..$ x: chr  "x=1" "x=2" "x=3" "x=4"
  .. ..$ y: chr  "y=1" "y=2" "y=3"

> str(model.matrix(~ v1 + v2, a))
 num[0 , 1:3]
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "(Intercept)" "v1" "v2"
 - attr(*, "assign")= int [1:3] 0 1 2

If you include both v1 and v2 in the model there are 0 observations
without missing data in one of the covariates.