[R] zero random effect sizes with binomial lmer

Sun Dec 31 19:50:40 CET 2006

I've found a way to make this problem, if it's not a bug, more clear. 
I've taken my original data set A and simply doubled it with 
AA<-rbind(A,A).

Doing so, instead of this:

Random effects: # A
  Groups  Name        Variance Std.Dev.
  Subject (Intercept) 1.63e+00 1.28e+00
  Item    (Intercept) 5.00e-10 2.24e-05
number of obs: 161, groups: Subject, 23; Item, 7

I get this:

Random effects: # AA
  Groups  Name        Variance Std.Dev.
  Subject (Intercept) 3.728    1.931
  Item    (Intercept) 0.319    0.565
number of obs: 322, groups: Subject, 23; Item, 7

I think I understand why the Subject effect increases. On a per-Response 
basis, there is more variation between Subjects. However, the 
probability (hence log odds, etc.) of the Response seems to be 
constant, so I am not sure. In any case, the ranef variance increases by 
a factor of 2.3, going from A to AA:

Response 1:0	Dataset A	Dataset AA
Subject  1	 0 :  7		 0 : 14
Subject  2	 0 :  7		 0 : 14
Subject  3	 0 :  7		 0 : 14
Subject  4	 0 :  7		 0 : 14
Subject  5	 0 :  7		 0 : 14
Subject  6	 0 :  7		 0 : 14
Subject  7	 0 :  7		 0 : 14
Subject  8	 0 :  7		 0 : 14
Subject  9	 0 :  7		 0 : 14
Subject 10	 0 :  7		 0 : 14
Subject 11	 0 :  7		 0 : 14
Subject 12	 0 :  7		 0 : 14
Subject 13	 1 :  6		 2 : 12
Subject 14	 1 :  6		 2 : 12
Subject 15	 1 :  6		 2 : 12
Subject 16	 1 :  6		 2 : 12
Subject 17	 1 :  6		 2 : 12
Subject 18	 1 :  6		 2 : 12
Subject 19	 1 :  6		 2 : 12
Subject 20	 2 :  5		 4 : 10
Subject 21	 3 :  4		 6 :  8
Subject 22	 4 :  3		 8 :  6
Subject 23	 4 :  3		 8 :  6

Ranef s^2	 1.630		 3.728

Comparing the Items, though, we have:

Response 1:0	Dataset A	Dataset AA
Item 1		 2 : 21		 4 : 42
Item 2		 2 : 21		 4 : 42
Item 3		 2 : 21		 4 : 42
Item 4		 4 : 19		 8 : 38
Item 5		 1 : 22		 2 : 44 
Item 6		 5 : 18		10 : 36 
Item 7		 4 : 19		 8 : 38

Ranef s^2:	 5.00xe-10	 0.319

Looking at this completely naively, I don't understand why whatever 
statistical difference doubling the data for each Item makes, shouldn't be 
similar to what happened above, for Subject. Instead, we have an estimate 
of zero for the smaller data set.

I realize that I don't understand the actual mathematics by which the 
BLUPs (and thus the variances) are calculated, but something seems wrong 
here. Obviously between A and AA nothing changes as far as the interaction 
of Subject and Item in particular combinations.

How should i best understand this behavior of lmer, and what is my best 
advice for obtaining reliable random effect size estimates from data like 
this? Are the estimates at least reliable when they are NOT zero?

Thanks,
Daniel

P.S. I'll point out again that the by multiplying the "zero" ranef 
estimate for A by an enormous constant (see blow), it is almost identical 
to thet for AA. Obviously these coefficients derive from the marginal 
proportions in the data. And I understand that with small N's, the random 
effect may not be signifiicant. But that shouldn't mean that the estimate 
is zero, right (compare fixed effect coefficients, which are often 
non-zero, yet not significant...)

a1 <- c(0,0,0,0,0,0,0)
a2 <- c(0,0,0,0,0,0,0)
a3 <- c(0,0,0,0,0,0,0)
a4 <- c(0,0,0,0,0,0,0)
a5 <- c(0,0,0,0,0,0,0)
a6 <- c(0,0,0,0,0,0,0)
a7 <- c(0,0,0,0,0,0,0)
a8 <- c(0,0,0,0,0,0,0)
a9 <- c(0,0,0,0,0,0,0)
a10 <- c(0,0,0,0,0,0,0)
a11 <- c(0,0,0,0,0,0,0)
a12 <- c(0,0,0,0,0,0,0)
a13 <- c(0,0,0,0,0,0,1)
a14 <- c(0,0,0,0,0,0,1)
a15 <- c(0,0,0,0,0,1,0)
a16 <- c(0,0,0,0,1,0,0)
a17 <- c(0,0,0,1,0,0,0)
a18 <- c(0,0,1,0,0,0,0)
a19 <- c(0,1,0,0,0,0,0)
a20 <- c(0,1,0,0,0,1,0)
a21 <- c(0,0,0,1,0,1,1)
a22 <- c(1,0,0,1,0,1,1)
a23 <- c(1,0,1,1,0,1,0)
aa <- 
rbind(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23)

a <- array(0, c(161,3), list(NULL, c("Subject","Item","Response")))
 	for (s in c(1:23))
 		for (i in c(1:7))
 			a[7*(s-1)+i,] <- c(s,i,aa[s,i])

A <- data.frame(a)
AA <- rbind(A,A)

A.fit <- lmer(Response~(1|Subject)+(1|Item),A,binomial)
AA.fit <- lmer(Response~(1|Subject)+(1|Item),B,binomial)

A.fit
AA.fit

ranef(A.fit)$Item
(ranef(A.fit)$Item)*578500000
ranef(AA.fit)$Item