[R-sig-ME] how reliable are inferences drawn from binomial models for small datasets fitted with lme4?

Roger Levy rlevy at ling.ucsd.edu
Sun Jul 5 21:51:05 CEST 2009


This post may be of interest in light of the recent discussion of PQL
versus Laplace-approximated likelihood.  I'm facing an interestingly
challenging analysis of a relatively small (190-observation)
binary-response dataset with a single two-level treatment and two
crossed random factors (call them F1 and F2).  The question of current
interest is whether I can infer a fixed effect of treatment, and I'm
finding it hard to work out what conclusions I can legitimately draw.
I'm putting the full dataset at the end of the message, but I'll start
by presenting some summary statistics:

 > xtabs(~ Treatment + Response, dat)
          Response
Treatment   0   1
         1  12 117
         2   0  61
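
For reference, the exact test I mention next is just this 2x2 table
fed to R's fisher.test(), e.g.:

 > fisher.test(xtabs(~ Treatment + Response, dat))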

Fisher's exact test says "yes, the treatment has an effect" at  
p=0.01.  But of course that doesn't take into account the crossed  
hierarchical structure of the data.  Some statistics on this:

 > with(dat,tapply(Response, list(F1,Treatment),length))
     1  2
1  11  5
2   7  6
3   9  3
4   4 NA
5   3  3
6  11 NA
7   3  3
8   9  4
9  11  3
10 12  6
11 11  8
12  5  6
13  1 NA
14 11  1
15 12  9
16  9  4
 > with(dat,tapply(Response, list(F1,Treatment),mean))
            1  2
1  1.0000000  1
2  0.5714286  1
3  0.8888889  1
4  1.0000000 NA
5  0.0000000  1
6  1.0000000 NA
7  0.6666667  1
8  1.0000000  1
9  1.0000000  1
10 1.0000000  1
11 0.8181818  1
12 0.6000000  1
13 1.0000000 NA
14 1.0000000  1
15 1.0000000  1
16 1.0000000  1
 > with(dat,tapply(Response, list(F2,Treatment),length))
    1  2
1  4  2
2  8  2
3  3  1
4  5  2
5  6  3
6  7  2
7  5  5
8  8  1
9  6  6
10 5  2
11 6 NA
12 6  6
13 4  1
14 6  2
15 4  6
16 5  2
17 5  2
18 4  1
19 5 NA
20 4 NA
21 5  6
22 6  3
23 6  5
24 6  1
 > with(dat,tapply(Response, list(F2,Treatment),mean))
            1  2
1  1.0000000  1
2  0.8750000  1
3  1.0000000  1
4  1.0000000  1
5  1.0000000  1
6  1.0000000  1
7  0.8000000  1
8  0.8750000  1
9  0.8333333  1
10 1.0000000  1
11 0.8333333 NA
12 0.6666667  1
13 1.0000000  1
14 0.8333333  1
15 1.0000000  1
16 1.0000000  1
17 1.0000000  1
18 1.0000000  1
19 1.0000000 NA
20 1.0000000 NA
21 1.0000000  1
22 0.6666667  1
23 0.8333333  1
24 0.8333333  1

So there is some sign of inter-cluster variability, though of course  
the binary response makes it hard to see.  Crossed random-slope models  
converge both with and without a fixed effect of treatment...

 > print(m1 <- lmer(Response ~ Treatment + (Treatment - 1 | F1) + (Treatment - 1 | F2), dat, family=binomial))
Generalized linear mixed model fit by the Laplace approximation
Formula: Response ~ Treatment + (Treatment - 1 | F1) + (Treatment - 1  
|      F2)
    Data: dat
    AIC   BIC logLik deviance
  82.65 108.6 -33.33    66.65
Random effects:
  Groups Name       Variance   Std.Dev.   Corr
  F2     Treatment1 4.1354e-12 2.0336e-06
         Treatment2 1.0492e+00 1.0243e+00 0.000
  F1     Treatment1 9.4589e+00 3.0755e+00
         Treatment2 6.9945e-01 8.3633e-01 0.000
Number of obs: 190, groups: F2, 24; F1, 16

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    3.943      1.004   3.929 8.54e-05 ***
Treatment2    17.289   5221.110   0.003    0.997
 > print(m0 <- lmer(Response ~ 1 + (Treatment - 1 | F1) + (Treatment - 1 | F2), dat, family=binomial))
Generalized linear mixed model fit by the Laplace approximation
Formula: Response ~ 1 + (Treatment - 1 | F1) + (Treatment - 1 | F2)
    Data: dat
   AIC   BIC logLik deviance
  81.5 104.2 -33.75     67.5
Random effects:
  Groups Name       Variance   Std.Dev.   Corr
  F2     Treatment1 0.0000e+00 0.00000000
         Treatment2 1.7654e-08 0.00013287   NaN
  F1     Treatment1 2.3858e+01 4.88443481
         Treatment2 3.0185e-02 0.17373716 -1.000
Number of obs: 190, groups: F2, 24; F1, 16

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    5.855      1.384   4.232 2.32e-05 ***


...but I am concerned (a) about the NaN correlation for F2 (this must
be a sign of something broken somewhere!?), and (b) about the minimal
difference in log-likelihood between m1 and m0.  (Note that it's
possible to fix (a) by dropping the random effects of F2 from m0; the
log-likelihood doesn't change.)  It seems hard to believe that the
marginal log-likelihoods of the two models, given the estimates of the
fixed effects and of the random-effect covariance matrices, differ by
only 0.42.  I get further support for this suspicion by imagining that
one response in Treatment 2 had been a failure rather than a success:

 > dat[142,"Response"] <- 0
 > logLik(lmer(Response ~ Treatment + (Treatment - 1 | F1) + (Treatment - 1 | F2), dat, family=binomial))
'log Lik.' -35.29494 (df=8)
 > logLik(lmer(Response ~ 1 + (Treatment - 1 | F1) + (Treatment - 1 | F2), dat, family=binomial))
'log Lik.' -37.60635 (df=7)

Now there's a considerable difference in log-likelihood; this would  
turn out significant in a chi-squared test (though of course we'd want  
to calibrate with simulation if this were the actual dataset).
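
For concreteness, the kind of simulation-based calibration I have in
mind would look roughly like the sketch below.  I haven't actually run
this: it assumes a version of lme4 in which the binomial fits go
through glmer() and simulate() works for the fitted model, and nsim is
kept small purely for illustration.

library(lme4)

## refit the null and alternative models on the dataset of interest
m0 <- glmer(Response ~ 1 + (Treatment - 1 | F1) + (Treatment - 1 | F2),
            dat, family = binomial)
m1 <- glmer(Response ~ Treatment + (Treatment - 1 | F1) + (Treatment - 1 | F2),
            dat, family = binomial)
lr.obs <- as.numeric(2 * (logLik(m1) - logLik(m0)))

## simulate responses under the null model, refit both models, and
## recompute the likelihood-ratio statistic each time
nsim <- 200
lr.sim <- replicate(nsim, {
    d <- dat
    d$Response <- simulate(m0)[[1]]
    m0.b <- update(m0, data = d)
    m1.b <- update(m1, data = d)
    as.numeric(2 * (logLik(m1.b) - logLik(m0.b)))
})

## simulation-calibrated p-value for the likelihood-ratio statistic
mean(lr.sim >= lr.obs)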

Note that I can't use the Wald statistic here to address my hypothesis
for the real dataset, because the complete lack of failure responses
in Treatment 2 sends the MLE of that treatment's parameter, and with
it the estimated standard error, off toward infinity.  I was hoping to
use a likelihood-ratio comparison (perhaps calibrated with
simulation), but it seems to me that the minimal observed
log-likelihood difference for the actual dataset throws that idea out
the window.

So here are some questions:

1) Is there something specific to the Laplace approximation being used  
that renders logit models with extreme response proportions generally  
unreliable?  (I can't check this directly against other packages  
because of the crossed random effects.)

2) Is this dataset simply small enough that the uncertainty in
estimating the random-effect variance parameters makes point-estimate
inference on the fixed effects unreliable, so that I should go the
Bayesian route and integrate over that uncertainty?  Following this
possibility, I implemented the fixed-effect-of-treatment model (m1) in
JAGS, with diffuse normal priors over the intercept and Treatment 2
parameter and diffuse Wishart priors over the random-effect
covariance matrices (a skeleton of this kind of model appears after
the posterior summaries below).  Here are the 95% Bayesian confidence
intervals that result:


                       lower        upper
Omega.F2[1,1]  5.316465e+02 5.922489e+05
Omega.F2[2,1] -3.016791e+05 2.899492e+05
Omega.F2[1,2] -3.016791e+05 2.899492e+05
Omega.F2[2,2]  2.116783e+02 5.911579e+05
Omega.F1[1,1]  5.186791e+01 5.979535e+05
Omega.F1[2,1] -3.073216e+05 2.898743e+05
Omega.F1[1,2] -3.073216e+05 2.898743e+05
Omega.F1[2,2]  2.764357e+02 5.972545e+05
intercept      1.749505e+00 2.938812e+00
Treatment2     1.240127e+00 6.279801e+02

and here is the posterior mode:

Omega.F2[1,1]  70299.472805
Omega.F2[2,1]  -1324.272662
Omega.F2[1,2]  -1324.272662
Omega.F2[2,2]  70810.193188
Omega.F1[1,1]  70426.234086
Omega.F1[2,1] -46977.070058
Omega.F1[1,2] -46977.070058
Omega.F1[2,2]  87987.091315
intercept      1.977808
Treatment2     5.846867
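
For reference, a skeleton of the kind of JAGS model I mean (run via
rjags) is below.  This is a sketch rather than the exact code I ran:
the prior precisions, the Wishart scale matrix, the step() dummy for
treatment, and the sampler settings are all placeholder choices.

library(rjags)

model.str <- "
model {
  for (i in 1:N) {
    Response[i] ~ dbern(p[i])
    logit(p[i]) <- intercept + Treatment2 * step(Treatment[i] - 2) +
                   u.F1[F1[i], Treatment[i]] + u.F2[F2[i], Treatment[i]]
  }
  for (j in 1:nF1) { u.F1[j, 1:2] ~ dmnorm(zero[1:2], Omega.F1[1:2, 1:2]) }
  for (k in 1:nF2) { u.F2[k, 1:2] ~ dmnorm(zero[1:2], Omega.F2[1:2, 1:2]) }
  intercept  ~ dnorm(0, 1.0E-6)               # diffuse normal priors on fixed effects
  Treatment2 ~ dnorm(0, 1.0E-6)
  Omega.F1[1:2, 1:2] ~ dwish(R[1:2, 1:2], 2)  # diffuse Wishart priors on the
  Omega.F2[1:2, 1:2] ~ dwish(R[1:2, 1:2], 2)  # random-effect precision matrices
}
"

jags.dat <- with(dat, list(Response = Response,
                           Treatment = as.integer(Treatment),
                           F1 = as.integer(factor(F1)),
                           F2 = as.integer(factor(F2)),
                           N = nrow(dat),
                           nF1 = nlevels(factor(F1)),
                           nF2 = nlevels(factor(F2)),
                           zero = c(0, 0),
                           R = diag(2)))

jm <- jags.model(textConnection(model.str), data = jags.dat, n.chains = 3)
samp <- coda.samples(jm, c("intercept", "Treatment2", "Omega.F1", "Omega.F2"),
                     n.iter = 20000)
summary(samp)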

Some notable differences between these Bayesian results and the lme4- 
fit m1:

* lme4 inferred considerably larger random effects of F1 than of F2,  
which doesn't seem supported by the marginal means I listed near the  
beginning of the email; whereas the Bayesian inference puts F1 and F2  
on roughly equal footing;

* the severe lack of information about the correlations of the random
effects across treatments shows up clearly in the Bayesian confidence
intervals;

* the fixed-effects posterior modes are much closer to zero than the  
Laplace-approximated MLEs;

* the Bayesian inference seems strongly supportive of a fixed effect
of treatment, whereas lme4 superficially seemed to lead to the
opposite conclusion.

I'd love to hear people's thoughts on my approach to these data.  Many  
thanks in advance.

Roger

**

     Treatment F1 F2 Response
1           1  6  1        1
2           1  7  1        1
3           1  8  1        1
4           1 15  1        1
5           1  1  2        1
6           1  2  2        1
7           1  3  2        1
8           1  4  2        1
9           1  9  2        1
10          1 10  2        1
11          1 11  2        1
12          1 12  2        0
13          1  6  3        1
14          1 14  3        1
15          1 15  3        1
16          1  3  4        1
17          1  4  4        1
18          1  9  4        1
19          1 10  4        1
20          1 11  4        1
21          1  6  5        1
22          1  8  5        1
23          1 13  5        1
24          1 14  5        1
25          1 15  5        1
26          1 16  5        1
27          1  1  6        1
28          1  2  6        1
29          1  3  6        1
30          1  9  6        1
31          1 10  6        1
32          1 11  6        1
33          1 12  6        1
34          1  5  7        0
35          1  6  7        1
36          1 14  7        1
37          1 15  7        1
38          1 16  7        1
39          1  1  8        1
40          1  2  8        1
41          1  3  8        1
42          1  4  8        1
43          1  9  8        1
44          1 10  8        1
45          1 11  8        0
46          1 12  8        1
47          1  5  9        0
48          1  7  9        1
49          1  8  9        1
50          1 14  9        1
51          1 15  9        1
52          1 16  9        1
53          1  1 10        1
54          1  3 10        1
55          1  9 10        1
56          1 10 10        1
57          1 11 10        1
58          1  5 11        0
59          1  6 11        1
60          1  8 11        1
61          1 14 11        1
62          1 15 11        1
63          1 16 11        1
64          1  1 12        1
65          1  2 12        1
66          1  9 12        1
67          1 10 12        1
68          1 11 12        0
69          1 12 12        0
70          1  6 13        1
71          1 14 13        1
72          1 15 13        1
73          1 16 13        1
74          1  1 14        1
75          1  2 14        0
76          1  3 14        1
77          1 10 14        1
78          1 11 14        1
79          1 12 14        1
80          1  6 15        1
81          1  8 15        1
82          1 14 15        1
83          1 15 15        1
84          1  1 16        1
85          1  3 16        1
86          1  4 16        1
87          1  9 16        1
88          1 10 16        1
89          1  6 17        1
90          1  8 17        1
91          1 14 17        1
92          1 15 17        1
93          1 16 17        1
94          1  1 18        1
95          1  9 18        1
96          1 10 18        1
97          1 11 18        1
98          1  6 19        1
99          1  8 19        1
100         1 14 19        1
101         1 15 19        1
102         1 16 19        1
103         1  1 20        1
104         1  9 20        1
105         1 10 20        1
106         1 11 20        1
107         1  6 21        1
108         1  8 21        1
109         1 14 21        1
110         1 15 21        1
111         1 16 21        1
112         1  1 22        1
113         1  2 22        0
114         1  3 22        0
115         1  9 22        1
116         1 10 22        1
117         1 11 22        1
118         1  6 23        1
119         1  7 23        0
120         1  8 23        1
121         1 14 23        1
122         1 15 23        1
123         1 16 23        1
124         1  1 24        1
125         1  2 24        0
126         1  3 24        1
127         1  9 24        1
128         1 10 24        1
129         1 11 24        1
130         2  3  1        1
131         2 11  1        1
132         2 15  2        1
133         2 16  2        1
134         2  2  3        1
135         2  7  4        1
136         2 15  4        1
137         2 10  5        1
138         2 11  5        1
139         2 12  5        1
140         2  8  6        1
141         2 15  6        1
142         2  1  7        1
143         2  2  7        1
144         2 10  7        1
145         2 11  7        1
146         2 12  7        1
147         2  8  8        1
148         2  1  9        1
149         2  2  9        1
150         2  9  9        1
151         2 10  9        1
152         2 11  9        1
153         2 12  9        1
154         2  5 10        1
155         2  8 10        1
156         2  5 12        1
157         2  7 12        1
158         2  8 12        1
159         2 14 12        1
160         2 15 12        1
161         2 16 12        1
162         2  1 13        1
163         2  5 14        1
164         2 15 14        1
165         2  2 15        1
166         2  3 15        1
167         2  9 15        1
168         2 10 15        1
169         2 11 15        1
170         2 12 15        1
171         2 15 16        1
172         2 16 16        1
173         2 11 17        1
174         2 12 17        1
175         2 15 18        1
176         2  1 21        1
177         2  2 21        1
178         2  3 21        1
179         2  9 21        1
180         2 10 21        1
181         2 11 21        1
182         2  7 22        1
183         2 15 22        1
184         2 16 22        1
185         2  1 23        1
186         2  2 23        1
187         2 10 23        1
188         2 11 23        1
189         2 12 23        1
190         2 15 24        1

--

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy



