[R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...

Sat Mar 19 12:17:52 CET 2011

Thanks for the insightful reply! I've studied linear mixed models at undergraduate level, but I was unfamiliar with GLMMs. It was kind of hard at first to find out how to analyze these data, but I'm glad I found this group. I'm also trying to read the book on ecology by Zuur (2009).

My data come from an experiment in which people have to solve two types of reasoning problems, 4 of each kind. The noconflict problems are filler problems which are very easy, the conflict problems are harder, some people will get all 4 wrong, some will get all 4 right. I could leave the no conflict problems out of the analysis, but I want to conclude that the power manipulation had no effect on performance on noconflict problems, but it did on the conflict problems (to rule out alternative explanations such as decreased motivation/attention). I think an analysis on the full data set is preferred over 2 separate analyses per problem type.

I made the histograms and this confirms your expectations: for the no conflict problems, in all  three groups nobody scores less than 3/4, which is also evident from the means in these cells. For the conflict problems, the data have a similar distribution in all three groups, most mass is on 4/4, with 0/4, 1/4, 2/4, 3/4 getting almost equal mass. So indeed, variability is large.

I tried to fit some models with effect of type also varying over subjects, because I thought this would make sense intuitively (some people will be very good at noconflict, but very bad at conflict; others will be good at both), but then I consistently get correlations between random effects of -1, which I thought was an indication of convergence problems, so I left this term out.

I have read about the different estimation approaches used in GLMM's and I've noticed experts recommend not using PQL. So I will now focus on Laplace and GHQ (I knew this was possible in glmer, but did not know how much nAGQ should be, therefore I also used glmmML). nAGQ 7 and 9 gives almost completely the same results, but they differ a bit from laplace:

Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation 
Formula: accuracy ~ f_power * f_type + (1 | f_subject) 
   Data: syllogisms 
   AIC   BIC logLik deviance
 441.5 473.2 -213.7    427.5
Random effects:
 Groups    Name        Variance Std.Dev.
 f_subject (Intercept) 4.7286   2.1745  
Number of obs: 688, groups: f_subject, 86

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                  1.470526   0.493530   2.980  0.00289 ** 
f_powerhp                    0.124439   0.690869   0.180  0.85706    
f_powerlow                   2.020807   0.833203   2.425  0.01529 *  
f_typenoconflict             3.267835   0.643001   5.082 3.73e-07 ***
f_powerhp:f_typenoconflict   0.219783   0.927490   0.237  0.81268    
f_powerlow:f_typenoconflict -0.005484   1.455748  -0.004  0.99699  

and

Generalized linear mixed model fit by the Laplace approximation 
Formula: accuracy ~ f_power * f_type + (1 | f_subject) 
   Data: syllogisms 
 AIC   BIC logLik deviance
 406 437.7   -196      392
Random effects:
 Groups    Name        Variance Std.Dev.
 f_subject (Intercept) 4.9968   2.2353  
Number of obs: 688, groups: f_subject, 86

Fixed effects:
                            Estimate Std. Error z value Pr(>|z|)    
(Intercept)                  1.50745    0.50507   2.985  0.00284 ** 
f_powerhp                    0.13083    0.70719   0.185  0.85323    
f_powerlow                   2.04121    0.85308   2.393  0.01672 *  
f_typenoconflict             3.28715    0.64673   5.083 3.72e-07 ***
f_powerhp:f_typenoconflict   0.21680    0.93165   0.233  0.81599    
f_powerlow:f_typenoconflict -0.01199    1.45807  -0.008  0.99344    

I don't really know what criterion I should use to choose between models. Are any of the two actually useful, or are my data not fit for this kind of analysis? Is there another kind of analysis I could try? 

Thanks a lot for your time. I also think I understand the comment on the random effects :)

-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Saturday 19 March 2011 10:02 AM
To: David Duffy
Cc: Franssens, Samuel; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...

On Fri, Mar 18, 2011 at 9:54 PM, David Duffy <davidD at qimr.edu.au> wrote:
> On Fri, 18 Mar 2011, Franssens, Samuel wrote:
>
>> Hi,
>>
>> I have the following type of data: 86 subjects in three independent 
>> groups (high power vs low power vs control). Each subject solves 8 
>> reasoning problems of two kinds: conflict problems and noconflict 
>> problems. I measure accuracy in solving the reasoning problems. To 
>> summarize: binary response, 1 within subject var (TYPE), 1 between subject var (POWER).
>>
>> I wanted to fit the following model: for problem i, person j:
>> logodds ( Y_ij ) = b_0j + b_1j TYPE_ij with b_0j = b_00 + b_01 
>> POWER_j + u_0j and b_1j = b_10 + b_11 POWER_j
>>
>> I think it makes sense, but I'm not sure.
>> Here are the observed cell means:
>>            conflict       noconflict
>> control     0.6896552      0.9568966
>> high        0.6935484      0.9677419
>> low         0.8846154      0.9903846
>>
>> GLMER gives me:
>> Formula: accuracy ~ f_power * f_type + (1 | subject)
>>  Data: syllogisms
>> Random effects:
>> Groups  Name        Variance Std.Dev.
>> subject (Intercept) 4.9968   2.2353
>> Number of obs: 688, groups: subject, 86
>>
>> Fixed effects:
>>                           Estimate Std. Error z value Pr(>|z|)
>> (Intercept)                  1.50745    0.50507   2.985  0.00284 ** 
>> f_powerhp                    0.13083    0.70719   0.185  0.85323 
>> f_powerlow                   2.04121    0.85308   2.393  0.01672 * 
>> f_typenoconflict             3.28715    0.64673   5.083 3.72e-07 *** 
>> f_powerhp:f_typenoconflict   0.21680    0.93165   0.233  0.81599 
>> f_powerlow:f_typenoconflict -0.01199    1.45807  -0.008  0.99344
>> ---
>>
>> Strange thing is that when you convert the estimates to 
>> probabilities, they are quite far off. For control, no conflict 
>> (intercept), the estimation from glmer is 1.5 -> 81% and for glmmPQL 
>> is 1.14 -> 75%, whereas the observed is: 68%.
>>
>> Am I doing something wrong?
>
> You are forgetting that your model includes a random intercept for 
> each subject.

To follow up on David's comment a bit, you have a very large estimated standard deviation for the random effects for subject.  An estimated standard deviation of 2.23 in the glmer fit is huge.  On the scale of the linear predictor the random effects would swamp the contribution of the fixed effects - different subjects' random effects could easily differ by 8 or more (+/ 2 s.d.).  Transformed to the probability scale that takes you from virtually impossible to virtually certain.  To get that large an estimated standard deviation you would need different subjects exposed to the same experimental conditions and who get completely different results (all wrong vs all right).  The estimate of the fixed effect for type is also very large, considering that this is a binary covariate.

I would suggest plotting the data, say by first grouping the subjects into groups according to power then by subject then by type.  You should see a lot of variability in  responses according to subject within power.

Regarding the difference between the glmer and the glmmPQL results, bear in mind that glmmPQL is based on successive approximation of the GLMM by an LMM.  This case is rather extreme and the approximation is likely to break down.  In fact, for the glmer fit, I would recommend using nAGQ=7 or nAGQ=9 to see if that changes the estimates substantially.  The difference between the default evaluation of the deviance by the Laplace approximation and the more accurate adaptive Gauss-Hermite quadrature may be important in this case.

I just saw your other message about the properties of the random effects.  The values returned are the conditional modes of the random effects given the observed data.  The mean 0 and standard deviation of
2.3 apply to the marginal or unconditional distribution of the random effects, not the conditional distribution.