[R-sig-ME] Grouped vs ungrouped binary data

Wed Apr 13 18:00:35 CEST 2011

Hello all,

I am trying to analyze binary matched data about approval rates of a
prime minister expressed by the same subjects on two different surveys
six months apart (example and data coming from Agresti, Sec. 12.1.3, p.
494). Following Agresti, and common sense, I want to fit the model

            Approval ~ survey + (1 | id)

I tried to analyze the data in a grouped form and then in a an ungrouped
form, but I am getting different results. Furthermore, Agresti has ML
estimate that are different from what I get in either way. Could anybody
help me understanding what I am doing wrong or I am not getting?

Here I set up the data set and fit the model in grouped form, using the
weights argument to glmer:

> approval <- matrix(c(794, 86, 150, 570), 2, 2,
+                    dimnames = list(First = c("Approve", "Disapprove"),
+                    Second = c("Approve", "Disapprove")))
> approval.wide <- as.data.frame(ftable(approval))
> approval.wide
       First     Second Freq
1    Approve    Approve  794
2 Disapprove    Approve   86
3    Approve Disapprove  150
4 Disapprove Disapprove  570
> approval.long <- reshape(approval.wide,
+                          timevar = "survey",
+                          times = c("First", "Second"),
+                          v.names = "Approval",
+                          varying = c("First", "Second"),
+                          direction = "long")
> approval.long
         Freq survey   Approval id
1.First   794  First    Approve  1
2.First    86  First Disapprove  2
3.First   150  First    Approve  3
4.First   570  First Disapprove  4
1.Second  794 Second    Approve  1
2.Second   86 Second    Approve  2
3.Second  150 Second Disapprove  3
4.Second  570 Second Disapprove  4
> ## fit the model
> summary(fit1 <- glmer(Approval ~ survey + (1 | id), family = binomial,
+                       data = approval.long, weights = Freq, nAGQ = 5))
Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation 
Formula: Approval ~ survey + (1 | id) 
   Data: approval.long 
   AIC   BIC logLik deviance
 651.1 651.4 -322.6    645.1
Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 75.732   8.7024  
Number of obs: 8, groups: id, 4

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -0.6251     4.4607  -0.140    0.889    
surveySecond   1.1144     0.1911   5.831 5.53e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Correlation of Fixed Effects:
            (Intr)
surveySecnd -0.021
Warning message:
In model.matrix.default(mt, mf, contrasts) :
  variable 'survey' converted to a factor

Then I try to fit the model on subject specific data:

> approval.subject <- data.frame(sapply(approval.long[, 2 : 3],
+                                       function(x) rep(x, approval.long$Freq)),
+                                id = rep(1 : sum(approval), 2))
> approval.subject$Approval <- factor(approval.subject$Approval,
+                                     labels = c("Approve", "Disapprove"))
> ## and fit the model
> summary(fit2 <- glmer(Approval ~ survey + (1 | id), family = binomial,
+                       data = approval.subject, nAGQ = 5))
Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation 
Formula: Approval ~ survey + (1 | id) 
   Data: approval.subject 
  AIC  BIC logLik deviance
 4362 4380  -2178     4356
Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 11.230   3.3511  
Number of obs: 3200, groups: id, 1600

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -1.0093     0.1155  -8.742  < 2e-16 ***
surveySecond   0.4169     0.1075   3.880 0.000104 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Correlation of Fixed Effects:
            (Intr)
surveySecnd -0.476

As you can see, I get different estimates for both the fixed effects and
the variance of the random effect. For comparison, Agresti reports the
MLE of the fixed effect (survey) to be -0.556 and the variance of the
random effect to be (5.16)^2. 

Thank you in advance,
Giovanni Petris

-- 

Giovanni Petris  <GPetris at uark.edu>
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/