[R-sig-ME] Zhang 2011 (re)analysis
Reinhold Kliegl
reinhold.kliegl at gmail.com
Mon Oct 31 17:15:45 CET 2011
Here is some comparison between glm, glmer (using lme4 with Laplace)
and sabreR (which uses AGHQ, if I recall correctly).
The glm analysis replicates exactly the results reported in Everitt
and Hothorn (2002, Table 13.1, around page 175). Thus, I am pretty
sure I am using the correct data.
sabreR gives estimates both for the standard homogeneous model,
replicating the glm(), as well as a random effects model, replicating
glmer() pretty closely, I think
Reinhold
> # HSAUR contains respiratory data
> data("respiratory", package = "HSAUR")
>
> # Convert pretest to covariate "baseline"
> resp <- subset(respiratory, month > "0" )
> resp$baseline <- rep(subset(respiratory, month == "0")$status, each=4)
>
> # Numeric variants of factors
> resp$centr.b <- as.integer(resp$centre) - 1
> resp$treat.b <- as.integer(resp$treatment) - 1
> resp$sex.b <- as.integer(resp$sex)-1
> resp$pre.b <- as.integer(resp$baseline) - 1
> resp$status.b <- as.integer(resp$status) - 1
> resp$id <- as.integer(resp$subject)
>
> # Pooled estimate
> summary(m_glm.b <-glm(status.b ~ centr.b + treat.b + sex.b + pre.b + age, family="binomial", data=resp))
Call:
glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b +
age, family = "binomial", data = resp)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.3146 -0.8551 0.4336 0.8953 1.9246
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.900171 0.337653 -2.666 0.00768 **
centr.b 0.671601 0.239567 2.803 0.00506 **
treat.b 1.299216 0.236841 5.486 4.12e-08 ***
sex.b 0.119244 0.294671 0.405 0.68572
pre.b 1.882029 0.241290 7.800 6.20e-15 ***
age -0.018166 0.008864 -2.049 0.04043 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 608.93 on 443 degrees of freedom
Residual deviance: 483.22 on 438 degrees of freedom
AIC: 495.22
Number of Fisher Scoring iterations: 4
>
> # glmer
> library(lme4)
> print(m_glmer_4.L <- glmer(status ~ centre + treatment + sex + baseline + age + (1|subject),
+ family=binomial,data=resp), cor=FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject)
Data: resp
AIC BIC logLik deviance
443 471.7 -214.5 429
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 3.8647 1.9659
Number of obs: 444, groups: subject, 111
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.64438 0.75829 -2.169 0.0301 *
centre2 1.04382 0.53193 1.962 0.0497 *
treatmenttreatment 2.15746 0.51757 4.168 3.07e-05 ***
sexmale 0.20194 0.66117 0.305 0.7600
baselinegood 3.06990 0.52608 5.835 5.37e-09 ***
age -0.02540 0.01998 -1.271 0.2037
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> # Pooled and clustered
> library(sabreR)
> attach(resp)
> m_sabreR <- sabre(status.b ~ centr.b + treat.b + sex.b + pre.b + age, case=id)
> print(m_sabreR)
# ... deleted some output
(Standard Homogenous Model)
Parameter Estimate Std. Err. Z-score
____________________________________________________________________
(intercept) -0.90017 0.33765 -2.6660
centr.b 0.67160 0.23957 2.8034
treat.b 1.2992 0.23684 5.4856
sex.b 0.11924 0.29467 0.40467
pre.b 1.8820 0.24129 7.7999
age -0.18166E-01 0.88644E-02 -2.0493
(Random Effects Model)
Parameter Estimate Std. Err. Z-score
____________________________________________________________________
(intercept) -1.6642 0.84652 -1.9660
centr.b 0.99044 0.56561 1.7511
treat.b 2.1265 0.57198 3.7177
sex.b 0.18166 0.70814 0.25653
pre.b 2.9987 0.60174 4.9834
age -0.22949E-01 0.21337E-01 -1.0755
scale 1.9955 0.32093 6.2180
# ... deleted some output
> detach()
On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com> wrote:
> On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
>> One problem appears to be that 111 id's are renumbered from 1 to 55
>> (56) in the two groups.
>> Unfortunately, it also appears that there is no unique mapping to
>> treatment groups. So there are some subjects with 8 values assigned to
>> one of the groups.
>
> Thanks. It looks like IDs are nested within center (not within
> treatment). That doesn't seem to change the story very much (as far as
> , though (Zhang et al don't report estimated random-effect variances ...)
>
>
>
>>> library(geepack)
>>> data(respiratory)
>>> resp1 <- respiratory
>>> resp1 <- transform(resp1,
>> + center=factor(center),
>> + id=factor(id))
>>>
>>> str(resp1)
>> 'data.frame': 444 obs. of 8 variables:
>> $ center : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>> $ id : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
>> $ treat : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ...
>> $ sex : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
>> $ age : int 46 46 46 46 28 28 28 28 23 23 ...
>> $ baseline: int 0 0 0 0 0 0 0 0 1 1 ...
>> $ visit : int 1 2 3 4 1 2 3 4 1 2 ...
>> $ outcome : int 0 0 0 0 0 0 0 0 1 1 ...
>>> detach("package:geepack") ## allow detaching of doBy
>>> detach("package:doBy") ## allow detaching of lme4
>>
>> The data appear also in the HSAUR package, here the 111 subjects
>> identified with 5 months (visits) each. I suspect month 0 was used as
>> baseline.
>>> library(HSAUR)
>>> data(respiratory)
>>> resp2 <- respiratory
>>>
>>> str(resp2)
>> 'data.frame': 555 obs. of 7 variables:
>> $ centre : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>> $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1 1 1 ...
>> $ sex : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ...
>> $ age : num 46 46 46 46 46 28 28 28 28 28 ...
>> $ status : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ...
>> $ month : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2 3 4 5 ...
>> $ subject : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...
>>
>> Reinhold
>>
>> On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
>>>
>>> There's a fairly recent paper by Zhang et al (2011) of interest to
>>> folks on this list
>>>
>>> DOI: 10.1002/sim.4265
>>>
>>> In response to a post on the AD Model Builder users' list, I took a
>>> quick shot at re-doing some of their results (they have extensive
>>> simulation results, which I haven't tried to replicate yet, and an
>>> analysis of binary data from Davis (1991) which is included (I *think*
>>> it's the same data set -- the description and size of the data set match
>>> exactly) in the geepack data set).
>>>
>>> If anyone's interested, my results so far are posted at
>>>
>>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw
>>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf
>>>
>>> So far the R approaches I've tried agree closely with each other and
>>> with glmmADMB (except MASS::glmmPQL, which I expected to be different --
>>> the rest all use either Laplace approx. or AGHQ). They *don't* agree
>>> with the results Zhang et al got, yet -- I'm sure there's something I'm
>>> missing in the contrasts or otherwise ...
>>>
>>> Suggestions or improvements are welcome.
>>>
>>> cheers
>>> Ben Bolker
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>
>
More information about the R-sig-mixed-models
mailing list