[R-sig-ME] Zhang 2011 (re)analysis

Juliet Hannah juliet.hannah at gmail.com
Tue Nov 1 04:10:52 CET 2011


The reference is given on the last page of the pdf from the original post.

Zhang, H., N. Lu, C. Feng, S.W. Thurston, Y. Xia, L. Zhu, and X. M. Tu (2011).
On fitting generalized linear mixed-effects models for binary responses using
different statistical packages. Statistics in Medicine.

On Mon, Oct 31, 2011 at 7:49 PM, Saang-Yoon Hyun <shyunuw at gmail.com> wrote:
> Hi,
> Could you show the reference: e.g., authors, year, tilte, journal, pages,
> etc.?   If you could share its PDF version of the paper, it would be more
> helpful.
> Thank you,
> Saang-Yoon
>
> On Mon, Oct 31, 2011 at 12:15 PM, Reinhold Kliegl <reinhold.kliegl at gmail.com
>> wrote:
>
>> Here is some comparison between glm,  glmer (using lme4 with Laplace)
>> and sabreR (which uses AGHQ, if I recall correctly).
>> The glm analysis replicates exactly the results reported in Everitt
>> and Hothorn (2002, Table 13.1, around page 175).  Thus, I am pretty
>> sure I am using the correct data.
>>
>>  sabreR gives estimates both for the standard homogeneous model,
>> replicating the glm(),  as well as a random effects model, replicating
>> glmer() pretty closely, I think
>>
>> Reinhold
>>
>>
>> > # HSAUR contains respiratory data
>> > data("respiratory", package = "HSAUR")
>> >
>> > # Convert pretest to covariate "baseline"
>> > resp <- subset(respiratory, month > "0" )
>> > resp$baseline <- rep(subset(respiratory, month == "0")$status, each=4)
>> >
>> > # Numeric variants of factors
>> > resp$centr.b <- as.integer(resp$centre) - 1
>> > resp$treat.b <- as.integer(resp$treatment) - 1
>> > resp$sex.b <- as.integer(resp$sex)-1
>> > resp$pre.b <- as.integer(resp$baseline) - 1
>> > resp$status.b <- as.integer(resp$status) - 1
>> > resp$id <- as.integer(resp$subject)
>> >
>> > # Pooled estimate
>> > summary(m_glm.b <-glm(status.b ~ centr.b + treat.b + sex.b + pre.b +
>> age, family="binomial", data=resp))
>>
>> Call:
>> glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b +
>>    age, family = "binomial", data = resp)
>>
>> Deviance Residuals:
>>    Min       1Q   Median       3Q      Max
>> -2.3146  -0.8551   0.4336   0.8953   1.9246
>>
>> Coefficients:
>>             Estimate Std. Error z value Pr(>|z|)
>> (Intercept) -0.900171   0.337653  -2.666  0.00768 **
>> centr.b      0.671601   0.239567   2.803  0.00506 **
>> treat.b      1.299216   0.236841   5.486 4.12e-08 ***
>> sex.b        0.119244   0.294671   0.405  0.68572
>> pre.b        1.882029   0.241290   7.800 6.20e-15 ***
>> age         -0.018166   0.008864  -2.049  0.04043 *
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> (Dispersion parameter for binomial family taken to be 1)
>>
>>    Null deviance: 608.93  on 443  degrees of freedom
>> Residual deviance: 483.22  on 438  degrees of freedom
>> AIC: 495.22
>>
>> Number of Fisher Scoring iterations: 4
>>
>> >
>> > # glmer
>> > library(lme4)
>> > print(m_glmer_4.L <- glmer(status ~ centre + treatment + sex + baseline
>> + age + (1|subject),
>> +       family=binomial,data=resp), cor=FALSE)
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject)
>>   Data: resp
>>  AIC   BIC logLik deviance
>>  443 471.7 -214.5      429
>> Random effects:
>>  Groups  Name        Variance Std.Dev.
>>  subject (Intercept) 3.8647 1.9659
>> Number of obs: 444, groups: subject, 111
>>
>> Fixed effects:
>>                   Estimate Std. Error z value Pr(>|z|)
>> (Intercept)        -1.64438    0.75829  -2.169   0.0301 *
>> centre2             1.04382    0.53193   1.962   0.0497 *
>> treatmenttreatment  2.15746    0.51757   4.168 3.07e-05 ***
>> sexmale             0.20194    0.66117   0.305   0.7600
>> baselinegood        3.06990    0.52608   5.835 5.37e-09 ***
>> age                -0.02540    0.01998  -1.271   0.2037
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> >
>> > # Pooled and clustered
>> > library(sabreR)
>> > attach(resp)
>> > m_sabreR <- sabre(status.b ~ centr.b + treat.b + sex.b + pre.b + age,
>> case=id)
>> > print(m_sabreR)
>> # ... deleted some output
>> (Standard Homogenous Model)
>>    Parameter              Estimate         Std. Err.        Z-score
>>    ____________________________________________________________________
>>    (intercept)           -0.90017          0.33765          -2.6660
>>    centr.b                0.67160          0.23957           2.8034
>>    treat.b                 1.2992          0.23684           5.4856
>>    sex.b                  0.11924          0.29467          0.40467
>>    pre.b                   1.8820          0.24129           7.7999
>>    age                   -0.18166E-01      0.88644E-02      -2.0493
>>
>>
>>
>> (Random Effects Model)
>>    Parameter              Estimate         Std. Err.        Z-score
>>    ____________________________________________________________________
>>    (intercept)            -1.6642          0.84652          -1.9660
>>    centr.b                0.99044          0.56561           1.7511
>>    treat.b                 2.1265          0.57198           3.7177
>>    sex.b                  0.18166          0.70814          0.25653
>>    pre.b                   2.9987          0.60174           4.9834
>>    age                   -0.22949E-01      0.21337E-01      -1.0755
>>    scale                   1.9955          0.32093           6.2180
>>
>> # ... deleted some output
>> > detach()
>>
>> On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com> wrote:
>> > On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
>> >> One problem appears to be that 111 id's are renumbered from 1 to 55
>> >> (56) in the two groups.
>> >> Unfortunately, it also appears that there is no unique mapping to
>> >> treatment groups. So there are some subjects with 8 values assigned to
>> >> one of the groups.
>> >
>> >  Thanks.  It looks like IDs are nested within center (not within
>> > treatment).  That doesn't seem to change the story very much (as far as
>> > , though (Zhang et al don't report estimated random-effect variances ...)
>> >
>> >
>> >
>> >>> library(geepack)
>> >>> data(respiratory)
>> >>> resp1 <- respiratory
>> >>> resp1 <- transform(resp1,
>> >> +                          center=factor(center),
>> >> +                          id=factor(id))
>> >>>
>> >>> str(resp1)
>> >> 'data.frame': 444 obs. of  8 variables:
>> >>  $ center  : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>> >>  $ id      : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3
>> 3 ...
>> >>  $ treat   : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ...
>> >>  $ sex     : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
>> >>  $ age     : int  46 46 46 46 28 28 28 28 23 23 ...
>> >>  $ baseline: int  0 0 0 0 0 0 0 0 1 1 ...
>> >>  $ visit   : int  1 2 3 4 1 2 3 4 1 2 ...
>> >>  $ outcome : int  0 0 0 0 0 0 0 0 1 1 ...
>> >>> detach("package:geepack") ## allow detaching of doBy
>> >>> detach("package:doBy")    ## allow detaching of lme4
>> >>
>> >> The data appear also in the HSAUR package, here the 111 subjects
>> >> identified with 5 months (visits) each.  I suspect month 0 was used as
>> >> baseline.
>> >>> library(HSAUR)
>> >>> data(respiratory)
>> >>> resp2 <- respiratory
>> >>>
>> >>> str(resp2)
>> >> 'data.frame': 555 obs. of  7 variables:
>> >>  $ centre   : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>> >>  $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1 1 1 1
>> 1 1 ...
>> >>  $ sex      : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1
>> ...
>> >>  $ age      : num  46 46 46 46 46 28 28 28 28 28 ...
>> >>  $ status   : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1 1 ...
>> >>  $ month    : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4 5 1 2
>> 3 4 5 ...
>> >>  $ subject  : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2
>> 2 2 ...
>> >>
>> >> Reinhold
>> >>
>> >> On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
>> >>>
>> >>>  There's a fairly recent paper by Zhang et al (2011) of interest to
>> >>> folks on this list
>> >>>
>> >>> DOI: 10.1002/sim.4265
>> >>>
>> >>>  In response to a post on the AD Model Builder users' list, I took a
>> >>> quick shot at re-doing some of their results (they have extensive
>> >>> simulation results, which I haven't tried to replicate yet, and an
>> >>> analysis of binary data from Davis (1991) which is included (I *think*
>> >>> it's the same data set -- the description and size of the data set
>> match
>> >>> exactly) in the geepack data set).
>> >>>
>> >>>  If anyone's interested, my results so far are posted at
>> >>>
>> >>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw
>> >>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf
>> >>>
>> >>>  So far the R approaches I've tried agree closely with each other and
>> >>> with glmmADMB (except MASS::glmmPQL, which I expected to be different
>> --
>> >>> the rest all use either Laplace approx. or AGHQ).  They *don't* agree
>> >>> with the results Zhang et al got, yet -- I'm sure there's something I'm
>> >>> missing in the contrasts or otherwise ...
>> >>>
>> >>>  Suggestions or improvements are welcome.
>> >>>
>> >>>  cheers
>> >>>    Ben Bolker
>> >>>
>> >>> _______________________________________________
>> >>> R-sig-mixed-models at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >>>
>> >
>> >
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>        [[alternative HTML version deleted]]
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>




More information about the R-sig-mixed-models mailing list