[R-sig-ME] Zhang 2011 (re)analysis
Ben Bolker
bbolker at gmail.com
Tue Nov 1 14:05:54 CET 2011
On 11-11-01 06:21 AM, Reinhold Kliegl wrote:
> I compared the respiratory data with the original publication; Davis,
> C.S. (1991, Statistics in Medicine, 10, 1959-1980). From this it
> appears that
> data("respiratory", package = "geepack")
> contains exactly the data reported in Appendix I of Davis (1991).
>
> Note that for work with lme4, it is recommended that unique id numbers
> are assigned to subjects of the second center, e.g.:
> respiratory$id <- ifelse(respiratory$center==1, respiratory$id,
> respiratory$id+56)
Another way to do this is to make a new "id nested within center"
variable, i.e. idctr <- interaction(id,center) ...
> The second version of these data:
> data("respiratory", package = "HSAUR")
> differs from the original ones in two respects:
> (1) male and female labels are exchanged
> (2) for record 428, the outcome/status should be "poor" (0), not "good" (1)
>
> So now we should be ready to replicate Zhang et al. (2011) :)
thanks!
Ben Bolker
>
> Reinhold
>
> PS: The page numbers for Zhang et al. (2011) are 2562-2572.
> PPS: The Everitt & Hothorn-Book was published 2006 (not 2002, as I
> wrote earlier); the HSAUR package goes with this book.
>
> On Mon, Oct 31, 2011 at 6:44 AM, Ben Bolker <bbolker at gmail.com> wrote:
>> On 11-10-31 07:49 PM, Saang-Yoon Hyun wrote:
>>> Hi,
>>> Could you show the reference: e.g., authors, year, tilte, journal,
>>> pages, etc.? If you could share its PDF version of the paper, it would
>>> be more helpful.
>>> Thank you,
>>> Saang-Yoon
>>
>> DOI: 10.1002/sim.4265
>> On fitting generalized linear mixed-effects models for binary responses
>> using different statistical packages
>> Hui Zhang, Naiji Lu, Changyong Feng, Sally W. Thurston,
>> Yinglin Xia, Liang Zhua and Xin M. Tu
>>
>> Statistics in Medicine. Apparently no page numbers (online publication?)
>>
>>>
>>> On Mon, Oct 31, 2011 at 12:15 PM, Reinhold Kliegl
>>> <reinhold.kliegl at gmail.com <mailto:reinhold.kliegl at gmail.com>> wrote:
>>>
>>> Here is some comparison between glm, glmer (using lme4 with Laplace)
>>> and sabreR (which uses AGHQ, if I recall correctly).
>>> The glm analysis replicates exactly the results reported in Everitt
>>> and Hothorn (2002, Table 13.1, around page 175). Thus, I am pretty
>>> sure I am using the correct data.
>>>
>>> sabreR gives estimates both for the standard homogeneous model,
>>> replicating the glm(), as well as a random effects model, replicating
>>> glmer() pretty closely, I think
>>>
>>> Reinhold
>>>
>>>
>>> > # HSAUR contains respiratory data
>>> > data("respiratory", package = "HSAUR")
>>> >
>>> > # Convert pretest to covariate "baseline"
>>> > resp <- subset(respiratory, month > "0" )
>>> > resp$baseline <- rep(subset(respiratory, month == "0")$status, each=4)
>>> >
>>> > # Numeric variants of factors
>>> > resp$centr.b <- as.integer(resp$centre) - 1
>>> > resp$treat.b <- as.integer(resp$treatment) - 1
>>> > resp$sex.b <- as.integer(resp$sex)-1
>>> > resp$pre.b <- as.integer(resp$baseline) - 1
>>> > resp$status.b <- as.integer(resp$status) - 1
>>> > resp$id <- as.integer(resp$subject)
>>> >
>>> > # Pooled estimate
>>> > summary(m_glm.b <-glm(status.b ~ centr.b + treat.b + sex.b + pre.b
>>> + age, family="binomial", data=resp))
>>>
>>> Call:
>>> glm(formula = status.b ~ centr.b + treat.b + sex.b + pre.b +
>>> age, family = "binomial", data = resp)
>>>
>>> Deviance Residuals:
>>> Min 1Q Median 3Q Max
>>> -2.3146 -0.8551 0.4336 0.8953 1.9246
>>>
>>> Coefficients:
>>> Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) -0.900171 0.337653 -2.666 0.00768 **
>>> centr.b 0.671601 0.239567 2.803 0.00506 **
>>> treat.b 1.299216 0.236841 5.486 4.12e-08 ***
>>> sex.b 0.119244 0.294671 0.405 0.68572
>>> pre.b 1.882029 0.241290 7.800 6.20e-15 ***
>>> age -0.018166 0.008864 -2.049 0.04043 *
>>> ---
>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>
>>> (Dispersion parameter for binomial family taken to be 1)
>>>
>>> Null deviance: 608.93 on 443 degrees of freedom
>>> Residual deviance: 483.22 on 438 degrees of freedom
>>> AIC: 495.22
>>>
>>> Number of Fisher Scoring iterations: 4
>>>
>>> >
>>> > # glmer
>>> > library(lme4)
>>> > print(m_glmer_4.L <- glmer(status ~ centre + treatment + sex +
>>> baseline + age + (1|subject),
>>> + family=binomial,data=resp), cor=FALSE)
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: status ~ centre + treatment + sex + baseline + age + (1 |
>>> subject)
>>> Data: resp
>>> AIC BIC logLik deviance
>>> 443 471.7 -214.5 429
>>> Random effects:
>>> Groups Name Variance Std.Dev.
>>> subject (Intercept) 3.8647 1.9659 <tel:3.8647%20%20%201.9659>
>>> Number of obs: 444, groups: subject, 111
>>>
>>> Fixed effects:
>>> Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) -1.64438 0.75829 -2.169 0.0301 *
>>> centre2 1.04382 0.53193 1.962 0.0497 *
>>> treatmenttreatment 2.15746 0.51757 4.168 3.07e-05 ***
>>> sexmale 0.20194 0.66117 0.305 0.7600
>>> baselinegood 3.06990 0.52608 5.835 5.37e-09 ***
>>> age -0.02540 0.01998 -1.271 0.2037
>>> ---
>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>> >
>>> > # Pooled and clustered
>>> > library(sabreR)
>>> > attach(resp)
>>> > m_sabreR <- sabre(status.b ~ centr.b + treat.b + sex.b + pre.b +
>>> age, case=id)
>>> > print(m_sabreR)
>>> # ... deleted some output
>>> (Standard Homogenous Model)
>>> Parameter Estimate Std. Err. Z-score
>>> ____________________________________________________________________
>>> (intercept) -0.90017 0.33765 -2.6660
>>> centr.b 0.67160 0.23957 2.8034
>>> treat.b 1.2992 0.23684 5.4856
>>> sex.b 0.11924 0.29467 0.40467
>>> pre.b 1.8820 0.24129 7.7999
>>> age -0.18166E-01 0.88644E-02 -2.0493
>>>
>>>
>>>
>>> (Random Effects Model)
>>> Parameter Estimate Std. Err. Z-score
>>> ____________________________________________________________________
>>> (intercept) -1.6642 0.84652 -1.9660
>>> centr.b 0.99044 0.56561 1.7511
>>> treat.b 2.1265 0.57198 3.7177
>>> sex.b 0.18166 0.70814 0.25653
>>> pre.b 2.9987 0.60174 4.9834
>>> age -0.22949E-01 0.21337E-01 -1.0755
>>> scale 1.9955 0.32093 6.2180
>>>
>>> # ... deleted some output
>>> > detach()
>>>
>>> On Mon, Oct 31, 2011 at 2:10 AM, Ben Bolker <bbolker at gmail.com
>>> <mailto:bbolker at gmail.com>> wrote:
>>> > On 11-10-31 05:22 AM, Reinhold Kliegl wrote:
>>> >> One problem appears to be that 111 id's are renumbered from 1 to 55
>>> >> (56) in the two groups.
>>> >> Unfortunately, it also appears that there is no unique mapping to
>>> >> treatment groups. So there are some subjects with 8 values
>>> assigned to
>>> >> one of the groups.
>>> >
>>> > Thanks. It looks like IDs are nested within center (not within
>>> > treatment). That doesn't seem to change the story very much (as
>>> far as
>>> > , though (Zhang et al don't report estimated random-effect
>>> variances ...)
>>> >
>>> >
>>> >
>>> >>> library(geepack)
>>> >>> data(respiratory)
>>> >>> resp1 <- respiratory
>>> >>> resp1 <- transform(resp1,
>>> >> + center=factor(center),
>>> >> + id=factor(id))
>>> >>>
>>> >>> str(resp1)
>>> >> 'data.frame': 444 obs. of 8 variables:
>>> >> $ center : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>>> >> $ id : Factor w/ 56 levels "1","2","3","4",..: 1 1 1 1 2 2
>>> 2 2 3 3 ...
>>> >> $ treat : Factor w/ 2 levels "A","P": 2 2 2 2 2 2 2 2 1 1 ...
>>> >> $ sex : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
>>> >> $ age : int 46 46 46 46 28 28 28 28 23 23 ...
>>> >> $ baseline: int 0 0 0 0 0 0 0 0 1 1 ...
>>> >> $ visit : int 1 2 3 4 1 2 3 4 1 2 ...
>>> >> $ outcome : int 0 0 0 0 0 0 0 0 1 1 ...
>>> >>> detach("package:geepack") ## allow detaching of doBy
>>> >>> detach("package:doBy") ## allow detaching of lme4
>>> >>
>>> >> The data appear also in the HSAUR package, here the 111 subjects
>>> >> identified with 5 months (visits) each. I suspect month 0 was
>>> used as
>>> >> baseline.
>>> >>> library(HSAUR)
>>> >>> data(respiratory)
>>> >>> resp2 <- respiratory
>>> >>>
>>> >>> str(resp2)
>>> >> 'data.frame': 555 obs. of 7 variables:
>>> >> $ centre : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>>> >> $ treatment: Factor w/ 2 levels "placebo","treatment": 1 1 1 1 1
>>> 1 1 1 1 1 ...
>>> >> $ sex : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1
>>> 1 1 ...
>>> >> $ age : num 46 46 46 46 46 28 28 28 28 28 ...
>>> >> $ status : Factor w/ 2 levels "poor","good": 1 1 1 1 1 1 1 1 1
>>> 1 ...
>>> >> $ month : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 1 2 3 4
>>> 5 1 2 3 4 5 ...
>>> >> $ subject : Factor w/ 111 levels "1","2","3","4",..: 1 1 1 1 1
>>> 2 2 2 2 2 ...
>>> >>
>>> >> Reinhold
>>> >>
>>> >> On Sun, Oct 30, 2011 at 10:00 PM, Ben Bolker <bbolker at gmail.com
>>> <mailto:bbolker at gmail.com>> wrote:
>>> >>>
>>> >>> There's a fairly recent paper by Zhang et al (2011) of interest to
>>> >>> folks on this list
>>> >>>
>>> >>> DOI: 10.1002/sim.4265
>>> >>>
>>> >>> In response to a post on the AD Model Builder users' list, I took a
>>> >>> quick shot at re-doing some of their results (they have extensive
>>> >>> simulation results, which I haven't tried to replicate yet, and an
>>> >>> analysis of binary data from Davis (1991) which is included (I
>>> *think*
>>> >>> it's the same data set -- the description and size of the data
>>> set match
>>> >>> exactly) in the geepack data set).
>>> >>>
>>> >>> If anyone's interested, my results so far are posted at
>>> >>>
>>> >>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.Rnw
>>> >>> http://glmm.wikidot.com/local--files/examples/Zhang_reanalysis.pdf
>>> >>>
>>> >>> So far the R approaches I've tried agree closely with each
>>> other and
>>> >>> with glmmADMB (except MASS::glmmPQL, which I expected to be
>>> different --
>>> >>> the rest all use either Laplace approx. or AGHQ). They *don't*
>>> agree
>>> >>> with the results Zhang et al got, yet -- I'm sure there's
>>> something I'm
>>> >>> missing in the contrasts or otherwise ...
>>> >>>
>>> >>> Suggestions or improvements are welcome.
>>> >>>
>>> >>> cheers
>>> >>> Ben Bolker
>>> >>>
>>> >>> _______________________________________________
>>> >>> R-sig-mixed-models at r-project.org
>>> <mailto:R-sig-mixed-models at r-project.org> mailing list
>>> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> >>>
>>> >
>>> >
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org
>>> <mailto:R-sig-mixed-models at r-project.org> mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>
>>
More information about the R-sig-mixed-models
mailing list