[R-sig-ME] rare binary outcome, MCMCglmm, and priors
David Atkins
datkins at u.washington.edu
Thu Sep 9 19:11:07 CEST 2010
On 9/8/10 3:00 PM, Ned Dochtermann wrote:
> Hi Dave,
> I was wondering if you happened to have finished these analyses and whether
> the greater variance in the priors "worked". I coincidentally have a paper
> in review (well, waiting on the AE recommendation actually-which I typically
> assume means impending rejection!) which I now realize likely had a similar
> problem to what you encountered. Like you I had some ridiculously large
> odds-ratios and I only came across the Gelman et al. (2008, Annals) paper
> after submission. I doubt the reanalysis would change the inferences but I'd
> prefer to have the analysis done correctly.
Hi Ned--
Yes, and apologies for not getting back to the list earlier.
The quick answer is yes: using MCMCglmm with tighter priors on the
fixed effects (relative to the defaults) helped with the extreme
coefficients and odds-ratios (output below).
Here's my summary (and Jarrod, please feel free to edit as appropriate..):
-- The default fixed-effects prior in MCMCglmm is MVN with a zero mean
vector and variance I*1e+10, which is huge and places essentially no
constraint on the fixed effects.
-- Following Jarrod's suggestion, I used the following prior:
prior2.1.1 = list(R = list(V = 1, fix = 1),
                  B = list(mu = rep(0, 9), V = diag(9) * (4.3 + pi^2/3)),
                  G = list(G1 = list(V = 1, nu = 0.002, alpha.mu = 0, alpha.V = 1000)))
where the key difference is the variance of the fixed-effects prior B.
This provides some constraint on the fixed effects, though comparing
that variance to the observed coefficients, I'm still comfortable that
it is not "dominating" the data.
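To be concrete, here is a sketch of the sort of call this prior goes
into (the model formula and data frame names match the output below;
the exact call is my reconstruction, not a paste from my script):

```r
library(MCMCglmm)

## Prior: residual variance fixed at 1 (binary outcome), fixed-effect
## prior variance 4.3 + pi^2/3 per Jarrod's suggestion, and a
## parameter-expanded prior (alpha.mu, alpha.V) on the random effect.
prior2.1.1 <- list(R = list(V = 1, fix = 1),
                   B = list(mu = rep(0, 9), V = diag(9) * (4.3 + pi^2/3)),
                   G = list(G1 = list(V = 1, nu = 0.002,
                                      alpha.mu = 0, alpha.V = 1000)))

## "categorical" is MCMCglmm's family for a binary (logit) outcome;
## slice = TRUE turns on slice sampling for the latent variables.
lr.mcmc1f <- MCMCglmm(ipv ~ ang5.c + alc.c * ang.c * prov.c,
                      random = ~ person,
                      family = "categorical",
                      data = ipv.df,
                      prior = prior2.1.1,
                      slice = TRUE,
                      nitt = 500000, burnin = 250000, thin = 250)
```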
-- Jarrod also referred to the Gelman et al. article, which is
definitely worth reading on separation and Bayesian priors. However,
Gelman et al. suggest a t prior, and in a back-channel exchange Jarrod
suggested that: (1) his earlier suggestion was in the same "spirit" as
the Gelman approach, but (2) actually implementing t priors in MCMCglmm
would be non-trivial. Jarrod did have the following suggestion, which I
didn't explore but pass along here in case someone wants to:
"One possibility (and I may be wrong so I wouldn't waste too much time
following it up) is to fit the fixed effects as random effects with
common variance. You can do this by passing ~idv(fixed formula) to the
random part of the model. Let's say the prior for the fixed effects in
your current model has null mean vector and covariance matrix I*v. In
the model suggested above you can fix the variance associated with the
idv term to v, and this would be identical to treating them as fixed
with prior I*v as in the first model. Now, if you don't fix the variance
associated with the idv term, but give it an inverse-Wishart prior, this
will induce a t-prior on the effects. It should be possible to choose a
prior for the variance associated with the idv term which results in a
posterior distribution that conforms to the t-distribution suggested by
Gelman. [snip] My gut feeling however is that it could be made to work,
but I'm not sure whether you would get a very different answer."
A few other reflections:
-- Using slice = TRUE *definitely* helped convergence in this case.
-- I did end up using the parameter expansion (alpha.mu, alpha.V) for
the random-effect, though it was less clear whether this had notable
influence.
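By "diagnostics" I mean the usual coda checks on the stored samples; a
sketch, assuming a fitted MCMCglmm object named lr.mcmc1f as above:

```r
library(coda)

## MCMCglmm stores fixed-effect samples in $Sol and (co)variance
## samples in $VCV, both as coda mcmc objects.
effectiveSize(lr.mcmc1f$Sol)    # effective sample size per parameter
autocorr.diag(lr.mcmc1f$VCV)    # lag autocorrelations for the variances

## With several chains saved as separate fits, something like:
## gelman.diag(mcmc.list(chain1$Sol, chain2$Sol))
```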
Below are outputs from MCMCglmm (nitt = 500000, burnin = 250000, thin =
250), which multiple chains and diagnostics suggest are reasonable
(though there is definitely still autocorrelation for some parameters),
as well as comparable output from glmer. The key differences are the
size of the random-effect variance and the smaller (and more
reasonable) coefficients from the MCMCglmm run.
At the end of the day, I think the underlying machinery of MCMCglmm was
quite helpful, though the outcome is still very rare, so we need to be
cautious in interpretation, etc.
Hope that helps.
cheers, Dave
> summary(lr.mcmc1f)
Iterations = 250001:499751
Thinning interval = 250
Sample size = 1000
DIC: 346.7383
G-structure: ~person
post.mean l-95% CI u-95% CI eff.samp
person 3.704 1.228 6.35 81.95
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 1 1 1 0
Location effects: ipv ~ ang5.c + alc.c * ang.c * prov.c
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) -8.58321 -9.60202 -7.57860 46.07 <0.001 ***
ang5.c 1.60969 0.18604 3.10334 674.87 0.032 *
alc.c 1.83108 -0.07322 3.63476 541.45 0.072 .
ang.c 1.63127 0.66607 2.62645 232.09 0.002 **
prov.c 2.08320 1.43703 2.64396 313.69 <0.001 ***
alc.c:ang.c -2.69393 -5.37519 -0.06124 1000.00 0.036 *
alc.c:prov.c -0.73255 -2.22607 0.79766 743.17 0.348
ang.c:prov.c -0.68224 -1.19169 -0.11578 229.32 0.008 **
alc.c:ang.c:prov.c 1.68279 0.03122 3.56480 1175.32 0.044 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> print(glmer.comp, cor = FALSE)
Generalized linear mixed model fit by the Laplace approximation
Formula: ipv ~ ang5.c + alc.c * ang.c * prov.c + (1 | person)
Data: ipv.df
AIC BIC logLik deviance
387.9 458.1 -183.9 367.9
Random effects:
Groups Name Variance Std.Dev.
person (Intercept) 6.6537 2.5795
Number of obs: 8320, groups: person, 183
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -8.8112 0.6133 -14.366 < 2e-16 ***
ang5.c 1.7139 0.7781 2.203 0.02761 *
alc.c 3.3495 1.2264 2.731 0.00631 **
ang.c 1.5704 0.6068 2.588 0.00966 **
prov.c 2.0618 0.3630 5.680 1.35e-08 ***
alc.c:ang.c -4.8751 1.9567 -2.492 0.01272 *
alc.c:prov.c -1.3401 0.9584 -1.398 0.16204
ang.c:prov.c -0.7178 0.2997 -2.395 0.01661 *
alc.c:ang.c:prov.c 2.6829 1.1900 2.255 0.02416 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 splines stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] MCMCglmm_2.06 corpcor_1.5.7 ape_2.5-3 coda_0.13-5
[5] tensorA_0.35 lme4_0.999375-35 Matrix_0.999375-44 lattice_0.19-11
[9] modeltools_0.2-16 mvtnorm_0.9-93 survival_2.36-1
loaded via a namespace (and not attached):
[1] coin_1.0-16 colorspace_1.0-1 gee_4.13-15 grid_2.11.1 nlme_3.1-96
[6] party_0.9-9998 rpart_3.1-46 tools_2.11.1
>
> Thanks a lot,
> Ned
>
> --
> Ned Dochtermann
> Department of Biology
> University of Nevada, Reno
>
> ned.dochtermann at gmail.com
> http://wolfweb.unr.edu/homepage/mpeacock/Dochter/
> --