[R-sig-ME] rare binary outcome, MCMCglmm, and priors (related to separation)

David Duffy davidD at qimr.edu.au
Mon Aug 30 23:58:23 CEST 2010


On Mon, 30 Aug 2010, David Atkins wrote:

>
> Some colleagues have collected data from 184 females in dating relationships. 
> Data were collected daily using PDAs; the outcome is a binary indicator of 
> whether any physical aggression occurred (intimate partner violence, or IPV).
>
> They are interested in 3 covariates:
>
> -- alcohol use: yes/no
> -- anger: rated on 1-5 scale
> -- verbal aggression: sum of handful of items, with 0-15 scale
>
> Their hypothesis is that the interaction of all 3 covariates will lead to the 
> highest likelihood of IPV.  As you might expect, the outcome is very rare 
> with 51 instances of IPV out of 8,269 days of data, and 158 women (out of 
> 184) reported no instances of IPV.
>
> I have read a bit about the problems of separation in logistic regression and 
> know that Gelman et al suggest Bayesian priors as one "solution".  Moreover, 
> I see in Jarrod Hadfield's course notes that his multinomial example has a 
> "structural" zero that he addresses via priors on pp. 96-97, though I confess 
> I don't quite follow exactly what he has done (and why).
>

Hi. Why are you using a mixed model here: is it for dispersion, or are there 
multiple reports per individual?  Another approach for separated/sparse 
data that is implemented in R is penalized likelihood, available in the 
brlr, logistf, and brglm packages (and in Design):

brglm(formula = cbind(ipv.yes, ipv.no) ~ (ang.cut + prov.cut +
     alc.cut)^2, family = binomial(), data = ipv)

Coefficients: (1 not defined because of singularities)
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -8.9666     1.4145  -6.339 2.31e-10 ***
ang.cut            2.8959     1.4775   1.960  0.05000 .
prov.cut           2.3740     0.4587   5.175 2.27e-07 ***
alc.cut            7.8680     2.7082   2.905  0.00367 **
ang.cut:prov.cut       NA         NA      NA       NA
ang.cut:alc.cut   -7.0703     2.8616  -2.471  0.01348 *
prov.cut:alc.cut  -0.4007     0.9962  -0.402  0.68747

Model 1: cbind(ipv.yes, ipv.no) ~ (ang.cut + prov.cut + alc.cut)
Model 2: cbind(ipv.yes, ipv.no) ~ (ang.cut + prov.cut + alc.cut)^2
   Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1         2     1.0875
2         0     1.8387  2 -0.75117
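
For anyone wanting to try the same kind of fit, here is a rough sketch of
how day-level records could be collapsed into the cbind(ipv.yes, ipv.no)
layout used above. Everything in it is an assumption for illustration: the
data are simulated, the cut-points are invented, and only the column names
are borrowed from the output above. The brglm call is skipped if that
package is not installed.

```r
# Simulated day-level data; values and effect sizes are made up for
# illustration and are NOT the study data.
set.seed(1)
n <- 2000
daily <- data.frame(
  ang.cut  = rbinom(n, 1, 0.3),  # anger, dichotomised (hypothetical cut)
  prov.cut = rbinom(n, 1, 0.3),  # verbal aggression, dichotomised (hypothetical cut)
  alc.cut  = rbinom(n, 1, 0.2)   # alcohol use yes/no
)
eta <- -5 + 1.5 * daily$ang.cut + 2 * daily$prov.cut + 1 * daily$alc.cut
daily$ipv <- rbinom(n, 1, plogis(eta))  # rare binary outcome

# Event and non-event indicators, then one row per covariate pattern.
daily$ipv.yes <- daily$ipv
daily$ipv.no  <- 1 - daily$ipv
ipv <- aggregate(cbind(ipv.yes, ipv.no) ~ ang.cut + prov.cut + alc.cut,
                 data = daily, FUN = sum)

# Ordinary maximum likelihood fit on the grouped counts (base R).
fit.ml <- glm(cbind(ipv.yes, ipv.no) ~ ang.cut + prov.cut + alc.cut,
              family = binomial(), data = ipv)
print(coef(fit.ml))

# Bias-reduced (penalized likelihood) fit, only if brglm is available.
if (requireNamespace("brglm", quietly = TRUE)) {
  fit.br <- brglm::brglm(cbind(ipv.yes, ipv.no) ~ ang.cut + prov.cut + alc.cut,
                         family = binomial(), data = ipv)
  print(coef(fit.br))
}
```

Note that collapsing over days like this ignores the clustering of days
within women, which is the point of the mixed-model question above.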

Cheers, David Duffy.
-- 
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v



