[R-sig-ME] Precision about the glmer model for Bernoulli variables
john.maindonald at anu.edu.au
Tue Apr 21 09:59:48 CEST 2020
Actually, with a suitable parameterization, the betabinomial allows
small negative correlations. See
R. L. Prentice (1986), "Binary Regression Using an Extended Beta-Binomial Distribution, With Discussion of Correlation Induced by Covariate Measurement Errors", Journal of the American Statistical Association, Vol. 81, No. 394, pp. 321-327.
A feature of the betabinomial (BB), which ought to be better advertised than
it is in most accounts that I have seen, is that, for positive correlation rho,
it sets a strict lower limit of pi(1-pi)(1+rho) on the variance of the estimate
of the proportion pi. This is in marked contrast to the variance
assumptions that define quasibinomial errors.
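The practical consequence of that overdispersion, a variance floor that extra observations per cluster cannot remove, is easy to see numerically. A minimal Python sketch (mine, not part of the thread), assuming the usual beta-binomial parameterization Var(X/n) = pi(1-pi)[1+(n-1)rho]/n:

```python
def var_bb_prop(pi, rho, n):
    """Variance of the observed proportion X/n when X is beta-binomial
    with n trials, mean pi, and intraclass correlation rho."""
    return pi * (1 - pi) * (1 + (n - 1) * rho) / n

pi, rho = 0.3, 0.05
for n in (10, 100, 1000, 10**6):
    print(n, var_bb_prop(pi, rho, n))
```

For positive rho the variance approaches pi*(1-pi)*rho from above as n grows, so no amount of within-cluster replication drives it to zero, whereas the binomial variance pi*(1-pi)/n vanishes.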
The gamlss package implements the double binomial (as well
as the logistic normal and BB), but as far as I can tell, not in a
multi-level model context. The double binomial does allow a
degree of underdispersion.
John Maindonald email: john.maindonald at anu.edu.au
On 21/04/2020, at 19:18, David Duffy <David.Duffy at qimrberghofer.edu.au> wrote:
You'd know the beta-binomial constrains the correlation coefficient to be positive, but the logit-normal and probit-normal allow a negative correlation, with a lower bound that depends on N in the case of equal-sized clusters. For your example of 2x2 tables, the correlation can go from -1 to 1. This is the usual genetics-type setup, where you model the pairwise correlations for all pairs of observations in the dataset against their
(pairwise) genetic relatedness. In multi-trait models, the correlations between different phenotypes (say asthma and diabetes) in the same individual can be negative, so you can estimate negative variance components.
The pedigreemm package extends lme4 to allow glmms for pedigree data (see the mastitis example for a binary trait).
Cheers, David Duffy.
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Vaida, Florin <fvaida at health.ucsd.edu>
Sent: Tuesday, 21 April 2020 7:05:40 AM
To: Emmanuel Curis
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Precision about the glmer model for Bernoulli variables
That's a good question. My guess is that the correlation is non-negative generally, but I wasn't able to prove it theoretically even in the simplest case where Y1, Y2 ~ Bernoulli(f(u)) are independent conditionally on u, with u ~ Normal(0, 1) and f the inverse link. I am curious if someone has a solution.
We can't go too far down this route in this forum, since Doug wants to keep it applied.
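The conjecture is easy to check numerically. A Python sketch (mine, not from the thread), assuming a logit inverse link f and u ~ N(0, sigma^2), with Gauss-Hermite quadrature for the normal integrals:

```python
import numpy as np

def pair_cov(sigma=1.0, n_nodes=80):
    """Cov(Y1, Y2) when Y1, Y2 | u are iid Bernoulli(f(u)),
    u ~ N(0, sigma^2), f = inverse logit."""
    # Probabilist's Gauss-Hermite nodes/weights; normalise weights
    # so they sum to 1 (expectation under N(0, 1)).
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = weights / weights.sum()
    f = 1.0 / (1.0 + np.exp(-sigma * nodes))
    # E f(u)^2 - (E f(u))^2, i.e. P(Y1=1, Y2=1) - pi^2
    return np.sum(w * f**2) - np.sum(w * f)**2

for s in (0.1, 0.5, 1.0, 2.0):
    print(f"sigma = {s}: cov = {pair_cov(s):.6f}")
```

Note that under conditional independence the covariance is E f(u)^2 - (E f(u))^2 = Var(f(u)), a variance, so it is non-negative for any link and any random-effect distribution.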
On Apr 20, 2020, at 12:32 PM, Emmanuel Curis <emmanuel.curis at parisdescartes.fr> wrote:
Thanks for the answer, the clarification about p(i,j), and the reference.
A last question, that I forgot in my message: is the obtained
correlation also always positive, as in the linear case? Or may some
negative correlation appear, depending on the values of pi(i,j) and
the distribution of Z(i)?
On Mon, Apr 20, 2020 at 03:27:39PM +0000, Vaida, Florin wrote:
« Hi Emmanuel,
« Your reasoning is correct.
« As a quibble, outside a simple repeated measures experiment setup, p(i,j) *does* depend on j.
« For example, if observations are collected over time, generally there is a time effect; if they are repeated measures with different experimental conditions, p(i,j) will depend on the condition j, etc.
« There is almost certainly no closed form solution for the covariance under logit.
« I am not sure about the probit (my guess is not).
« There will be some Laplace approximations available, a la Breslow and Clayton 1993.
« I'd be curious if these formulas/approximations were developed somewhere - I'd be surprised if they weren't.
« > On Apr 20, 2020, at 12:48 AM, Emmanuel Curis <emmanuel.curis at parisdescartes.fr> wrote:
« > Hello everyone,
« > I hope you're all going fine in these difficult times.
« > I tried to understand in detail the exact model used by glmer
« > for a Bernoulli experiment, by comparison with the linear mixed
« > effects model, and especially how it introduces correlations between
« > observations of a given group. I think I finally got it, but could
« > you check that what I write below is correct and that I'm not missing
« > something?
« > I use a very simple case with only a single random effect, and no
« > fixed effects, because I guess that adding fixed effects or other
« > random effects does not change the idea, it "just" makes formulas more
« > complex. I note i the random effect level, let's say « patient », and
« > j the observation for this patient.
« > In the linear model, we have Y(i,j) = µ0 + Z(i) + epsilon(i,j) with
« > Z(i) and epsilon(i,j) random variables, each with a probability
« > density, mutually independent, and each iid.
« > Hence, cov( Y(i,j), Y(i,j') ) = Var( Z(i) ): the model introduces a
« > positive correlation between observations of the same patient.
« > In the Bernoulli model, Y(i,j) ~ B( pi(i,j) ) and pi(i,j) = f( Z(i) ),
« > f being the inverse link function, typically the reciprocal of the
« > logit. So we have
« > cov( Y(i,j), Y(i,j') ) = E( Y(i,j) Y(i, j') ) - E( Y(i,j) ) E( Y(i,j') )
« > = Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) - pi(i,j) * pi(i,j')
« > Since in practice pi(i,j) does not depend on j, pi(i,j) = pi(i,j').
« > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
« > integral(R) Pr( Y(i,j) = 1 inter Y(i,j') = 1 | Z(i) = z ) p( Z(i) = z ) dz
« > Then, we assume that conditionally on Z(i), the Y(i,j) are independent, is
« > this right? This is the equivalent of « the epsilon(i,j) are
« > independent »? I assume this hypothesis is also used for computing the
« > likelihood? If not, what is the model for the joint probability?
« > In that case,
« > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
« > integral(R) f(z) f(z) p( Z(i) = z ) dz
« > and since pi(i,j) = integral( R ) f(z) p( Z(i) = z ) dz we have
« > cov( Y(i,j), Y(i,j') ) =
« > integral( R ) f²(z) p( Z(i) = z ) dz -
« > ( integral( R ) f(z) p( Z(i) = z ) dz )²
« > which in general has no reason to be zero, hence the two observations
« > are correlated. Is this correct?
« > Is there any way to have a closed-form of the covariance, for usual f
« > (let's say, logit or probit) and Z distribution (let's say, Gaussian)?
« > Thanks a lot for reading, and your answers,
« > --
« > Emmanuel CURIS
« > emmanuel.curis at parisdescartes.fr
« > Page WWW: http://emmanuel.curis.online.fr/index.html
« > _______________________________________________
« > R-sig-mixed-models at r-project.org mailing list
« > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
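The derivation above can be verified numerically, and the final expression answers the closed-form question in part: it equals Var(f(Z)), so the covariance is non-negative whenever pi(i,j) does not depend on j. A short Python sketch (an illustration, not from the thread), assuming a logit link and Z ~ N(0, 1), comparing the quadrature value of the covariance integral with a direct simulation of two observations per "patient":

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 1.0
f = lambda z: 1.0 / (1.0 + np.exp(-z))   # inverse logit

# Simulation: one random effect Z per patient, two Bernoulli draws each
z = rng.normal(0.0, sigma, size=500_000)
p = f(z)
y1 = (rng.random(z.size) < p).astype(float)
y2 = (rng.random(z.size) < p).astype(float)
sim_cov = np.cov(y1, y2)[0, 1]

# Quadrature: cov = integral f(z)^2 p(z) dz - (integral f(z) p(z) dz)^2
nodes, weights = np.polynomial.hermite_e.hermegauss(60)
w = weights / weights.sum()
fz = f(sigma * nodes)
quad_cov = np.sum(w * fz**2) - np.sum(w * fz)**2

print(f"simulated {sim_cov:.5f} vs quadrature {quad_cov:.5f}")
```

The two values agree to simulation error, consistent with the conditional-independence assumption used in both the derivation and the likelihood.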