[R-sig-ME] Precision about the glmer model for Bernoulli variables
David Duffy
D@v|d@Du||y @end|ng |rom q|mrbergho|er@edu@@u
Tue Apr 21 09:18:41 CEST 2020
You'd know the beta-binomial constrains the range of the correlation coefficient positive, but the logit-normal and probit-normal allow a negative correlation, with a lower bound constrained by the N, in the case of regular sized clusters. For your example of 2x2 tables, the correlation can go from -1 to 1. This is the usual genetics
type setup where you model the pairwise correlations for all pairs of observations in the dataset versus their
(pairwise) genetic relatedness. In multi-trait models, the correlations between different phenotypes (say asthma and diabetes) in the same individual can be negative, so you can estimate negative variance components.
The pedigreemm package extends lme4 to allow glmms for pedigree data (see the mastitis example for a binary trait).
Cheers, David Duffy.
________________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on behalf of Vaida, Florin <fvaida using health.ucsd.edu>
Sent: Tuesday, 21 April 2020 7:05:40 AM
To: Emmanuel Curis
Cc: r-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Precision about the glmer model for Bernoulli variables
Hi Emmanuel,
That's a good question. My guess is that the correlation is non-negative generally, but I wasn't able to prove it theoretically even in the simplest case when Y1, Y2 ~ Bernoulli(u) independent conditionally on u, and u ~ Normal(0, 1). I am curious if someone has a solution.
We can't go too far down this route in this forum, since Doug wants to keep it applied.
Florin
> On Apr 20, 2020, at 12:32 PM, Emmanuel Curis <emmanuel.curis using parisdescartes.fr> wrote:
>
> Hi Florin,
>
> Thanks for the answer, the precision about p(i,j), and the reference.
>
> A last question, that I forgot in my message: is the obtained
> correlation also always positive, as in the linear case? Or may some
> negative correlation appear, depending on the values of pi(i,j) and
> pi(i,j')?
>
> Best regards,
>
> On Mon, Apr 20, 2020 at 03:27:39PM +0000, Vaida, Florin wrote:
> « Hi Emmanuel,
> «
> « Your reasoning is correct.
> «
> « As a quibble, outside a simple repeated measures experiment setup, p(i,j) *does* depend on j.
> « For example, if observations are collected over time, generally there is a time effect; if they are repeated measures with different experimental conditions, p(i,j) will depend on the condition j, etc.
> «
> « There is almost certainly no closed form solution for the covariance under logit.
> « I am not sure about the probit (my guess is not).
> « There will be some Laplace approximations available, a la Breslow and Clayton 1993.
> «
> « I'd be curious if these formulas/approximations were developed somewhere - I'd be surprised if they weren't.
> «
> « Florin
> «
> «
> « > On Apr 20, 2020, at 12:48 AM, Emmanuel Curis <emmanuel.curis using parisdescartes.fr> wrote:
> « >
> « > Hello everyone,
> « >
> « > I hope you're all going fine in these difficult times.
> « >
> « > I tried to understand in details the exact model used when using glmer
> « > for a Bernoulli experiment, by comparison with the linear mixed
> « > effects model, and especially how it introducts correlations between
> « > observations of a given group. I think I finally got it, but could
> « > you check that what I write below is correct and that I'm not missing
> « > something?
> « >
> « > I use a very simple case with only a single random effect, and no
> « > fixed effects, because I guess that adding fixed effects or other
> « > random effects does not change the idea, it "just" makes formulas more
> « > complex. I note i the random effect level, let's say « patient », and
> « > j the observation for this patient.
> « >
> « > In the linear model, we have Y(i,j) = µ0 + Z(i) + epsilon( i, j ) with
> « > Z(i) and epsilon(i,j) randoms variables having a density of
> « > probability, independant, and each iid.
> « >
> « > Hence, cov( Y(i,j), Y(i,j') ) = Var( Z(i) ): the model introduces a
> « > positive correlation between observations of the same patient.
> « >
> « >
> « >
> « > In the Bernoulli model, Y(i,j) ~ B( pi(i,j) ) and pi(i,j) = f( Z(i) ),
> « > f being the inverse link function, typically the reciprocal of the
> « > logit. So we have
> « >
> « > cov( Y(i,j), Y(i,j') ) = E( Y(i,j) Y(i, j') ) - E( Y(i,j) ) E( Y(i,j') )
> « > = Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) - pi(i,j) * pi(i,j')
> « >
> « > Since in practice pi(i,j) does not depend on j, pi(i,j) = pi(i,j').
> « >
> « > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
> « > integral(R) Pr( Y(i,j) = 1 inter Y(i,j') = 1 | Z(i) = z ) p( Z(i) = z ) dz
> « >
> « > Then, we assume that conditionnally on Zi, the Yij are independant, is
> « > this right? This is the equivalent of « the epsilon(i, j) are
> « > independant »? I assume this hypothesis is also used for computing the
> « > likelihood? If not, what is the model for the joint probability?
> « >
> « > In that case,
> « >
> « > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
> « > integral(R) f(z) f(z) p( Z(i) = z ) dz
> « >
> « > and since pi(i,j) = integral( R ) f(z) p( Z(i) = z ) dz we have
> « >
> « > cov( Y(i,j), Y(i,j') ) =
> « > integral( R ) f²(z) p( Z(i) = z ) dz -
> « > ( integral( R ) f(z) p( Z(i) = z ) dz )²
> « >
> « > which in general has no reason to be nul, hence the two observations
> « > are correlated. Is this correct?
> « >
> « > Is there any way to have a closed-form of the covariance, for usual f
> « > (let's say, logit or probit) and Z distribution (let's say, Gaussian)?
> « >
> « > Thanks a lot for reading, and your answers,
> « >
> « > --
> « > Emmanuel CURIS
> « > emmanuel.curis using parisdescartes.fr
> « >
> « > Page WWW: http://emmanuel.curis.online.fr/index.html
> « >
> « > _______________________________________________
> « > R-sig-mixed-models using r-project.org mailing list
> « > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> «
>
> --
> Emmanuel CURIS
> emmanuel.curis using parisdescartes.fr
>
> Page WWW: http://emmanuel.curis.online.fr/index.html
_______________________________________________
R-sig-mixed-models using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[EXTERNAL EMAIL] This message originates from an external email address, please exercise caution when clicking any links or opening attachments. If you believe the sender is impersonating someone at QIMR Berghofer, please forward this message to phishing using qimrberghofer.edu.au.
More information about the R-sig-mixed-models
mailing list