[R-sig-ME] Precision about the glmer model for Bernoulli variables

Vaida, Florin |v@|d@ @end|ng |rom he@|th@uc@d@edu
Mon Apr 20 23:05:40 CEST 2020


Hi Emmanuel,

That's a good question.  My guess is that the correlation is non-negative generally, but I wasn't able to prove it theoretically even in the simplest case when Y1, Y2 ~ Bernoulli(u) independent conditionally on u, and u ~ Normal(0, 1).  I am curious if someone has a solution.
We can't go too far down this route in this forum, since Doug wants to keep it applied.

Florin

> On Apr 20, 2020, at 12:32 PM, Emmanuel Curis <emmanuel.curis using parisdescartes.fr> wrote:
> 
> Hi Florin,
> 
> Thanks for the answer, the precision about p(i,j), and the reference.
> 
> A last question, that I forgot in my message: is the obtained
> correlation also always positive, as in the linear case? Or may some
> negative correlation appear, depending on the values of pi(i,j) and
> pi(i,j')?
> 
> Best regards,
> 
> On Mon, Apr 20, 2020 at 03:27:39PM +0000, Vaida, Florin wrote:
> « Hi Emmanuel,
> « 
> « Your reasoning is correct.
> « 
> « As a quibble, outside a simple repeated measures experiment setup, p(i,j) *does* depend on j.
> « For example, if observations are collected over time, generally there is a time effect; if they are repeated measures with different experimental conditions, p(i,j) will depend on the condition j, etc.
> « 
> « There is almost certainly no closed form solution for the covariance under logit.
> « I am not sure about the probit (my guess is not).
> « There will be some Laplace approximations available, a la Breslow and Clayton 1993.
> « 
> « I'd be curious if these formulas/approximations were developed somewhere - I'd be surprised if they weren't.
> « 
> « Florin
> « 
> « 
> « > On Apr 20, 2020, at 12:48 AM, Emmanuel Curis <emmanuel.curis using parisdescartes.fr> wrote:
> « > 
> « > Hello everyone,
> « > 
> « > I hope you're all going fine in these difficult times.
> « > 
> « > I tried to understand in details the exact model used when using glmer
> « > for a Bernoulli experiment, by comparison with the linear mixed
> « > effects model, and especially how it introducts correlations between
> « > observations of a given group.  I think I finally got it, but could
> « > you check that what I write below is correct and that I'm not missing
> « > something?
> « > 
> « > I use a very simple case with only a single random effect, and no
> « > fixed effects, because I guess that adding fixed effects or other
> « > random effects does not change the idea, it "just" makes formulas more
> « > complex.  I note i the random effect level, let's say « patient », and
> « > j the observation for this patient.
> « > 
> « > In the linear model, we have Y(i,j) = µ0 + Z(i) + epsilon( i, j ) with
> « > Z(i) and epsilon(i,j) randoms variables having a density of
> « > probability, independant, and each iid.
> « > 
> « > Hence, cov( Y(i,j), Y(i,j') ) = Var( Z(i) ): the model introduces a
> « > positive correlation between observations of the same patient.
> « > 
> « > 
> « > 
> « > In the Bernoulli model, Y(i,j) ~ B( pi(i,j) ) and pi(i,j) = f( Z(i) ),
> « > f being the inverse link function, typically the reciprocal of the
> « > logit. So we have
> « > 
> « > cov( Y(i,j), Y(i,j') ) = E( Y(i,j) Y(i, j') ) - E( Y(i,j) ) E( Y(i,j') )
> « >     = Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) - pi(i,j) * pi(i,j')
> « > 
> « > Since in practice pi(i,j) does not depend on j, pi(i,j) = pi(i,j').
> « > 
> « > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
> « >  integral(R) Pr( Y(i,j) = 1 inter Y(i,j') = 1 | Z(i) = z ) p( Z(i) = z ) dz
> « > 
> « > Then, we assume that conditionnally on Zi, the Yij are independant, is
> « > this right? This is the equivalent of « the epsilon(i, j) are
> « > independant »? I assume this hypothesis is also used for computing the
> « > likelihood? If not, what is the model for the joint probability?
> « > 
> « > In that case,
> « > 
> « > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
> « >  integral(R) f(z) f(z) p( Z(i) = z ) dz
> « > 
> « > and since pi(i,j) = integral( R ) f(z) p( Z(i) = z ) dz we have
> « > 
> « > cov( Y(i,j), Y(i,j') ) =
> « > integral( R ) f²(z) p( Z(i) = z ) dz -
> « >  ( integral( R ) f(z) p( Z(i) = z ) dz )²
> « > 
> « > which in general has no reason to be nul, hence the two observations
> « > are correlated. Is this correct?
> « > 
> « > Is there any way to have a closed-form of the covariance, for usual f
> « > (let's say, logit or probit) and Z distribution (let's say, Gaussian)?
> « > 
> « > Thanks a lot for reading, and your answers,
> « > 
> « > -- 
> « >                                Emmanuel CURIS
> « >                                emmanuel.curis using parisdescartes.fr
> « > 
> « > Page WWW: http://emmanuel.curis.online.fr/index.html
> « > 
> « > _______________________________________________
> « > R-sig-mixed-models using r-project.org mailing list
> « > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> « 
> 
> -- 
>                                Emmanuel CURIS
>                                emmanuel.curis using parisdescartes.fr
> 
> Page WWW: http://emmanuel.curis.online.fr/index.html



More information about the R-sig-mixed-models mailing list