[R-sig-ME] Precision about the glmer model for Bernoulli variables
Emmanuel Curis
emm@nue|@cur|@ @end|ng |rom p@r|@de@c@rte@@|r
Mon Apr 20 09:48:51 CEST 2020
Hello everyone,
I hope you're all doing fine in these difficult times.
I have been trying to understand in detail the exact model that glmer
fits for a Bernoulli experiment, by comparison with the linear mixed
effects model, and especially how it introduces correlations between
observations of a given group. I think I finally got it, but could
you check that what I write below is correct and that I'm not missing
something?
I use a very simple case with a single random effect and no
fixed effects, because I guess that adding fixed effects or other
random effects does not change the idea, it "just" makes the formulas
more complex. I denote by i the level of the random effect, let's say
the "patient", and by j the observation for this patient.
In the linear model, we have Y(i,j) = µ0 + Z(i) + epsilon(i,j), with
Z(i) and epsilon(i,j) random variables having a probability density,
independent of each other, and each iid.
Hence, for j ≠ j', cov( Y(i,j), Y(i,j') ) = Var( Z(i) ): the model
introduces a positive correlation between observations of the same patient.
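
As a quick sanity check of this covariance, here is a minimal simulation
sketch in Python (NumPy). The values of µ0 and of the two standard
deviations, and the choice of Gaussian distributions, are arbitrary
assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

n_patients = 200_000   # number of groups i (large, to limit Monte Carlo noise)
mu0 = 1.0              # intercept (assumed value)
sd_z = 1.5             # standard deviation of Z(i) (assumed value)
sd_eps = 0.8           # standard deviation of epsilon(i,j) (assumed value)

z = rng.normal(0.0, sd_z, size=n_patients)            # one Z(i) per patient
eps = rng.normal(0.0, sd_eps, size=(n_patients, 2))   # two observations j and j'
y = mu0 + z[:, None] + eps                            # Y(i,j) = mu0 + Z(i) + epsilon(i,j)

# Empirical covariance between the two observations of the same patient
emp_cov = np.cov(y[:, 0], y[:, 1])[0, 1]
print("empirical cov:", emp_cov, " Var(Z):", sd_z**2)
```

The two printed numbers should be close to each other, up to simulation
noise.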
In the Bernoulli model, Y(i,j) | Z(i) ~ B( pi(i,j) ) and pi(i,j) = f( Z(i) ),
f being the inverse link function, typically the inverse of the
logit. So we have
cov( Y(i,j), Y(i,j') ) = E( Y(i,j) Y(i,j') ) - E( Y(i,j) ) E( Y(i,j') )
= Pr( Y(i,j) = 1 and Y(i,j') = 1 ) - Pr( Y(i,j) = 1 ) * Pr( Y(i,j') = 1 )
Since here pi(i,j) = f( Z(i) ) does not depend on j, pi(i,j) = pi(i,j').
Pr( Y(i,j) = 1 and Y(i,j') = 1 ) =
integral(R) Pr( Y(i,j) = 1 and Y(i,j') = 1 | Z(i) = z ) p( Z(i) = z ) dz
Then, we assume that conditionally on Z(i), the Y(i,j) are independent,
is this right? Is this the equivalent of "the epsilon(i,j) are
independent"? I assume this hypothesis is also used for computing the
likelihood? If not, what is the model for the joint probability?
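
To make the question concrete: under this conditional-independence
assumption, the contribution of patient i to the likelihood would be the
integral over z of the product over j of f(z)^y(i,j) (1 - f(z))^(1 - y(i,j)),
weighted by the density of Z(i). Below is a minimal sketch of that
integral, assuming a Gaussian Z(i) with mean 0 and a logit link. The
function names and the number of quadrature nodes are my own choices. It
uses plain (non-adaptive) Gauss-Hermite quadrature for illustration and is
not meant to reproduce what glmer does internally.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-x))

def group_likelihood(y, sd_z, n_nodes=40):
    """Marginal probability of the 0/1 vector y of one patient,
    assuming Z ~ N(0, sd_z^2) and independence of the y's given Z."""
    y = np.asarray(y)
    # Gauss-Hermite rule for the weight exp(-x^2):
    # E[g(Z)] ~ sum_k w_k g(sqrt(2) * sd_z * x_k) / sqrt(pi)
    x, w = hermgauss(n_nodes)
    z = np.sqrt(2.0) * sd_z * x
    p = inv_logit(z)                          # f(z) at each node
    k = y.sum()                               # number of successes
    cond = p**k * (1.0 - p)**(len(y) - k)     # product over j, given Z = z
    return float(np.sum(w * cond) / np.sqrt(np.pi))

print(group_likelihood([1, 0, 1, 1], sd_z=1.0))
```

If the assumption is right, summing this quantity over all possible 0/1
vectors of a given length should give 1, which is an easy check to run.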
In that case,
Pr( Y(i,j) = 1 and Y(i,j') = 1 ) =
integral(R) f(z) f(z) p( Z(i) = z ) dz
and since Pr( Y(i,j) = 1 ) = E( f( Z(i) ) ) =
integral(R) f(z) p( Z(i) = z ) dz, we have
cov( Y(i,j), Y(i,j') ) =
integral(R) f²(z) p( Z(i) = z ) dz -
( integral(R) f(z) p( Z(i) = z ) dz )²
which in general has no reason to be zero, hence the two observations
are correlated. Is this correct?
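
The last expression is E( f(Z(i))² ) - ( E( f(Z(i)) ) )², that is
Var( f(Z(i)) ), so it can be evaluated numerically. Here is a minimal
sketch comparing a quadrature evaluation with a direct simulation,
assuming a logit link and Z(i) ~ N(0, sd_z²). The value of sd_z, the
function names and the sample size are arbitrary assumptions.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-x))

def bernoulli_cov(f, sd_z, n_nodes=60):
    """E[f(Z)^2] - E[f(Z)]^2 for Z ~ N(0, sd_z^2), by Gauss-Hermite quadrature."""
    x, w = hermgauss(n_nodes)
    w = w / np.sqrt(np.pi)                 # weights now sum to 1
    fz = f(np.sqrt(2.0) * sd_z * x)
    m1 = np.sum(w * fz)                    # Pr(Y = 1)
    m2 = np.sum(w * fz**2)                 # Pr(Y = 1 and Y' = 1)
    return float(m2 - m1**2)

sd_z = 1.0                                 # assumed value
cov_quad = bernoulli_cov(inv_logit, sd_z)

# Direct simulation: two Bernoulli draws per patient, sharing the same Z(i)
rng = np.random.default_rng(1)
n = 400_000
p = inv_logit(rng.normal(0.0, sd_z, size=n))
y1 = rng.random(n) < p
y2 = rng.random(n) < p
cov_sim = np.mean(y1 & y2) - np.mean(y1) * np.mean(y2)

print("quadrature:", cov_quad, " simulation:", cov_sim)
```

Both values should agree up to simulation noise, and both should be
strictly positive as soon as sd_z > 0.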
Is there any way to get a closed-form expression for the covariance, for
the usual links (let's say, logit or probit) and Z distribution (let's
say, Gaussian)?
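
For the probit case without intercept, pi(i,j) = Phi( Z(i) ) with
Z(i) ~ N(0, sigma²), a candidate closed form follows from writing
Phi(z) = Pr( U <= z ) with U standard Gaussian: E( Phi(Z)² ) is then an
orthant probability for two Gaussian variables with correlation
rho = sigma² / (1 + sigma²), which gives
cov = arcsin( rho ) / (2 pi). The sketch below checks this against a
brute-force numerical integration on a fine grid. The function names and
the grid settings are my own assumptions. For the logit link I am not
aware of a comparable elementary formula.

```python
import math
import numpy as np

def probit_cov_closed_form(sd_z):
    """arcsin(rho) / (2 pi) with rho = sd_z^2 / (1 + sd_z^2)."""
    rho = sd_z**2 / (1.0 + sd_z**2)
    return math.asin(rho) / (2.0 * math.pi)

def probit_cov_numeric(sd_z, n_points=20_001):
    """Var(Phi(Z)) for Z ~ N(0, sd_z^2), by a Riemann sum on a fine grid."""
    z = np.linspace(-10.0 * sd_z, 10.0 * sd_z, n_points)
    dz = z[1] - z[0]
    dens = np.exp(-0.5 * (z / sd_z) ** 2) / (sd_z * math.sqrt(2.0 * math.pi))
    # Standard Gaussian CDF evaluated at each grid point
    f = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    m1 = np.sum(f * dens) * dz       # E[Phi(Z)]
    m2 = np.sum(f**2 * dens) * dz    # E[Phi(Z)^2]
    return float(m2 - m1**2)

for sd in (0.5, 1.0, 2.0):
    print(sd, probit_cov_closed_form(sd), probit_cov_numeric(sd))
```

A useful special case: for sigma = 1, Phi(Z) is uniform on (0, 1), whose
variance is 1/12, and the formula indeed gives arcsin(1/2) / (2 pi) = 1/12.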
Thanks a lot for reading, and for your answers,
--
Emmanuel CURIS
emmanuel.curis using parisdescartes.fr
Web page: http://emmanuel.curis.online.fr/index.html