[R-sig-ME] Precision about the glmer model for Bernoulli variables
Vaida, Florin
fvaida at health.ucsd.edu
Tue Apr 21 21:09:47 CEST 2020
Hi John,
Thanks for the Prentice reference. However, I just want to point out that the extended beta-binomial is proposed there as a marginal distribution for the observations. It could not arise from a mixed-effects model specification.
This is exactly analogous to the situation in the linear/normal case: a linear mixed-effects random-intercept model reduces to a general linear model with constant, positive within-subject correlation. Vice versa, one can start from a marginal specification with constant within-subject correlation; this is possible even with negative correlations (within a certain range), but negative correlations do not correspond to any mixed-effects model specification. (I thought Geert Molenberghs had a paper on this point, but I can't find it now.)
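The range involved can be illustrated numerically. The sketch below (my own, not from the thread; pure Python, helper names hypothetical) checks that an exchangeable correlation matrix for a cluster of size n is a valid, positive-definite correlation matrix for rho slightly above -1/(n-1) but not below it, whereas a random intercept can only ever produce rho >= 0.

```python
import math

def cholesky(A):
    """Plain Cholesky; returns None when A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0.0:
                    return None       # nonpositive pivot: not a valid correlation matrix
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def exchangeable(n, rho):
    """Correlation matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

n = 4                                 # cluster size; marginal bound is -1/(n-1) = -1/3
print(cholesky(exchangeable(n, -0.30)) is not None)  # True: valid marginal model
print(cholesky(exchangeable(n, -0.40)) is not None)  # False: below -1/(n-1)
# A random intercept can only give rho = s_b^2 / (s_b^2 + s_e^2), i.e. rho >= 0.
```

So the marginal family is strictly larger than what any random-intercept specification can reach, which is the point above.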
Florin
On Apr 21, 2020, at 1:15 AM, John Maindonald <john.maindonald using anu.edu.au> wrote:
Apologies, the lower limit for the BB variance is pi(1-pi)rho.
Actually, with a suitable parameterization, the beta-binomial allows
small negative correlations. See
R. L. Prentice (1986), Binary Regression Using an Extended Beta-Binomial
Distribution, With Discussion of Correlation Induced by Covariate
Measurement Errors, Journal of the American Statistical Association,
Vol. 81, No. 394, pp. 321-327.
A feature of the beta-binomial (BB), which ought to be better advertised
than it is in most accounts that I have seen, is that, for positive
correlation rho, it sets a strict lower limit of pi(1-pi)rho on the
variance of the estimate of the proportion pi. This is in marked contrast
to the variance assumptions that define quasibinomial errors.
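In symbols, the usual beta-binomial variance of the observed proportion p_hat = X/n with mean pi and intra-cluster correlation rho is pi(1-pi)[1 + (n-1)rho]/n, which for positive rho stays strictly above pi(1-pi)rho for every finite n and approaches that floor as n grows, while the quasibinomial variance phi*pi(1-pi)/n shrinks to zero. A quick numerical check (my own sketch; the helper name is made up):

```python
# Beta-binomial variance of the observed proportion p_hat = X/n,
# with mean pi and intra-cluster correlation rho:
#   Var(p_hat) = pi*(1 - pi) * (1 + (n - 1)*rho) / n

def bb_var_phat(pi, rho, n):          # hypothetical helper name
    return pi * (1 - pi) * (1 + (n - 1) * rho) / n

pi, rho = 0.3, 0.1
floor = pi * (1 - pi) * rho           # the strict lower limit pi*(1-pi)*rho
for n in (10, 100, 1000, 10_000):
    v = bb_var_phat(pi, rho, n)
    assert v > floor                  # holds for every finite n when 0 < rho < 1
    print(n, round(v, 6))
```

The gap above the floor is (1-rho)*pi*(1-pi)/n, so no amount of extra sampling within a cluster drives the variance below pi(1-pi)rho.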
The gamlss package implements the double binomial (as well
as the logistic normal and BB), but as far as I can tell, not in a
multi-level model context. The double binomial does allow a
negative correlation.
John Maindonald email: john.maindonald using anu.edu.au
On 21/04/2020, at 19:18, David Duffy <David.Duffy using qimrberghofer.edu.au> wrote:
You'd know the beta-binomial constrains the correlation coefficient to be positive, but the logit-normal and probit-normal allow a negative correlation, with a lower bound constrained by the cluster size N in the case of equal-sized clusters. For your example of 2x2 tables, the correlation can go from -1 to 1. This is the usual genetics-type setup, where you model the pairwise correlations for all pairs of observations in the dataset against their (pairwise) genetic relatedness. In multi-trait models, the correlations between different phenotypes (say, asthma and diabetes) in the same individual can be negative, so you can estimate negative variance components.
The pedigreemm package extends lme4 to allow GLMMs for pedigree data (see the mastitis example for a binary trait).
Cheers, David Duffy.
________________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on behalf of Vaida, Florin <fvaida using health.ucsd.edu>
Sent: Tuesday, 21 April 2020 7:05:40 AM
To: Emmanuel Curis
Cc: r-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Precision about the glmer model for Bernoulli variables
Hi Emmanuel,
That's a good question. My guess is that the correlation is non-negative in general, but I wasn't able to prove it theoretically even in the simplest case, where Y1, Y2 ~ Bernoulli(expit(u)) are independent conditionally on u, and u ~ Normal(0, 1). I am curious whether someone has a solution.
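For what it's worth, the conjecture is easy to probe by simulation. Below is a Monte Carlo sketch of exactly that simplest case (my own, not from the thread; logit link, seed, and sample size are arbitrary choices):

```python
import math
import random

random.seed(1)                        # arbitrary seed for reproducibility

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Draw pairs (Y1, Y2) that are conditionally independent Bernoulli(expit(u))
# given a shared random intercept u ~ Normal(0, 1), then estimate cov(Y1, Y2).
N = 200_000
y1, y2 = [], []
for _ in range(N):
    p = expit(random.gauss(0.0, 1.0))
    y1.append(1 if random.random() < p else 0)
    y2.append(1 if random.random() < p else 0)

m1 = sum(y1) / N
m2 = sum(y2) / N
cov = sum(a * b for a, b in zip(y1, y2)) / N - m1 * m2
# Under conditional independence, cov(Y1, Y2) = Var(E[Y|u]) = Var(expit(u)).
print(round(cov, 3))
```

With this seed the estimated covariance comes out positive, consistent with the guess; swapping in a probit or cloglog link is a one-line change to expit.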
We can't go too far down this route in this forum, since Doug wants to keep it applied.
Florin
On Apr 20, 2020, at 12:32 PM, Emmanuel Curis <emmanuel.curis using parisdescartes.fr> wrote:
Hi Florin,
Thanks for the answer, the clarification about p(i,j), and the reference.
One last question, which I forgot in my message: is the resulting
correlation also always positive, as in the linear case? Or can a
negative correlation appear, depending on the values of pi(i,j) and
pi(i,j')?
Best regards,
On Mon, Apr 20, 2020 at 03:27:39PM +0000, Vaida, Florin wrote:
« Hi Emmanuel,
«
« Your reasoning is correct.
«
« As a quibble, outside a simple repeated measures experiment setup, p(i,j) *does* depend on j.
« For example, if observations are collected over time, generally there is a time effect; if they are repeated measures with different experimental conditions, p(i,j) will depend on the condition j, etc.
«
« There is almost certainly no closed-form solution for the covariance under the logit link.
« I am not sure about the probit (my guess is not).
« There will be some Laplace approximations available, a la Breslow and Clayton 1993.
«
« I'd be curious if these formulas/approximations were developed somewhere - I'd be surprised if they weren't.
«
« Florin
«
«
« > On Apr 20, 2020, at 12:48 AM, Emmanuel Curis <emmanuel.curis using parisdescartes.fr> wrote:
« >
« > Hello everyone,
« >
« > I hope you're all going fine in these difficult times.
« >
« > I tried to understand in detail the exact model used when fitting glmer
« > to a Bernoulli experiment, by comparison with the linear mixed-effects
« > model, and especially how it introduces correlations between
« > observations of a given group. I think I finally got it, but could
« > you check that what I write below is correct and that I'm not missing
« > something?
« >
« > I use a very simple case with only a single random effect and no
« > fixed effects, because I guess that adding fixed effects or other
« > random effects does not change the idea, it "just" makes the formulas
« > more complex. I write i for the random-effect level, let's say
« > « patient », and j for the observation within this patient.
« >
« > In the linear model, we have Y(i,j) = µ0 + Z(i) + epsilon(i,j), with
« > Z(i) and epsilon(i,j) random variables having a probability density,
« > independent of each other, and each iid.
« >
« > Hence, cov( Y(i,j), Y(i,j') ) = Var( Z(i) ): the model introduces a
« > positive correlation between observations of the same patient.
« >
« >
« >
« > In the Bernoulli model, Y(i,j) ~ B( pi(i,j) ) and pi(i,j) = f( Z(i) ),
« > f being the inverse link function, typically the inverse of the
« > logit. So we have
« >
« > cov( Y(i,j), Y(i,j') ) = E( Y(i,j) Y(i, j') ) - E( Y(i,j) ) E( Y(i,j') )
« > = Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) - pi(i,j) * pi(i,j')
« >
« > Since in practice pi(i,j) does not depend on j, pi(i,j) = pi(i,j').
« >
« > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
« > integral(R) Pr( Y(i,j) = 1 inter Y(i,j') = 1 | Z(i) = z ) p( Z(i) = z ) dz
« >
« > Then, we assume that conditionally on Z(i), the Y(i,j) are independent;
« > is this right? This is the equivalent of « the epsilon(i,j) are
« > independent »? I assume this hypothesis is also used when computing the
« > likelihood? If not, what is the model for the joint probability?
« >
« > In that case,
« >
« > Pr( Y(i,j) = 1 inter Y(i,j') = 1 ) =
« > integral(R) f(z) f(z) p( Z(i) = z ) dz
« >
« > and since pi(i,j) = integral( R ) f(z) p( Z(i) = z ) dz we have
« >
« > cov( Y(i,j), Y(i,j') ) =
« > integral( R ) f²(z) p( Z(i) = z ) dz -
« > ( integral( R ) f(z) p( Z(i) = z ) dz )²
« >
« > which in general has no reason to be zero, hence the two observations
« > are correlated. Is this correct?
« >
« > Is there any way to have a closed-form of the covariance, for usual f
« > (let's say, logit or probit) and Z distribution (let's say, Gaussian)?
« >
« > Thanks a lot for reading, and your answers,
« >
« > --
« > Emmanuel CURIS
« > emmanuel.curis using parisdescartes.fr
« >
« > Page WWW: http://emmanuel.curis.online.fr/index.html
« >
« > _______________________________________________
« > R-sig-mixed-models using r-project.org mailing list
« > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
«
--
Emmanuel CURIS
emmanuel.curis using parisdescartes.fr
Page WWW: http://emmanuel.curis.online.fr/index.html