[R-sig-ME] Mixed-model-binary logistic model with dependence between individual repeated measures
David Duffy
davidD at qimr.edu.au
Sat Jan 8 00:19:19 CET 2011
On Fri, 7 Jan 2011, Martin Maechler wrote:
>>>>>> Ben Bolker <bbolker at gmail.com>
>>>>>> on Fri, 07 Jan 2011 11:49:31 -0500 writes:
>
> > On 11-01-07 11:35 AM, Anna Ekman wrote:
> >> Ben Bolker, thank you for your suggestions.
> >>
> >> Yes, it is surprising that in SAS and STATA I have to assume
> >> independence between the measurements within an individual.
>
> > It's fundamentally a bit hard to specify correlation among individuals
> > in a non-normal model. One option is to go completely to the marginal
> > specification (which you said you don't want to do); probably the most
> > sensible statistical formulation is
>
> > (fixed effects) eta0 = X*beta
> > (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such
> > as AR(1) within individuals))
> > y ~ Bernoulli(eta1)
>
> Interesting... {I've been "taught" in the past that correlation
> specification for non-normal, i.e. GLME models,
> would not make sense / be possible,
> something you do not seem to support ...
> }
>
> Does the above mean {slight changes}
>
> (fixed effects) eta0 = X*beta
> (random effects) eta1 ~ MVN(0, Sigma=(something sensible such
> as AR(1) within individuals))
> (Y | X,eta1) ~ Bernoulli( logit^-1(eta0 + eta1) )
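
As a rough illustration of the model quoted above, here is a minimal simulation sketch for one individual. It assumes a stationary AR(1) process for eta1 and takes the success probability to be the inverse logit of eta0 + eta1. The function name `simulate_subject` and the parameters `rho` (lag-1 correlation) and `sigma` (marginal SD of eta1) are hypothetical and do not come from any package.

```python
import math
import random

def simulate_subject(x_beta, rho, sigma, rng):
    """Simulate binary repeated measures for one individual.

    x_beta : fixed-effect linear predictors eta0, one per occasion
    rho    : assumed AR(1) correlation between successive eta1 values
    sigma  : assumed marginal standard deviation of eta1
    """
    # Innovation SD chosen so that eta1 keeps variance sigma^2 at every occasion.
    innov_sd = sigma * math.sqrt(1.0 - rho * rho)
    eta1 = rng.gauss(0.0, sigma)
    ys = []
    for t, eta0 in enumerate(x_beta):
        if t > 0:
            eta1 = rho * eta1 + rng.gauss(0.0, innov_sd)
        # Inverse logit maps the linear predictor to a probability.
        p = 1.0 / (1.0 + math.exp(-(eta0 + eta1)))
        ys.append(1 if rng.random() < p else 0)
    return ys

rng = random.Random(42)
print(simulate_subject([0.0, 0.2, 0.4, 0.6, 0.8], rho=0.6, sigma=1.0, rng=rng))
```

Fitting such a model is a separate and harder problem, because the likelihood integrates over the whole eta1 vector of each individual. The sketch only shows the data-generating side.
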
With the probit link, such mixed models for dichotomous and ordinal
variables have a long history in genetics and psychometrics. In the latter
case, factor analysis and path analysis of tetrachoric/polychoric
correlations are completely equivalent to the probit-normal model, although
GLS/WLS was often used for computational reasons. We used to do all this in
LISREL. For varying numbers of observations per individual (and other
irregular data types), you can use the "multiple groups" approach, where
you specify a covariance matrix of the right size for each pattern of
data, and constrain the correlations to be equal across the groups.
Since the main interest is in the correlations between latent variables,
all hypotheses and estimates are usually framed at that "level" of the
model.
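
To make the link between a tetrachoric correlation and the probit-normal model concrete, here is a small sketch for the special case where both binary variables are latent standard normals cut at zero. In that case the probability that both exceed the threshold is 1/4 + arcsin(rho)/(2*pi), which can be inverted in closed form. The function names are hypothetical, and general thresholds would need a numerical bivariate normal integral instead.

```python
import math
import random

def tetrachoric_median_split(p_both):
    """Latent correlation implied by P(both above 0) when both thresholds are 0."""
    return math.sin(2.0 * math.pi * (p_both - 0.25))

def simulate_p_both(rho, n, rng):
    """Fraction of n latent bivariate-normal pairs with both members above 0."""
    both = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Build y so that corr(x, y) = rho on the latent scale.
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        if x > 0.0 and y > 0.0:
            both += 1
    return both / n

rng = random.Random(2011)
p_hat = simulate_p_both(0.5, 20000, rng)
rho_hat = tetrachoric_median_split(p_hat)
print(round(p_hat, 3), round(rho_hat, 3))
```

The recovered value should be close to the latent correlation used in the simulation, although only the dichotomized data enter the estimate.
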
In the genetic situation, for example, we might estimate the heritability
of a dichotomous trait based on family data under a polygenic model as
being twice the sibling tetrachoric correlation. Model criticism is done by
comparing predicted with observed risks to different degrees of relatives
of an affected individual, or set of affected relatives. Practically, this
was used for genetic counselling etc. In the current era of genome-wide
association studies, a key question is the "missing heritability", i.e. the
amount of familial aggregation of diseases unexplained by gene variants
with detectable effect, even though the case-control studies have N=30000.
Some of the arguments hinge on what kind of link function is used in the
theoretical model.
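
The predicted risk to a sibling of an affected individual can be sketched under the liability-threshold (probit-normal) model. The sketch assumes a purely additive polygenic model with no shared environment, so that the sibling liability correlation is half the heritability. The function names are hypothetical, and the bivariate normal probability is obtained by plain Simpson integration rather than a library routine.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def prob_both_affected(prevalence, rho, steps=4000):
    """P(X > t, Y > t) for standard normal liabilities with correlation rho.

    t is the threshold giving the stated population prevalence.
    steps must be even (composite Simpson rule).
    """
    t = STD_NORMAL.inv_cdf(1.0 - prevalence)
    upper = t + 10.0  # the normal density is negligible beyond this point
    h = (upper - t) / steps
    s = sqrt(1.0 - rho * rho)

    def integrand(x):
        # density of X at x times P(Y > t | X = x)
        return STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((t - rho * x) / s))

    total = integrand(t) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(t + i * h)
    return total * h / 3.0

def sibling_recurrence_risk(prevalence, heritability):
    """Risk to a sibling of an affected individual, assuming rho = h^2 / 2."""
    rho = heritability / 2.0
    return prob_both_affected(prevalence, rho) / prevalence

print(sibling_recurrence_risk(0.01, 0.8))
```

Comparing such predicted risks with those observed in families is one way to carry out the model criticism described above. A different link function would change the predictions.
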
Sorry, I couldn't resist ;)
--
| David Duffy (MBBS PhD) ,-_|\
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
| Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v