[R-sig-ME] 4 binary DVs, subjects nested within schools

Wed Nov 23 07:32:03 CET 2011

  On 23/11/2011, at 4:25, Paul Johnson <pauljohn32 at gmail.com> wrote:

> The gist of this is that there are 4 dichotomous outputs that can be
> modeled separately with logistic or probit models, and lme4 works fine
> treating each one separately.  There is a random effect at the school
> level.
>

You _could_ look at OpenMx, which runs under R but is not yet on CRAN
because of negotiations about the licensing for NPSOL, if I understand
correctly.

http://openmx.psyc.virginia.edu/

> [...] and then I need to take into
> account the fact that students are nested in classrooms.

Mx would fit a multivariate probit mixed model.

If the schools are big enough, you might take the other older approach of
calculating tetrachoric correlation matrices and using a SEM package, such
as "sem".

> And why are multivariate approaches not making the same mistake that
> is described in this literature on comparison of coefficients across
> logit models fitted for separate groups. I mean, if the variance
> parameter is not identified, how can I meaningfully put together 4
> logit models?

The multivariate probit doesn't have that problem, because it has to
ignore it ;) If you think of the threshold formulation (as you usually
do once you have more than two ordinal categories), you get the
tetrachoric correlation as the measure of association between your
variables.  You can easily get models where the correlation matrix
is the same for the different groups, even though the item endorsement
rates (or prevalences) are different.  With only one
threshold, we handwave and say that the underlying latent variables are
the same, but the thresholds have moved.  With two or more thresholds,
changes in variance can appear as the thresholds moving closer together
or further apart.
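
To make the threshold formulation concrete, here is a minimal sketch in
plain Python (standard library only) of a tetrachoric correlation computed
from a single 2x2 table.  The function names are my own, and the numerical
choices (Simpson's rule, bisection on rho) are just one simple option, not
what any particular package does.  It assumes no margin of the table is
empty.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for thresholds and margins


def bvn_density(a, b, r):
    """Standard bivariate normal density at (a, b) with correlation r."""
    s = 1.0 - r * r
    return math.exp(-(a * a - 2 * r * a * b + b * b) / (2 * s)) / (
        2 * math.pi * math.sqrt(s)
    )


def upper_orthant(a, b, rho, n_steps=200):
    """P(X > a, Y > b) for a standard bivariate normal with correlation rho.

    Uses the identity dP/d(rho) = density(a, b; rho), integrated from 0 to
    rho with Simpson's rule, starting from the independence value.
    """
    base = (1 - STD_NORMAL.cdf(a)) * (1 - STD_NORMAL.cdf(b))
    if rho == 0:
        return base
    h = rho / n_steps
    total = bvn_density(a, b, 0.0) + bvn_density(a, b, rho)
    for k in range(1, n_steps):
        total += (4 if k % 2 else 2) * bvn_density(a, b, k * h)
    return base + total * h / 3


def tetrachoric(n11, n10, n01, n00, tol=1e-8):
    """Tetrachoric correlation for a 2x2 table (counts or proportions).

    n11 = both items endorsed, n10 = first only, n01 = second only,
    n00 = neither.  Thresholds come from the margins; rho is found by
    bisection so that the model reproduces the (1, 1) cell.
    """
    n = n11 + n10 + n01 + n00
    a = STD_NORMAL.inv_cdf(1 - (n11 + n10) / n)  # threshold for item 1
    b = STD_NORMAL.inv_cdf(1 - (n11 + n01) / n)  # threshold for item 2
    p11 = n11 / n
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if upper_orthant(a, b, mid) < p11:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Two hypothetical groups with different prevalences (different thresholds)
# but the same latent correlation of 0.5.
for a, b in [(0.0, 0.0), (1.0, 0.5)]:
    p1, p2 = 1 - STD_NORMAL.cdf(a), 1 - STD_NORMAL.cdf(b)
    p11 = upper_orthant(a, b, 0.5)
    rho = tetrachoric(p11, p1 - p11, p2 - p11, 1 - p1 - p2 + p11)
    print(f"prevalences {p1:.3f}, {p2:.3f}  ->  tetrachoric {rho:.3f}")
```

Both groups give back the same correlation even though the endorsement
rates differ, which is the sense in which only the thresholds have moved.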

One test of the appropriateness of the model applies if you have three
or more DVs: for any three, you can fit a one-factor model to the
tetrachoric correlation matrix, which should give a perfect fit to the
observed 2x2x2 contingency table (there is a paper by Muthen).  This
test has low power.
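
A rough sketch of that check, again in plain Python: with three variables
the one-factor model is just identified, so the loadings follow in closed
form from the three tetrachoric correlations, and the implied 2x2x2 cell
probabilities need only a one-dimensional integral over the factor.  This
is my own illustration of the idea and not necessarily the procedure in
the Muthen paper; the function names are hypothetical, all three
correlations are assumed positive, and no loading may reach 1.  The table
has 7 free cell probabilities against 6 parameters (3 thresholds, 3
correlations), leaving one degree of freedom.  Because the inputs would be
pairwise estimates rather than full ML, the statistic is only approximate.

```python
import math
from itertools import product
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_factor_loadings(r12, r13, r23):
    """Loadings of the just-identified one-factor model for three variables.

    Assumes all correlations are positive and every loading is below 1.
    """
    return (
        math.sqrt(r12 * r13 / r23),
        math.sqrt(r12 * r23 / r13),
        math.sqrt(r13 * r23 / r12),
    )


def implied_cell_probs(thresholds, loadings, n_steps=400, limit=8.0):
    """2x2x2 cell probabilities implied by a one-factor probit model.

    Given the factor f, the three items are independent with
    P(y_i = 1 | f) = Phi((lambda_i * f - tau_i) / sqrt(1 - lambda_i^2)).
    The factor is integrated out with Simpson's rule (n_steps must be even).
    """
    h = 2 * limit / n_steps
    probs = {cell: 0.0 for cell in product((0, 1), repeat=3)}
    for k in range(n_steps + 1):
        f = -limit + k * h
        simpson = 1 if k in (0, n_steps) else (4 if k % 2 else 2)
        weight = STD_NORMAL.pdf(f) * simpson * h / 3
        p_yes = [
            STD_NORMAL.cdf((lam * f - tau) / math.sqrt(1 - lam * lam))
            for tau, lam in zip(thresholds, loadings)
        ]
        for cell in probs:
            p_cell = 1.0
            for y, p in zip(cell, p_yes):
                p_cell *= p if y else 1 - p
            probs[cell] += weight * p_cell
    return probs


def g_squared(observed, probs):
    """Likelihood-ratio statistic comparing observed counts to implied cells."""
    n = sum(observed.values())
    return 2 * sum(
        o * math.log(o / (n * probs[cell]))
        for cell, o in observed.items()
        if o > 0
    )
```

In use, you would feed in the thresholds and tetrachoric correlations
estimated from the three pairwise 2x2 tables, then compare g_squared for
the observed 2x2x2 counts against a chi-square with one degree of freedom.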

But if you think about it, it is not that different to the question of
why we usually assume that random effects are normally distributed.

Anyway, I have rambled long enough.

-- 
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v