[R-sig-ME] Re. nested cross-sectional model using lme4 or nlme
Emmanuel Charpentier
charpent at bacbuc.dyndns.org
Tue Jun 30 18:32:39 CEST 2009
Dear List, dear Alan,
Le mardi 30 juin 2009 à 11:00 +0100, Alan Kelly a écrit :
> Emmanuel - very many thanks for your detailed reply.
> Some points of clarification and some results... and I should mention
> that I'm analysing these data on behalf of a colleague.
> Although the unit of randomization is the community, data are indeed
> available at the level of the individual (but the same individuals
> could not be guaranteed to be available at follow-up - which forces me
> to adopt a nested cross-sectional design rather than a nested cohort
> design) so your suggested model 2 is appropriate.
> "cond" and "timecat" and "unit" are factors.
> The imbalance I referred to arises from the variable numbers of
> individuals available per community at baseline and again at follow-up
> - only the number of communities (units) were fixed.
> You suggest that lme might be preferable to lmer in this case - point
> taken, however, the model failed to run under lme - giving the
> following message:
>
> reg=lme(y~cond*timecat, random= ~timecat | unit)
> Error in lme.formula(y ~ cond * timecat, random = ~timecat | unit) :
> nlminb problem, convergence error code = 1
> message = iteration limit reached without convergence (9)
Did you try to force a larger iteration limit (using
control=lme.control(someparameterIcantrememberjustnow =
somethingmorethanthedefault) ? ISTR that a casual remark (in P&B ? In
V&R4 ?) noted that this is often necessary...
> Whereas it did run successfully under lmer....
>
> reg=lmer(y~cond*timecat + (timecat|unit))
> > summary(reg)
> Linear mixed model fit by REML
> Formula: y ~ cond * timecat + (timecat | unit)
> AIC BIC logLik deviance REMLdev
> 8628 8670 -4306 8611 8612
> Random effects:
> Groups Name Variance Std.Dev. Corr
> unit (Intercept) 0.67586 0.82211
> timecat 0.79892 0.89382 -1.000
> Residual 34.92760 5.90996
> Number of obs: 1346, groups: unit, 37
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 12.0166 0.3118 38.53
> cond -0.2387 0.5241 -0.46
> timecat -1.0334 0.4498 -2.30
> cond:timecat 0.3727 0.7610 0.49
>
> Correlation of Fixed Effects:
> (Intr) cond timect
> cond -0.595
> timecat -0.711 0.423
> cond:timect 0.420 -0.708 -0.591
>
> For this outcome at least, the interaction term (essentially a
> difference-in-differences estimate) is non-significant.... a little
> disappointing, given the effort!
Shit happens ... Did you bootstrap this result to check its
"sensibleness" and/or jacknife it on various subsets to test its
sensitivity ? What do graphical representations say ?
Do you have reason to suspect heteroscedasticity ? If so, another effort
with lme and variance modelling function might worth the pain...
Last but not least, if individual present at two occasions are not "too"
scarce, a subanalysis in "model 3" might enlighten you, especially if
you can check that they do not differ "way too much" from non-repeating
individuals...
HTH,
Emmanuel Charpentier
> With thanks and best wishes,
> Alan Kelly
>
>
> Le vendredi 26 juin 2009 07:15 +0100, Alan Kelly a crit :
> > Dear all, following a suggestion from one of your contributors, I'm
> > posting the following query to R-sig-mixed having had limited
> response
> > from R-help.
> >
> > I'm attempting to analyse a nested cross-sectional design in which an
> > intervention was offered to a series of randomly selected (small)
> > communities, so the unit of randomisation is the community. All
> > available individuals in each community were interviewed before the
> > intervention and again at follow-up post-intervention. The set of
> > available individuals at baseline and at follow-up was far from
> > identical (a common feature of such designs). Similarly, a series of
> > control communities were interviewed. This type of design is used
> in
> > epidemiological studies particularly in interventions designed to
> > alter lifestyle factors. Such designs tend to be highly unbalanced.
> > Murray et al. discuss the appropriate analysis of such studies
> > (Analysis of data from group-randomized trials with repeat
> > observations on the same groups, Stats in Med. 17, 1581-1600). They
> > offer three examples of SAS code - one of which is as follow:
> >
> > proc mixed;
> > class cond unit timecat;
> > model y=cond timecat cond*timecat/ddfm=res;
> > random int timecat/subject=unit(cond);
> > run;
> >
> > cond is 0/1 corresponding to control/intervention
> > timecat is 0/1 corresponding to baseline/follow-up
> > unit is 1 to 39 and identifies the communities.
> > and y is a continuous score
> >
> > Unfortunately I'm not familiar with SAS code. I would expect random
> > effects for unit and timecat X unit.
> >
> > I would much appreciate any suggestions on how to code the above in
> > lme4 or nlme.
>
> Caution note : I have not used proc mixed in a log time, and I might be
> (quite) rusty). Caveat emptor...
>
> I assume that cond and timecat are coded either as boolean
> (FALSE:control|baseline) or as factors (reference=control|baseline), and
> that unit is a factor.
>
> One thing is not clear in your description : is y measured at the
> community level (i. e. one measure per community per occasion) or at an
> individual level (i. e. n_i measures per community per occasion) ? If
> the latter, can you individualize successive measures (responses of the
> same individual) ? These three eventualities would, IMHO, lead to three
> very different models.
>
> In the first case (community-level answer), an "obvious" coding in lmer
> would be :
>
> lmer(y ~ cond * timecat + (timecat | unit), ...)
>
> In lme, that would be
>
> lme(y ~ cond * timecat, random=~timecat | unit, ...)
>
> In this model, the "cond" fixed effect is in fact the difference between
> conditions at baseline. Since your study is randomized, you should
> expect this effect to be 0. The timecat effect is indeed the "natural
> evolution" of you communities without intervention, and the net effect
> of your intervention is measured by the cond:timecat interaction.
>
> Since you have only one observation per experimental slot, I doubt that
> you will get much precision...
>
> You may also note that, in this case, your plan is balanced by
> definition (1 observation per slot). aov() might do the job...
>
> In the second case (individual answers, not identifiable between
> occasions), the model is written the same way. Here, lme might be
> prefered to lmer because the variance and correlation function might
> allow you to model the variability between experimental slots (e. g.
> heteroscedasticity, between-occasions correlation, etc...).
>
> In the third case (individual answers identifiable between occasions),
> another important coupling appears : the answers given by an individual
> after intervention are quite likely related to the answers before
> intervention. Assuming that an individual belongs to one community only,
> it is tempting to exploit this by writing :
>
> lmer(y ~ cond * timecat + (timecat | unit/individual), ...)
>
>
> If individuals can "switch communities" between occasions (or belong to
> more than one), this simple nesting is not valid. Here, lmer does
> probably a better job than lme...
>
> HTH,
>
> Emmanuel Charpentier
>
>
>
> [[alternative HTML version deleted]]
>
More information about the R-sig-mixed-models
mailing list