[R-sig-ME] Re. nested cross-sectional model using lme4 or nlme

Emmanuel Charpentier charpent at bacbuc.dyndns.org
Tue Jun 30 18:32:39 CEST 2009


Dear List, dear Alan,

Le mardi 30 juin 2009 à 11:00 +0100, Alan Kelly a écrit :
> Emmanuel - very many thanks for your detailed reply.
> Some points of clarification and some results... and I should mention  
> that I'm analysing these data on behalf of a colleague.
> Although the unit of randomization is the community, data are indeed  
> available at the level of the individual (but the same individuals  
> could not be guaranteed to be available at follow-up - which forces me  
> to adopt a nested cross-sectional design rather than a nested cohort  
> design) so your suggested model 2 is appropriate.
> "cond" and "timecat" and "unit" are factors.
> The imbalance I referred to arises from the variable numbers of  
> individuals available per community at baseline and again at follow-up  
> - only the number of communities (units) were fixed.
> You suggest that lme might be preferable to lmer in this case - point  
> taken, however, the model failed to run under lme - giving the  
> following message:
> 
> reg=lme(y~cond*timecat, random= ~timecat | unit)
> Error in lme.formula(y ~ cond * timecat, random = ~timecat | unit) :
>    nlminb problem, convergence error code = 1
>    message = iteration limit reached without convergence (9)

Did you try to force a larger iteration limit (using
control=lme.control(someparameterIcantrememberjustnow =
somethingmorethanthedefault) ? ISTR that a casual remark (in P&B ? In
V&R4 ?) noted that this is often necessary...

> Whereas it did run successfully under lmer....
> 
> reg=lmer(y~cond*timecat + (timecat|unit))
>  > summary(reg)
> Linear mixed model fit by REML
> Formula: y ~ cond * timecat + (timecat | unit)
>    AIC  BIC logLik deviance REMLdev
>   8628 8670  -4306     8611    8612
> Random effects:
>   Groups   Name        Variance Std.Dev. Corr
>   unit     (Intercept)  0.67586 0.82211
>            timecat      0.79892 0.89382  -1.000
>   Residual             34.92760 5.90996
> Number of obs: 1346, groups: unit, 37
> 
> Fixed effects:
>               Estimate Std. Error t value
> (Intercept)   12.0166     0.3118   38.53
> cond          -0.2387     0.5241   -0.46
> timecat       -1.0334     0.4498   -2.30
> cond:timecat   0.3727     0.7610    0.49
> 
> Correlation of Fixed Effects:
>              (Intr) cond   timect
> cond        -0.595
> timecat     -0.711  0.423
> cond:timect  0.420 -0.708 -0.591
> 
> For this outcome at least, the interaction term (essentially a  
> difference-in-differences estimate) is non-significant.... a little  
> disappointing, given the effort!

Shit happens ... Did you bootstrap this result to check its
"sensibleness" and/or jacknife it on various subsets to test its
sensitivity ? What do graphical representations say ?

Do you have reason to suspect heteroscedasticity ? If so, another effort
with lme and variance modelling function might worth the pain...

Last but not least, if individual present at two occasions are not "too"
scarce, a subanalysis in "model 3" might enlighten you, especially if
you can check that they do not differ "way too much" from non-repeating
individuals...

HTH,

					Emmanuel Charpentier

> With thanks and best wishes,
> Alan Kelly
> 
> 
> Le vendredi 26 juin 2009  07:15 +0100, Alan Kelly a crit :
>  > Dear all, following a suggestion from one of your contributors, I'm
>  > posting the following query to R-sig-mixed having had limited  
> response
>  > from R-help.
>  >
>  > I'm attempting to analyse a nested cross-sectional design in which an
>  > intervention was offered to a series of randomly selected (small)
>  > communities, so the unit of randomisation is the community.  All
>  > available individuals in each community were interviewed before the
>  > intervention and again at follow-up post-intervention.  The set of
>  > available individuals at baseline and at follow-up was far from
>  > identical (a common feature of such designs).  Similarly, a series of
>  > control communities were interviewed.  This type of design is  used  
> in
>  > epidemiological studies particularly in interventions designed to
>  > alter lifestyle factors.  Such designs tend to be highly unbalanced.
>  > Murray et al. discuss the appropriate analysis of such studies
>  > (Analysis of data from group-randomized trials with repeat
>  > observations on the same groups, Stats in Med. 17, 1581-1600).  They
>  > offer three examples of  SAS code - one of which is as follow:
>  >
>  > proc mixed;
>  > class cond unit timecat;
>  > model y=cond timecat cond*timecat/ddfm=res;
>  > random int timecat/subject=unit(cond);
>  > run;
>  >
>  > cond is 0/1 corresponding to control/intervention
>  > timecat is 0/1 corresponding to baseline/follow-up
>  > unit is 1 to 39 and identifies the communities.
>  > and y is a continuous score
>  >
>  > Unfortunately I'm not familiar with SAS code.  I would expect random
>  > effects for unit and timecat X unit.
>  >
>  > I would much appreciate any suggestions on how to code the above in
>  > lme4 or nlme.
> 
> Caution note : I have not used proc mixed in a log time, and I might be
> (quite) rusty). Caveat emptor...
> 
> I assume that cond and timecat are coded either as boolean
> (FALSE:control|baseline) or as factors (reference=control|baseline), and
> that unit is a factor.
> 
> One thing is not clear in your description : is y measured at the
> community level (i. e. one measure per community per occasion) or at an
> individual level (i. e. n_i measures per community per occasion) ? If
> the latter, can you individualize successive measures (responses of the
> same individual) ? These three eventualities would, IMHO, lead to three
> very different models.
> 
> In the first case (community-level answer), an "obvious" coding in lmer
> would be :
> 
> lmer(y ~ cond * timecat + (timecat | unit), ...)
> 
> In lme, that would be
> 
> lme(y ~ cond * timecat, random=~timecat | unit, ...)
> 
> In this model, the "cond" fixed effect is in fact the difference between
> conditions at baseline. Since your study is randomized, you should
> expect this effect to be 0. The timecat effect is indeed the "natural
> evolution" of you communities without intervention, and the net effect
> of your intervention is measured by the cond:timecat interaction.
> 
> Since you have only one observation per experimental slot, I doubt that
> you will get much precision...
> 
> You may also note that, in this case, your plan is balanced by
> definition (1 observation per slot). aov() might do the job...
> 
> In the second case (individual answers, not identifiable between
> occasions), the model is written the same way. Here, lme might be
> prefered to lmer because the variance and correlation function might
> allow you to model the variability between experimental slots (e. g.
> heteroscedasticity, between-occasions correlation, etc...).
> 
> In the third case (individual answers identifiable between occasions),
> another important coupling appears : the answers given by an individual
> after intervention are quite likely related to the answers before
> intervention. Assuming that an individual belongs to one community only,
> it is tempting to exploit this by writing :
> 
> lmer(y ~ cond * timecat + (timecat | unit/individual), ...)
> 
> 
> If individuals can "switch communities" between occasions (or belong to
> more than one), this simple nesting is not valid. Here, lmer does
> probably a better job than lme...
> 
> HTH,
> 
> 					Emmanuel Charpentier
> 
> 
> 
> 	[[alternative HTML version deleted]]
>




More information about the R-sig-mixed-models mailing list