[R-sig-ME] Help with coding an intervention analysis with lmer()

Sun Aug 29 12:03:07 CEST 2010

Many thanks for the input! This is very interesting.

On 28 Aug 2010, at 17:00, Ista Zahn wrote:

> Hi all,
> I hesitate to weigh in on this, because I still consider myself a
> newbie with lme4. But I really think you're on the wrong track here,
> and I trust that someone will correct me if I'm wrong.
> 
> 
> 
> On Sat, Aug 28, 2010 at 2:58 AM, Dave Schoeman <d.schoeman at ulster.ac.uk> wrote:
>> 
>> Hi, many thanks for the thoughts; it is extremely useful to have outside input.
> 
>> <snip>

<snip>

>> I defined the random effect of (1| Group/Subject) purely on the basis of the experimental design: Subject is nested in Group (if you agree with the logic above).
> 
> I think this is where you go off the rails. Nesting refers to
> _grouping variables_, not to fixed effects. If I'm understanding
> correctly, group is not a grouping variable, it is a fixed effect with
> 2 repeatable levels: control and experimental.

OK, I think this is a fundamental question. In my understanding, nesting arises when levels of one factor (here Subject) are present in only one level of another factor (here Group). In other words, Subjects 1-10 occur only in the Control Group, and Subjects 11-20 occur only in the Experiment Group. If each Subject occurred in each Group, Subject and Group would be crossed. The same goes for BA and Year: Years 1990-1993 are Before and Years 1994-1997 are After, so again Year is nested within BA. If this nesting is correct, then should I not try to estimate the effects of each? And if I should, the question is HOW? I agree with you that my model specification looks like gobbledygook, but it is just an attempt to model all components of variance.

<snip>

>> The term BA/Year:Group is unclear, I agree. What it means is that I want to add an interaction term on a nested factor: Year (within BA) x Group. Similarly Group/Subject:BA is the interaction of Subject (within Group) x BA. Finally, there SHOULD be a term for Year (within BA) x Subject (within Group), but these data are unreplicated at this level, so this term is not estimable in conventional anova (so it is omitted to contribute to the residual variance). I understand that these interactions are difficult to read, and I tried to resolve this by isolating the nested terms in brackets, but this seems to make no difference to R.
> 
> I don't think this is right. I don't think you need to specify nesting
> this way, especially in the fixed-effects part of the model. The way I
> see it, you have only one grouping factor, and that is Subject.
> 
> If we go back to your original model specification, you had
> 
> mod <- lmer(Attack ~ Group*BA + BA/Year + BA/Year:Group +
> Group/Subject:BA + (1|Group/Subject), data = dat, family = poisson)
> 
> in my view that is gobbledygook. You have Group as a fixed effect
> predictor, but also as a grouping factor, and I don't think that makes
> any sense. I also don't think that those nesting statements in the
> fixed-effects specification make any sense, but it's possible I'm the
> one who doesn't understand lmer syntax... Anyway, I would do something
> like
> 
> Subject <- c(rep(c(1:20), 8))
> Group <- c(rep(c(rep("C", 10), rep("E", 10)), 8))
> BA <- c(rep("Bef", 80), rep("After", 80))
> Year <- c(rep(1990, 20), rep(1991, 20), rep(1992, 20), rep(1993, 20),
> rep(1994, 20), rep(1995, 20), rep(1996, 20), rep(1997, 20))
> 
> Attack <- c(rpois(10, lambda = 10), rpois(10, lambda = 12), rpois(10,
> lambda = 10), rpois(10, lambda = 12), rpois(10, lambda = 10),
> rpois(10, lambda = 12), rpois(10, lambda = 10), rpois(10, lambda =
> 12), rpois(10, lambda = 15), rpois(10, lambda = 3), rpois(10, lambda =
> 15), rpois(10, lambda = 3), rpois(10, lambda = 15), rpois(10, lambda =
> 3), rpois(10, lambda = 15), rpois(10, lambda = 3))
> 
> dat <- data.frame(Attack, Subject, Group, BA, Year)
> dat$Subject <- factor(dat$Subject)
> dat$Year <- dat$Year - 1994
> dat$BA <- relevel(dat$BA, ref="Bef")
> 
> 
> library(lme4)
> 
> mod <- lmer(Attack ~ Group*Year + Group*BA + (1|Subject), data=dat,
> family=poisson)
> 
> This tests whether the difference in Attack before and after differs
> by Group, and whether the trajectory of Attack over time (Year)
> differs by Group. Note that year is numeric and re-centered at 1994
> (the year the intervention started).

Yes, I agree that your model tests the hypotheses I'm interested in, and the model-building process ends up in the same place, irrespective if whether you start with my model or yours (both with the simulated data provided, or with the real data). BUT, should the model not start by including all terms?

> Hope this was helpful. And please, those of you who know more than I
> do, please jump in and correct me if I've given bad advice here...

Yes, very helpful, indeed. Thanks. And thanks, also, to Dennis who wrote earlier.

I do think that it would be useful, though, to get further input on the nesting concepts here and how to specify them using the fixed and random effects...

- Dave

> 
> Ista

>> On 27 Aug 2010, at 19:50, Dennis Murphy wrote:
>> 
>>> Hi:
>>> 
>>> On Fri, Aug 27, 2010 at 9:12 AM, Dave Schoeman <d.schoeman at ulster.ac.uk> wrote:
>>> I tried this in a slightly different context and got no response, so I thought I'd try and frame the analysis slightly differently. The design is quite simple, but the analysis is proving difficult. Say I have 20 individual Subjects prone to mild anxiety attacks, half of which are assigned to a Control group and half to an Experimental group. I monitor these patients for 8 consecutive years, counting the number of anxiety attacks each year. After the initial 4 years, I provide members of the Control group a placebo and members of the Experimental group a supposed effective treatment. The hypothesis is that the supposed effective treatment has no effect in reality.
>>> 
>>> As I see it, I have several fixed effects, all of which are factors: Group (Control or Experiment); BA (Before "treatment" or After "treatment"); and Year (nested within BA). There is a single random effect: Subject (nested within Group).
>>> 
>>> Data can be simulated as follows (I have coded the attack rate Before slightly higher in the Experiment than the Control Group, with the attack rate increasing slightly after dosing in the Control group, but decreasing substantially in the Experiment group):
>>>        Subject <- c(rep(c(1:20), 8))
>>>        Group <- c(rep(c(rep("C", 10), rep("E", 10)), 8))
>>>        BA <- c(rep("Bef", 80), rep("After", 80))
>>>        Year <- c(rep(1990, 20), rep(1991, 20), rep(1992, 20), rep(1993, 20), rep(1994, 20), rep(1995, 20), rep(1996, 20), rep(1997, 20))
>>>        Attack <- c(rep(c(rpois(10, lambda = 10), rpois(10, lambda = 12)), 4), rep(c(rpois(10, lambda = 12), rpois(10, lambda = 3)), 4))
>>>        dat <- data.frame(Attack, Subject, Group, BA, Year)
>>>        dat$Subject <- as.factor(dat$Subject)
>>>        dat$Year <- as.factor(dat$Year)
>>> The model seems like it should be:
>>>        mod <- lmer(Attack ~ Group*BA + BA/Year + BA/Year:Group + Group/Subject:BA + (1|Group/Subject), data = dat, family = poisson)
>>> 
>>> Group is assigned at the time of the intervention; everything before it is baseline data. I could see comparisons between before/after trends at the subject level that would depend on Group assignment, but the Group * BA interaction is a bit of a problem, since no treatment is assigned in the 'Before' phase. The (1 | Group/Subject) term makes some sense (random intercepts), but you need something to generate and compare random slopes before and after treatment by Group and Subject. I'm not quite sure how that would be done; I'm concerned with maintaining the quantitative values of Year but estimating the slopes and intercepts by before/after treatment assignment. There must be a standard way to do this with biomedical data, but I don't have a lot of experience in that area. Hopefully, others will have better ideas.
>>> 
>>> I don't see how Group is nested within BA (BA/Year:Group), especially when your first term is Group * BA, which suggests they are crossed. Group is not even assigned temporally until the After phase of BA, so it's clearly not nested within the levels of BA. It is more accurate to say that it is at best partially crossed, but even that is debatable - the groups are split in the After phase. You can't have it both ways - either two terms are crossed or they are nested...or in this case, perhaps somewhere in between. This is the part you need to unravel properly. Conceptually, I can see why you think Year should be nested within BA, but Year is a quantitative variable rather than a factor, and your aim should be to predict random intercepts and slopes among subjects, and compare mean differences between the control and experimental groups, before and after the treatments were applied.
>>> 
>>> This type of problem must occur in biostatistics at least occasionally. How is this traditionally handled when you have longitudinal data by subject, where part of it is baseline (pre-treatment) and part of it is post-treatment?  (BA * Year | Group/Subject) ??
>>> 
>>> 
>>> BUT, this gives an error: "Error in mer_finalize(ans) : Downdated X'X is not positive definite, 1."
>>> 
>>> This partially explains why:
>>> 
>>> library(ggplot2)
>>> g <- ggplot(dat, aes(x = Year, y = Attack, groups = Subject))
>>> g + geom_line()
>>> g + geom_line() + facet_grid(Group ~ BA, scales = 'free')
>>> 
>>> You have no within-subject variablility within before/after phases and several subjects overlap in at least one of the phases.
>>> 
>>> To avoid this error, I have to drop all of the interaction terms involving nesting. HOWEVER, if I code Year as a continuous covariate, I need only drop the term Group/Subject:BA. This suggests that interaction terms comprising only factors are not easily fit using this unreplicated design...
>>> 
>>> I believe you have hit the nail on the head....rather than unreplicated, I would say incompletely crossed [BA * Group], because there is replication in the BA and Group factors.
>>> 
>>> My questions are these:
>>> 1 - Does my initial model look correct (and do I need to specify the nesting - other threads suggest that this might not be necessary if nested variables have unique codes)?
>>> 2 - Why is it that the  factor-by-factor interactions won't work (I'm an lmer() novice, so it might be very obvious to some...)?
>>> 3 - Would I be more or less correct to simply ignore the Group/Subject:BA interaction and proceed with model development from there?
>>> 
>>> Any comments appreciated.
>>> 
>>> - Dave
>>> 
>>> 
>>> I don't think I've helped resolve the form of model you need, but it seems that certain aspects of your design have to be massaged. Hopefully others can weigh in on the problem...
>>> 
>>> Regards,
>>> Dennis
>>> 
>>> 
>>> _________
>>> Dr Dave Schoeman
>>> Lecturer in Marine Science
>>> School of Environmental Sciences
>>> University of Ulster
>>> Cromore Road
>>> Coleraine
>>> BT52 1SA
>>> Northern Ireland
>>> 
>>> Phone: +44 (0)28 701 24076
>>> Mobile: +44 (0)75 49 526 743
>>> Fax: +44 (0)28 701 24491
>>> 
>>> Treat data with care; under duress they will tell you anything you want to hear...
>>> 
>>> 
>>>        [[alternative HTML version deleted]]
>>> 
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> 
>> 
>> 
>>        [[alternative HTML version deleted]]
>> 
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> 
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>