[R-sig-ME] model specification for 2x2x2 factorial between-subjects design with repeated measures

Stephen Politzer-Ahles spa268 at nyu.edu
Wed Feb 5 12:18:27 CET 2014


Hi Bruce,

I don't have answers for all of your questions, but I just wanted to
add that I think you should treat Time as another fixed factor. What
you are mainly interested, it sounds like, is how the DV changes from
time 1 to time 2, and whether that change is different between group
c3.0 and group c3.1. So that's a question about the interaction
between time and c3. Ignoring the covariates, the way I would specify
this model (in lmer{lme4} syntax, because that's what I'm more
familiar with):

m <- lmer( DV ~ c3 / time + (1|Subject), data )

(I put 'time' as nested below c3 because I am assuming you're not
interested in the main effect of c3, only the interaction. Also
nesting it like this makes the output of summary(m) easy to read. But
if you want all effects, you can use * instead of / .)

>From there, you could add the covariates without adding all the
interactions (although it's possible that you might care about those
interactions as well, for example if you have reason to think that
people in c2.1 respond better to the treatment c3.1 than people in
c2.0 do):

m1 <- lmer( DV ~ c1 + c2 + (c3/time) + (1|Subject), data )

You can then compare this to a maximally similar model that does not
have the interaction:

m2 <- lmer( DV ~ c1 + c2 + (c3+time) + (1|Subject), data )

And of course you should consider using a maximal random effects
structure (in the models I put above, (1|Subject) means there are only
random intercepts per subject).

The interactions are starting to sound complicated; you might be able
to simplify your work if for each subject you subtract the dv at time
1 from the dv at time 2, so the new DV is just a measure of how much
they improved from time 1 to time 2, and then you don't have to worry
about always testing the interaction with time.


Stephen Politzer-Ahles
New York University, Abu Dhabi
Neuroscience of Language Lab
http://www.nyu.edu/projects/politzer-ahles/



> Message: 2
> Date: Tue, 4 Feb 2014 13:42:05 -0600
> From: Bruce Gilstrap <bruce at gilstraps.org>
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] model specification for 2x2x2 factorial
>         between-subjects design with repeated measures
> Message-ID:
>         <CAN_bDiCiyZWZZ1=+qbCZ-Z547t-DBn8PRhRsZzSeDfYR24cWUg at mail.gmail.com>
> Content-Type: text/plain
>
> My colleagues and I are interested in analyzing a dataset generated by a
> 2x2x2 factorial design where each of the 3 factors are between-subjects
> factors, each consisting of 2 levels, which results in 8 experimental
> conditions. Therefore, each subject was randomly assigned to 1 of these
> conditions. Additionally, the DV was measured twice, so our design includes
> a repeated measure. Basically, then, the data look like this (in so-called
> *long* form):
>
> subj  c1  c2  c3  time   dv
>    1   0   0   0     1  1.1
>    2   0   0   1     1  3.2
>    3   0   1   0     1  4.4
>    4   0   1   1     1  6.2
>    5   1   0   0     1  3.6
>    6   1   0   1     1  2.2
>    7   1   1   0     1  5.1
>    8   1   1   1     1  5.5
>    1   0   0   0     2  3.7
>    2   0   0   1     2  6.1
>    3   0   1   0     2  4.7
>    4   0   1   1     2  2.0
>    5   1   0   0     2  4.6
>    6   1   0   1     2  4.3
>    7   1   1   0     2  5.2
>    8   1   1   1     2  5.0
>
> So far, this is a relatively straightforward design. However, it is
> complicated by the fact that the third condition is essentially a
> treatment. That is, it was unknown to the subjects when the DV was measured
> at time 1. After the first measurement of the DV, they were presented with
> a treatment, and then we measured the DV at time 2. Therefore, the design
> includes a pre-/post-test element as well. Our research question is focused
> on the effect of the treatment (c3), and we are not directly interested in
> the effects of c1 and c2. We included these conditions because prior
> research suggests that they may affect our DV. Therefore, we are planning
> to treat these conditions as covariates.
>
> Our plan is to analyze the data using multilevel modeling (in R) due to the
> non-independence of observations created by the repeated measure. Using the
> lme function from the nlme package, here are the models we plan to compare:
>
> m0 <- lme(fixed  = DV ~ 1,
>           random = ~ 1 | subj/time,
>           method = "ML",
>           data = d.long)
>
> m1 <- lme(fixed  = DV ~ 1 + c1 + c2,
>           random = ~ 1 | subj/time,
>           method = "ML",
>           data = d.long)
>
> m2 <- lme(fixed  = DV ~ 1 + c1 + c2 + c3,
>           random = ~ 1 | subj/time,
>           method = "ML",
>           data = d.long)
>
> where:
>
>    - m0 is the null (baseline) model
>    - m1 is what we are calling the covariate model (intended to address the
>    question, "Do the covariates have the effects suggested by prior research?")
>    - m2 is the hypothesized model (intended to address our research
>    question, "Does the treatment have the effect we hypothesized?")
>
> Of course, there is other work to be done after specifying the models, but
> we are most interested at this point in making sure we have specified them
> appropriately.
>
> We would appreciate receiving advice about our plans for analyzing these
> data, and are particularly interested in comments about the following
> issues:
>
>    1. Is our intended approach to treat c1 and c2 as covariates defensible?
>    2. If so, will our R-specific model specifications allow us to answer
>    our research question?
>    3. In any case, are there any better approaches?
>
> Thanks,
>
> Bruce
>
>
> Here is some R code that will generate a basic dataset like the one shown
> above:
>   subj <- rep(c(1,2,3,4,5,6,7,8), times = 2)
>   c1   <- rep(c(0,0,0,0,1,1,1,1), times = 2)
>   c2   <- rep(c(0,0,1,1), times = 4)
>   c3   <- rep(c(0,1), times = 8)
>   time <- c(rep(1, times = 8), rep(2, times = 8))
>   set.seed(14)
>   dv   <- runif(16, 1.0, 7.0)
>   data <- as.data.frame(cbind(subj, c1, c2, c3, time, dv))
>   data[ ,1:5 ] <- lapply(data[ ,1:5 ], factor)
>   data
>   str(data)
>
>         [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 4 Feb 2014 14:57:13 -0500
> From: Koen Hufkens <koen.hufkens at gmail.com>
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Recreating tree allometry model comparisons (Banin
>         et      al. 2012)
> Message-ID:
>         <CAHDtqqmLaNTws_LrPktsiaKocw9-8hO8uA3AZXMZJAuPPxjADw at mail.gmail.com>
> Content-Type: text/plain
>
> Hi list,
>
> I have the feeling this came through the list and my message bounced, if it
> did not excuses for double posting.
>
> Using mixed models has been a while for me so I'm struggling with
> recreating an approach as described in this publication (Banin et al. 2012:
> http://onlinelibrary.wiley.com/doi/10.1111/j.1466-8238.2012.00778.x/full).
>
> A description of the methodology is described here: "To determine whether
> curve parameters were significantly different amongst all pairs of
> continents, the dataset was split into all possible paired combinations and
> a full model (where continent was specified as a fixed factor to provide
> separate estimates for two continents) compared to a reduced model (where
> data for the two continents were pooled) (Motulsky & Christopoulos, 2004)...
>
> ... Full and reduced models were compared using AICs and likelihood ratio
> tests, which provide a p-value and an opportunity to assess whether models
> provide a significant improvement (Pinheiro & Bates, 2000) . "
>
> The model or curve as mentioned above is defined as
>
> H ~ a - b * exp(-c * DBH) where H and DBH are the measured tree height and
> tree diameter at breast height, respectively, see equation 1, in the above
> paper.
>
> DBHmodel <- function(x,a,b,c){a-b*exp(-c*x)}
>
> Instead of continents I would like to pairwise compare site locations,
> hence the setup will not change as I will substitute 'continent' for 'site'.
>
> I define the 'full' model as such:
>
> dataGrouped <- groupedData(height~dbh | site, data=data)
> model1 <-
> nlsList(height~DBHmod(dbh,a,b,c),data=dataGrouped,start=list(a=40,b=40,c=0.02))
> model1.nlme <- nlme(model1)
>
> This gives me two site based models but I'm at a loss on how to 1) define
> the reduced 'population' model such such that it plays nicely with nlme and
> treats the data as 'one' 2) perform the likelihood ratio tests on this due
> to the lack of the former.
>
> Any help is appreciated while I wander through Pinheiro and Bates to look
> for a solution as well. I feel that this should be trivial but I'm still
> lost in the nlme model formulation.
>
> Kind regards,
> K
>
> --
> Dr. Koen Hufkens
>
> Harvard University
> Department of Organismic & Evolutionary Biology
> Richardson Lab
> ~
> Ghent University
> Faculty of Bioscience Engineering
> Isotope Bioscience Laboratory
> <http://www.isofys.ugent.be>
>
>         [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 5 Feb 2014 09:56:52 +1000
> From: David Duffy <David.Duffy at qimr.edu.au>
> To: Adam Hayward <a.hayward at sheffield.ac.uk>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] prunePed in MCMCglmm
> Message-ID: <alpine.LMD.2.00.1402050931590.2880 at orpheus.qimr.edu.au>
> Content-Type: text/plain; charset="US-ASCII"; format=flowed
>
> On Tue, 4 Feb 2014, Adam Hayward wrote:
>
> > Hi all,
> >
> > I'm having problems using the prunePed function in the MCMCglmm package and
> > wondered if anyone else had come across a similar issue.
> >
> > The result is that I end up with a "prunedpedigree" file which contains
> > 5,161 records- one for each phenotyped individual, giving their dam and
> > sire. However, it does not provide additional records for the parentage of
> > their parents or other informative relatives.
>
> ISTM prunePed() doesn't work if the individual IDs are factors (numeric
> and character are OK).  With keep=FALSE, prunePed() reduces to this loop:
>
>      nind <- length(ind.keep) + 1
>      while (length(ind.keep) != nind) {
>          nind <- length(ind.keep)
>          ind.keep <- union(na.omit(c(unlist(pedigree[, 2:3][match(ind.keep,
>              pedigree[, 1]), ]))), ind.keep)
>      }
>      pedigree <- pedigree[sort(match(ind.keep, pedigree[, 1])),]
>
> which fails for factor IDs in the union of the parental IDs with the
> existing set of individual IDs. This was a bit confusing in that the
> BTped example uses factor IDs.
>
> Cheers, David Duffy.
>
> | David Duffy (MBBS PhD)
> | email: David.Duffy at qimrberghofer.edu.au  ph: INT+61+7+3362-0217 fax: -0101
> | Genetic Epidemiology, QIMR Berghofer Institute of Medical Research
> | 300 Herston Rd, Brisbane, Queensland 4006, Australia  GPG 4D0B994A
>
>
>
> ------------------------------
>
> _______________________________________________
> R-sig-mixed-models mailing list
> R-sig-mixed-models at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
> End of R-sig-mixed-models Digest, Vol 86, Issue 5
> *************************************************



More information about the R-sig-mixed-models mailing list