[R-sig-ME] lmer for IV by design interaction?

Phillip Alday
Tue Sep 22 16:55:42 CEST 2020


For the fixed effects, lme4 doesn't care whether factors are between-,
within- or mixed. Older software cared a lot about this because it made
it possible to make various simplifying assumptions and thus speed up
computations, but that's not necessary with modern approaches.

I'm not sure I understand what D is representing. If you have multiple
measurements from a single subject, then that should present itself
simply as multiple rows in the dataframe. Likewise, if you don't have
multiple measurements, then that will also be obvious from the data. If
it's simply a matter of which "original" experiment the data came from,
then you can include that as a factor in the analysis, but I would
expect that effect to be null (unless of course there is some
domain-specific reason why a within-subjects manipulation would yield
different results than a between-subjects manipulation). Within-subjects
designs generally provide better estimates, so I wouldn't be surprised
if the interaction effect is present but small (look at the model
coefficients, not the ANOVA table, for this).
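
For example (just a sketch, using the data and variable names from your
message below), you could include D as a fixed factor and then look at the
fitted coefficients with summary() or fixef() rather than at the ANOVA table:

###
library(lme4)
library(lmerTest)

# A and D as sum-coded factors, as in your original code
dataset$A <- as.factor(dataset$A)
dataset$D <- as.factor(dataset$D)

m <- lmer(S ~ A * D + (1 | subject),
          data = dataset,
          contrasts = list(A = "contr.sum", D = "contr.sum"))

# estimates, standard errors and t-values for the individual terms
summary(m)$coefficients

# with sum coding, the interaction term is labelled A1:D1
fixef(m)["A1:D1"]
###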

Regarding the random effects: you could actually fit a by-subjects slope
for A (i.e. (1+A|subject)). This may seem strange at first because "A"
would not seem to be directly estimable for subjects who only saw one
level of A. But that's where the magic of mixed models kicks in: in such
cases, the model can use the "estimates" (technically "predictions")
from the other subjects as well as the population level estimate to fill
in the gaps. The reason why this works is that uncertain estimates are
*shrunk* towards the population level estimate. John Kruschke has an
example of this shrinkage with figures here:
https://doingbayesiandataanalysis.blogspot.com/2019/07/shrinkage-in-hierarchical-models-random.html
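
In lme4 syntax that would look something like the following (again only a
sketch, continuing from the model above with your data; with so few
observations per subject a singular-fit warning is quite possible, which is
itself a symptom of that shrinkage):

###
m2 <- lmer(S ~ A * D + (1 + A | subject),
           data = dataset,
           contrasts = list(A = "contr.sum", D = "contr.sum"))

# per-subject predictions (conditional modes / "BLUPs"); subjects who saw
# only one level of A get slopes shrunk towards the population estimate
ranef(m2)$subject
coef(m2)$subject
###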

Best,

Phillip


On 20/09/2020 19:22, Michelle Ashburner wrote:
> Greetings,
>
> Is it possible/appropriate to use lme4::lmer() to compare the effect of an
> independent variable across two designs: within-subjects and
> between-subjects?
>
> Data below from Erlebacher (1977), used to illustrate his methodology:
>
> ###
> dataset <- NULL
>
> dataset$A <- c(rep.int(1, 20), rep.int(2, 20), rep.int(1, 20), rep.int(2,
> 20))
> #A is the independent variable.
> #1, 2 represent the two levels of the IV
>
> dataset$D <- c(rep.int(1, 40), rep.int(2, 40))
> #D is the design factor.
> #1 represents a within-ss measurement; 2 a between-ss measurement.
>
> dataset$S <- c(60, 73, 93, 10, 90, 80, 83, 37, 83, 70, 77, 7, 100, 70, 100,
> 43, 43, 83, 40, 73, 36, 53, 66, 0, 73, 43, 20, 10, 26, 40, 60, 3, 53, 26,
> 63, 6, 3, 30, 7, 10, 53, 77, 2, 38, 68, 92, 3, 15, 67, 53, 58, 20, 17, 40,
> 85, 60, 25, 3, 82, 67, 62, 0, 57, 42, 3, 55, 22, 28, 45, 47, 52, 75, 38,
> 45, 65, 50, 2, 0, 10, 60)
>
> dataset <- as.data.frame(dataset)
> ###
>
> Erlebacher's analysis on these data can be computed using code developed by
> Merritt, Cook, and Wang (2014):
>  https://www.researchgate.net/publication/264158186_Erlebacher's_Method_for_Contrasting_the_Within_and_Between-Subjects_Manipulation_of_the_Independent_Variable_using_R_and_SPSS
>
> The output of an Erlebacher's ANOVA for these data is:
> Effect of A: F(1, 51) = 21.25
> Effect of D: F(1, 42) = 0.89
> Effect of A x D: F(1, 51) = 7.88
> (df obtained via Satterthwaite's (1946) Method)
>
> Some have suggested a multilevel model with the IV and the design as fixed
> effects; subject as a random effect, instead of the Erlebacher's ANOVA. For
> example, this Stack Exchange discussion:
> https://stats.stackexchange.com/questions/414995/statistically-testing-the-impact-of-a-within-subject-vs-between-subject-design
>
> While the following gives similar results, I am unable to determine if this
> is the correct approach:
>
> ###
> dataset$A <- as.factor(dataset$A)
> dataset$D <- as.factor(dataset$D)
> dataset$subject <- c(rep(1:20, times = 2), 21:60)
>
> library(lme4)
> library(lmerTest)
> anova(lmer(S ~ A + D + A*D + (1|subject),
>            dataset,
>            contrasts = list(A = "contr.sum", D = "contr.sum")))
> ###
> Which outputs:
> -------
> Type III Analysis of Variance Table with Satterthwaite's method
>      Sum Sq Mean Sq NumDF  DenDF F value     Pr(>F)
> A   3097.70 3097.70     1 75.004 21.7904 0.00001304 ***
> D    123.74  123.74     1 53.030  0.8704   0.355067
> A:D 1148.50 1148.50     1 75.004  8.0790   0.005765 **
> -------
>
> As yet, I have been unable to work out a theoretical manipulation of
> Erlebacher's model that yields a multilevel model like the one above, which
> adds to my confusion about whether an MLM approach can be used for this type
> of data.
>
> Thank you in advance for any advice.
>


