[R-sig-ME] Starting point for modeling a within-subject design

Wed Oct 16 20:56:27 CEST 2024

On 10/7/24 22:19, Simon Harmel wrote:
> Dear Mixed-Effects Experts,
> 
> Suppose a causal mechanism where a normally distributed outcome is impacted
> by condition (a binary factor variable), and a mediator (a
> continuous variable) that sits between the condition and outcome.
> 
> Here, all subjects get to taste both conditions by counterbalancing. That
> is, based on chance, some will first get one condition at one point, and
> later, they will get another condition.
> 
> In a sense, this is a within-subject design with a data structure like:
> subject  condition  outcome  mediator
>       1  complex       25.1         6
>       1  simple        11.0         4
>       2  complex       24.3         7
>       2  simple        12.2         3
> QUESTIONS:
> 1) Can we think of this model as a multivariate model where the outcome and
> mediator are indeed 2 correlated DVs that are impacted by condition?

   I don't know.  Some questions to consider: (1) is the mediator 
considered to be measured/observed with or without error? (2) are you 
interested in making inferences about the true value of the mediator? My 
guess would be that at least in the linear/Gaussian case, the estimates 
of the direct and combined effects wouldn't be biased by treating the 
observed value of the mediator as the true value (although I guess the 
uncertainty would be underestimated).

> 
> 2) Given that the same participants get both levels of condition, should
> levels of condition in each subject be correlated at the latent level as in
> (condition | subject) and/or possibly at the residual level as in nlme::
> corClasses?
> 

    The levels of condition are categorical (complex/simple), right? And 
they're not random variables ... Or do you mean the **effects** of the 
level of each condition?

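   You could in principle fit (condition|subject), but you'll have an 
identifiability problem if you only have two observations per subject: 
with one observation per subject per condition, the among-subject 
variance in the condition effect is confounded with the residual 
variance [as in the 'starling' example here: 
https://bbolker.github.io/morelia_2018/notes/mixedlab.html].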
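
   To make the identifiability problem concrete, here is a minimal 
base-R sketch (not from the original exchange). It computes the 2x2 
marginal covariance matrix of one subject's (simple, complex) outcomes 
implied by a random intercept and random condition slope plus residual 
noise. The helper name implied_cov and all parameter values are 
hypothetical, chosen only for illustration. Two different parameter 
sets give the same covariance matrix, so data with one observation per 
subject per condition cannot tell them apart.

```r
# Marginal covariance of one subject's (simple, complex) outcomes implied by
# outcome ~ condition + (condition | subject), one observation per condition.
# implied_cov() is a hypothetical helper written for this illustration.
implied_cov <- function(tau0, tau1, rho, sigma) {
  # Random-effects covariance: intercept SD tau0, slope SD tau1, correlation rho
  G <- matrix(c(tau0^2,            rho * tau0 * tau1,
                rho * tau0 * tau1, tau1^2), nrow = 2)
  # Random-effects design: intercept column, dummy for condition == "complex"
  Z <- rbind(simple = c(1, 0), complex = c(1, 1))
  # Add independent residual variance sigma^2 to each observation
  Z %*% G %*% t(Z) + sigma^2 * diag(2)
}

# Two distinct (arbitrary) parameter sets
V_a <- implied_cov(tau0 = 2,         tau1 = 1,       rho = 0,    sigma = 1)
V_b <- implied_cov(tau0 = sqrt(4.5), tau1 = sqrt(2), rho = -1/6, sigma = sqrt(0.5))

print(V_a)
print(V_b)
print(all.equal(V_a, V_b))
```

   The model has four variance parameters (two standard deviations, one 
correlation, one residual standard deviation), but a 2x2 covariance 
matrix only has three distinct entries, which is why the two parameter 
sets coincide.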

   If the variability differed by condition, you could estimate that ...
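
   One possible reading of this, sketched below under stated 
assumptions: keep a random intercept per subject and let the residual 
standard deviation differ between conditions via nlme's varIdent(). 
The simulated data, sample size, and true parameter values are all 
made up for illustration; this is not necessarily the model intended 
above.

```r
library(nlme)

set.seed(1)
n  <- 1000                                  # number of subjects (arbitrary)
id <- rep(seq_len(n), each = 2)             # two rows per subject
condition <- factor(rep(c("simple", "complex"), times = n),
                    levels = c("simple", "complex"))

u0       <- rnorm(n, sd = 1)                        # subject-level intercepts
resid_sd <- ifelse(condition == "complex", 2, 1)    # SD differs by condition
outcome  <- 10 + 12 * (condition == "complex") +
  u0[id] + rnorm(2 * n, sd = resid_sd)

d <- data.frame(subject = factor(id), condition, outcome)

# Random intercept per subject; separate residual SD for each condition
fit <- lme(outcome ~ condition,
           random  = ~ 1 | subject,
           weights = varIdent(form = ~ 1 | condition),
           data    = d)

# Estimated ratio of residual SDs between the two conditions
sd_ratio <- coef(fit$modelStruct$varStruct, unconstrained = FALSE)
print(sd_ratio)
```

   With a random intercept and two residual variances there are three 
variance parameters for the three distinct entries of each subject's 
2x2 covariance matrix, so this version can be estimated from two 
observations per subject.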

> 3) Is there any frequentist software to analyze such data, and if not, does
> the following bayesian model sound like a good "starting point"?
> 
> library(brms)
> mediator_formula <- bf(mediator ~ condition + (condition | subject))
> 
> outcome_formula <- bf(outcome ~ condition + mediator + (condition | subject))
> 
> model <- brm(mediator_formula + outcome_formula,
>                 data = DATA,
>                 seed = 123,
>                 chains = 4,
>                 iter = 10000)
> 
> Thank you all for your expertise,
> Simon
> 

   Identifiability is in principle less of an issue with Bayesian 
methods (since _in theory_ the sampler should be able to integrate over 
all possible combinations of the confounded/unidentifiable parameters), 
but in practice it will still cause problems unless your priors are 
relatively informative.