[R-sig-ME] I would need some advice about repeated anova 2 ways and how to code it within lmer function

Fri Oct 14 06:23:28 CEST 2022

Hi Claire,

Unfortunately some of your message got garbled in the conversion to
plain text, so it's a little bit hard to read. But if I understand
correctly, you're considering this model:

hearing ~ TimePoint * Frequency + (1|Ear/TimePoint) +(1|Ear/Frequency)

with sum contrasts. (I've changed the variable names to make the
essential parts of the design more explicit.)

There were 3 levels of TimePoint and 5 levels of Frequency and 18
different Ears measured. All Ears were measured at all TimePoints and
Frequencies. So in total you have 3 * 5 * 18 = 270 observations and 18
levels of the grouping variable. Both numbers are important for mixed
models.
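As a quick sanity check on those numbers, here is a minimal sketch in R that rebuilds the fully crossed design. The data frame name `design` is hypothetical, and the level labels are taken from the original message quoted below; this is an illustration, not the actual data.

```r
# Hypothetical reconstruction of the fully crossed design (no real measurements)
design <- expand.grid(
  Ear       = factor(seq_len(18)),
  TimePoint = factor(c("T1", "T21", "T28")),
  Frequency = factor(c(4000, 8000, 16000, 25000, 32000))
)

n_obs    <- nrow(design)        # total number of observations
n_groups <- nlevels(design$Ear) # number of levels of the grouping variable

# TRUE if every Ear x TimePoint x Frequency cell occurs exactly once
balanced <- all(table(design$Ear, design$TimePoint, design$Frequency) == 1)

print(c(observations = n_obs, groups = n_groups))
```

Running the same `table()` check on the real data is a cheap way to spot missing or duplicated cells before fitting anything.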

Is this correct?

First: I'm glad you specified your contrasts! It's a common mistake to
forget this (see e.g. Laurel Brehm and my paper on this:
https://doi.org/10.1016/j.jml.2022.104334 or postprint
https://osf.io/9648f) and if you want comparability to traditional
repeated measures ANOVA, then sum contrasts are the way to go.
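To make the role of the contrasts concrete, here is a minimal sketch with made-up cell means (the values 10, 20 and 60 are purely illustrative). It uses only base R's `lm()` and shows how the intercept changes meaning between R's default treatment coding and sum coding.

```r
# Toy data: one made-up mean per time point, for illustration only
d <- data.frame(
  TimePoint = factor(c("T1", "T21", "T28")),
  y         = c(10, 20, 60)
)

# Default-style treatment coding: intercept is the mean of the first level
fit_trt <- lm(y ~ TimePoint, data = d,
              contrasts = list(TimePoint = "contr.treatment"))

# Sum coding: intercept is the (unweighted) grand mean of the level means
fit_sum <- lm(y ~ TimePoint, data = d,
              contrasts = list(TimePoint = "contr.sum"))

intercept_trt <- unname(coef(fit_trt)[1])
intercept_sum <- unname(coef(fit_sum)[1])

print(contr.sum(3))  # each column sums to zero
print(c(treatment = intercept_trt, sum = intercept_sum))
```

With an interaction in the model, this difference carries over to the lower-order terms, which is why the coding matters for ANOVA-style main effects.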

Next up: I don't think your random effects specification is correct. A
good heuristic is that the same variable should never appear in the
fixed effects and on the right-hand side of the | in the random effects.
There are exceptions to this rule, but generally speaking they're more
for advanced applications.

In your particular case, your random effects are equivalent to:

(1|Ear) + (1|Ear:TimePoint) +(1|Ear) + (1|Ear:Frequency)

So right away, we see that there's a repeated term, (1|Ear). I believe
lme4 will drop the duplicate automatically; if it doesn't, the variance
of the redundant term will be estimated as zero. In other words, we have

(1|Ear) + (1|Ear:TimePoint) + (1|Ear:Frequency)
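The a/b expansion used above can be checked with base R's formula machinery, which follows the same rule (a/b becomes a + a:b). This is a minimal sketch on a hypothetical reconstruction of the design; it also counts how many levels each resulting grouping factor would have.

```r
# The nesting operator expands to a main term plus an interaction term
expanded <- attr(terms(~ Ear / TimePoint), "term.labels")
print(expanded)

# Hypothetical reconstruction of the design (no real measurements)
design <- expand.grid(
  Ear       = factor(seq_len(18)),
  TimePoint = factor(c("T1", "T21", "T28")),
  Frequency = factor(c(4000, 8000, 16000, 25000, 32000))
)

# Number of levels of each grouping factor in the expanded random effects
n_ear      <- nlevels(design$Ear)
n_ear_time <- nlevels(interaction(design$Ear, design$TimePoint, drop = TRUE))
n_ear_freq <- nlevels(interaction(design$Ear, design$Frequency, drop = TRUE))

print(c(Ear = n_ear, "Ear:TimePoint" = n_ear_time, "Ear:Frequency" = n_ear_freq))
```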

which turns out to be a special case of

(1 + TimePoint + Frequency | Ear)

when the covariance matrix is restricted to be compound symmetric.
Compound symmetry is related to the sphericity assumption in classical
rmANOVA, but it is an additional assumption and restriction. I don't
think it's necessarily misguided in your case though because the
unrestricted random effects structure (1 + TimePoint + Frequency | Ear)
has 28 parameters (1 intercept, 2 slopes for TimePoint, 4 slopes for
Frequency, 21 correlations). But since the random effects are
ultimately estimating the covariance between groups, you would be trying
to estimate 28 parameters from 18 groups (ears)!
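The parameter count is simple arithmetic and can be reproduced as follows. This is a minimal sketch; the variable names are just for illustration.

```r
n_time <- 3  # levels of TimePoint
n_freq <- 5  # levels of Frequency

# Random-effect terms per Ear: intercept + (levels - 1) slopes per factor
k <- 1 + (n_time - 1) + (n_freq - 1)

n_variances    <- k             # one variance per term
n_correlations <- choose(k, 2)  # one correlation per pair of terms
n_total        <- n_variances + n_correlations

print(c(terms = k, correlations = n_correlations, total = n_total))
```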

Since you're mostly interested in the effect of TimePoint and don't care
about Frequency, I would probably also consider the model

hearing ~ TimePoint * Frequency + (1 + TimePoint | Ear)

This leaves out Frequency from the random effects, which simplifies
things a lot. The random intercept for a particular grouping variable
(i.e. 1|Ear in your case) tends to have the most influence on the
estimates of the fixed effects (there is a deep relationship to
Simpson's Paradox). The biggest changes from adding random slopes tend
to be in the standard errors (and hence t-statistics) for the fixed
effects of the same variable. So dropping Frequency from the random
effects might lead to anti-conservative standard errors for Frequency
and the TimePoint:Frequency interaction, which can inflate your Type-I error
rate. But if you're not interested in Frequency, I think that's okay
(see also https://doi.org/10.1016/j.jml.2017.01.001). By keeping the
random intercept and random slopes for TimePoint together, we allow for
them to be correlated. In other words, this would allow for the change
over time to be correlated with the starting point for each Ear, which I
suspect is relevant for your research and could help capture things like
ceiling effects.
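For reference, a minimal sketch of how this reduced model could be fit. The data here are simulated and purely illustrative (a random intercept per ear plus noise, no real effects), the data frame name `dat` is hypothetical, and the fit is only attempted if lme4 is installed. Because the simulated data contain no true random slopes, a singular-fit message would not be surprising.

```r
set.seed(1)

# Hypothetical design with simulated thresholds; not real data
dat <- expand.grid(
  Ear       = factor(seq_len(18)),
  TimePoint = factor(c("T1", "T21", "T28")),
  Frequency = factor(c(4000, 8000, 16000, 25000, 32000))
)
ear_shift   <- rnorm(nlevels(dat$Ear), sd = 5)
dat$hearing <- 40 + ear_shift[as.integer(dat$Ear)] + rnorm(nrow(dat), sd = 3)

# (1 + TimePoint | Ear): 1 intercept + 2 slopes, plus their correlations
k        <- 1 + (nlevels(dat$TimePoint) - 1)
n_re_par <- k * (k + 1) / 2

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(
    hearing ~ TimePoint * Frequency + (1 + TimePoint | Ear),
    data      = dat,
    contrasts = list(TimePoint = "contr.sum", Frequency = "contr.sum")
  )
  print(lme4::VarCorr(fit))  # variances and correlations of the random effects
}
```

The random-effects covariance matrix here has 6 parameters instead of 28, which is a much more reasonable ask for 18 ears.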

Typically, I would also do a bit more exploration of the model using
things like rePCA (see https://arxiv.org/abs/1506.04967) and plotting
the data. I encourage you to explore these tools as well.

Finally, given that you're apparently looking at change over time, I
might consider using Helmert or sequential difference contrasts for
TimePoint so that you have comparisons of each step of improvement
instead of just comparisons to the baseline.
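To illustrate what these codings estimate, here is a minimal sketch with made-up cell means (10, 20, 60; purely illustrative). `contr.helmert()` is in base R; `contr.sdif()` lives in the MASS package, so that part only runs if MASS is available.

```r
# Toy data: one made-up mean per time point, for illustration only
d <- data.frame(
  TimePoint = factor(c("T1", "T21", "T28")),
  y         = c(10, 20, 60)
)

# R's Helmert coding: each level is compared with the mean of the levels
# before it (coefficients are scaled versions of those differences)
print(contr.helmert(3))
fit_helmert  <- lm(y ~ TimePoint, data = d,
                   contrasts = list(TimePoint = "contr.helmert"))
coef_helmert <- unname(coef(fit_helmert))

# Sequential differences: coefficients are T21 - T1 and T28 - T21 directly
coef_sdif <- NULL
if (requireNamespace("MASS", quietly = TRUE)) {
  fit_sdif  <- lm(y ~ TimePoint, data = d,
                  contrasts = list(TimePoint = MASS::contr.sdif(3)))
  coef_sdif <- unname(coef(fit_sdif))
}

print(coef_helmert)
print(coef_sdif)
```

In both cases the intercept stays at the grand mean, so the interpretation of the other main effects is not disturbed.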

As Jeff noted, that isn't a huge amount of data, but I suspect it's
enough to fit a model and take a look. Statistical power is also a
function of effect size, so even if you can fit a model, I don't know if
you'll have sufficient power to detect any effect.

In contrast to Jeff, I think that the small number of observed
timepoints and frequencies is actually an advantage here because you're
treating them as discrete, categorical entities. Moreover, I know that
frequency response in mammalian hearing is a non-linear function and I
suspect change over time is as well, so I think treating these as
categorical is much easier than trying some type of nonlinear
estimation. If you had more timepoints and frequencies, then the number of
associated contrasts and thus model complexity would explode.

If you're only interested in the question "does hearing change from T1 to
T28?" then I might even exclude the "T21" data to further simplify the
picture.
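Dropping the middle time point is a one-liner, but it is easy to forget to drop the now-empty factor level. A minimal sketch on a hypothetical reconstruction of the design (the names `design` and `two_points` are just for illustration):

```r
# Hypothetical reconstruction of the design (no real measurements)
design <- expand.grid(
  Ear       = factor(seq_len(18)),
  TimePoint = factor(c("T1", "T21", "T28")),
  Frequency = factor(c(4000, 8000, 16000, 25000, 32000))
)

# Keep only T1 and T28, then drop the unused "T21" level so that the
# contrasts are built for a two-level factor
two_points <- droplevels(subset(design, TimePoint != "T21"))

print(nrow(two_points))
print(levels(two_points$TimePoint))
```

With only two time points, (1 + TimePoint | Ear) has a single slope, so its covariance matrix has just 3 parameters.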

Hope that helps,
Phillip

On 11/10/22 1:27 am, claire.dv04 using gmail.com wrote:
> Dear all
> 
> I would appreciate some advice about how to analyze my data.
> 
> These data come from an experiment during which the hearing of 18 ears was
> measured over time and according to different frequencies.
> 
> Each ear was measured at T1, T21, and T 28, and at each of its times, each
> ear was measured at frequencies 4000Hz, 8000Hz, 16000Hz, 25000Hz, and
> 32000Hz.
> 
> I especially want to know if there is a time effect. The frequency effect
> does not particularly interest me.
> 
> I was thinking of using a 2-factor anova for repeated data (with 2 factors
> within: time (=Point) and frequency), but I'm not sure.
> 
> I work with R and I thought to use the lmer function with this code:
> 
> mod.lmer <- lmer(hearing ~ Point * Hz Frequency +(1|id/Point) +(1|id/Hz
> Frequency),
> contrasts=list(Point=contr.sum, Frequency Hz=contr.sum),
> data=mydata)
> 
> id is the ear identification factor
> 
> I'm not sure about the coding of random effects (1|id/Point)
> +(1|id/Frequency Hz), but they give me the same results as the aov_ez
> function of the afex package:
> 
> mod.ez <- aov_ez(id="id",
> dv="Threshold dB",
> data=mydata,
> within = c("Point", "Frequency Hz"))
> 
> What do you think ?
> 
> Does this analysis seem correct to you? If not, what can you suggest me?
> 
> Thanks in advance for any help you can give me.
> 
> All the best
> 
> Claire Della Vedova