[R-sig-ME] Interaction effects with GAMM

Tue Mar 5 17:35:47 CET 2019

Dear Louise,

Knowing a bit about your data helps in knowing which smoother to use.

From my basic understanding of GAMs and after looking through Wood's
"Generalized Additive Models" (2nd ed):

ti(age) : I don't think this is actually any different from s(age), and
for the examples in the book, main effects that have been separated out
are usually computed with s(). The reason I think this is that ti() is a
tensor product smooth, but there is no actual tensor product when
dealing with only one covariate, so this reduces to a 1d smoother
(apart from the defaults: ti() uses bs="cr" with k=5, s() uses bs="tp"
with k=10). With two or more covariates the choice does matter: s(x,y)
is isotropic, i.e. it forces x and y to share the same scale and
wiggliness, while ti(x,y) and te(x,y) do not.
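
One way to check the one-covariate claim empirically is sketched below.
This is only an illustration on simulated data (the variables x and y
and the sine-shaped signal are assumptions, not Louise's data), and it
uses gam() rather than gamm() to keep things short. s() is given the
same basis and basis dimension that ti() uses by default, so that the
two fits are compared like for like.

```r
library(mgcv)

# Simulated data: one covariate, smooth signal plus Gaussian noise.
set.seed(42)
n <- 300
d <- data.frame(x = runif(n))
d$y <- sin(2 * pi * d$x) + rnorm(n, sd = 0.3)

# ti() defaults to cubic regression spline margins (bs = "cr") with
# k = 5, so s() is asked for the same basis and dimension.
m_s  <- gam(y ~ s(x, bs = "cr", k = 5), data = d, method = "REML")
m_ti <- gam(y ~ ti(x, k = 5),           data = d, method = "REML")

# Compare the fitted curves of the two models.
max_diff <- max(abs(fitted(m_s) - fitted(m_ti)))
fit_cor  <- cor(fitted(m_s), fitted(m_ti))
print(c(max_diff = max_diff, cor = fit_cor))
```

If the two terms really describe the same 1d smoother, the fitted
values should agree very closely; any remaining gap would come from
how the two terms are set up internally.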

ti(age) + ti(age, by=sex) and brain + ti(brain, by=age): this seems to
fit the advice on pp. 326-327 that a main effect be included for
smooth-factor interactions, but as above, I think you can simplify all
of your ti() to s() without any real change. ti() and te() are
definitely equivalent here because there are no lower-order interactions
or main effects to exclude from a single predictor. ti() and te() only
really differ when there are two or more covariates not specified via
the by= argument, because the predictor in by= isn't a smoother
covariate: a factor by= (such as sex) produces one smoother per level,
and a numeric by= (such as age) multiplies the smoother by that variable
(see Chapter 5, and later
p. 334 "... isotropic smooths, of the sort produced by s(X,Y) terms, are
usually good choices when the covariates of the smooth are naturally on
the same scale, and we expect that the same degree of smoothness is
appropriate with respect to both covariate axes...").

I'm also not sure that you need a smoother on the main effect of age,
but that's a question to be determined via model comparison.

The combination of your within and between subject age scale should
allow estimating both fixed and random effects of age.

So I would start with:

M = gamm(behav ~ age + sex + education + s(age, by = sex) + brain +
s(brain, by = age), random = list(subjectID = ~1+age), data = data)

and then see (via AIC) if adding a marginal smoother for age helps:

M2 = gamm(behav ~ s(age, id = "age") + sex + education +
s(age, by = sex, id = "age") + brain + s(brain, by = age),
random = list(subjectID = ~1+age), data = data)

and perhaps repeating the same thing for brain. Note that the id=...
argument forces the age-related smoothers to share the same smoothing
parameter, i.e. the same amount of wiggliness.
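
The effect of id= can be seen in a small sketch on simulated data. The
data frame d, its columns and the simulated sex-specific curves are all
assumptions made up for illustration, and gam() is used instead of
gamm() for brevity. Without id=, each by-level smoother gets its own
smoothing parameter; with a shared id= they are tied together.

```r
library(mgcv)

# Simulated data: two sexes with differently shaped age trends.
set.seed(1)
n <- 400
d <- data.frame(
  age = runif(n, 7, 19),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
d$behav <- ifelse(d$sex == "F", sin(d$age / 2), 0.5 * cos(d$age / 2)) +
  rnorm(n, sd = 0.3)

# Separate smoothing parameter for each level of sex.
m_free <- gam(behav ~ sex + s(age, by = sex), data = d, method = "REML")

# Shared smoothing parameter across the levels of sex via id=.
m_linked <- gam(behav ~ sex + s(age, by = sex, id = "age"),
                data = d, method = "REML")

# Number of smoothing parameters actually estimated in each model.
n_sp_free   <- length(m_free$sp)
n_sp_linked <- length(m_linked$sp)
print(c(free = n_sp_free, linked = n_sp_linked))
```

Whether tying the smoothing parameters is sensible depends on whether
the same wiggliness is plausible for both groups, which is again
something to judge from plots and model comparison.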

Like I said though, my understanding of and experience with GAMs is not
particularly extensive, so you can also check whether there is a
difference between s() and ti() in the overall model fit. In one example
in the book, basically equivalent models differ somewhat in their fit
because of differences in the implied penalty structure for the smooths
(p. 335).
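
For the mechanics of comparing gamm() fits, here is a minimal sketch on
simulated longitudinal data. Everything in it is an assumption made up
for illustration (40 subjects, 6 scans at 6-month intervals, the shape
of the age effect, random intercepts only), and the two models are
simpler than the ones above. gamm() returns a list with an $lme and a
$gam component; the likelihood-based comparison is done on the $lme
part, and the default method = "ML" is kept because the fixed effects
differ between the models.

```r
library(mgcv)  # gamm() fits the model via nlme::lme

set.seed(2019)
n_subj <- 40
n_scan <- 6

# One row per subject: sex, baseline age, subject-specific intercept.
subj <- data.frame(
  subjectID = factor(seq_len(n_subj)),
  sex       = factor(sample(c("F", "M"), n_subj, replace = TRUE)),
  age0      = runif(n_subj, 7, 13),
  b0        = rnorm(n_subj, sd = 0.5)
)

# Expand to one row per scan, with scans 6 months apart.
d <- subj[rep(seq_len(n_subj), each = n_scan), ]
d$age <- d$age0 + rep(0.5 * (seq_len(n_scan) - 1), times = n_subj)
d$behav <- d$b0 + sin(d$age / 3) + 0.3 * (d$sex == "M") +
  rnorm(nrow(d), sd = 0.3)

# Common age smoother vs. one age smoother per sex.
m_common <- gamm(behav ~ sex + s(age),
                 random = list(subjectID = ~1), data = d)
m_by_sex <- gamm(behav ~ sex + s(age, by = sex),
                 random = list(subjectID = ~1), data = d)

# AIC is taken from the lme component of each fit.
aic_tab <- AIC(m_common$lme, m_by_sex$lme)
print(aic_tab)
```

summary(m_by_sex$gam) and plot(m_by_sex$gam, pages = 1) then show the
estimated smoothers for whichever model is preferred.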

Best,
Phillip

On 26/2/19 1:57 pm, Louise Baruël Johansen wrote:
> Dear Phillip,
> 
> Thank you for taking your time to look at my question.
> 
> Our data consists of up to 12 MRI scans per subject with interscan-intervals of 6 months, and the subjects were between the age of 7-13 years at baseline, which gives us a reasonable overlap between subjects. The brain data is extracted from regions of interest, and the behavioural data could be RT.
> 
> My question was more regarding how to incorporate the interaction effects in the most appropriate way statistically; by using te(), ti(), or in a completely different way?
> 
> All the best,
> 
> Louise
> 
>> On 22 Feb 2019, at 11.29, Phillip Alday <phillip.alday using mpi.nl> wrote:
>>
>> Hi Louise,
>>
>> I'm somewhat curious what brain imaging data you have that can be so
>> neatly summed up as a single univariate value. While you can do e.g. the
>> EEG voltage at a given timepoint in a given channel or the BOLD signal
>> in a given voxel or some overall structural score derived from DTI,
>> these are generally very poor indices of the structural and activity
>> variation within and between brains. I ask because knowing more about
>> your data helps when giving advice about a model. I'm guessing behavior
>> is something like RT or maybe d-prime/sensitivity index and *not* simple
>> accuracy, where a Gaussian model would not be appropriate.
>>
>> All that said, I do already have one comment/question ...
>>
>> Your data are longitudinal, but how much so? What's the range in age
>> within subjects vs. between subjects? If the range within subjects is
>> just a few months to a year or two and the range between subjects is
>> several years, as is common in many studies, then having a by-subject
>> slope for age doesn't really make much sense. The overall by-subjects
>> variation (the intercept, i.e. ~1) and residual variation will probably
>> dominate.
>>
>> And some general advice: use the various plotting functions (plot(),
>> vis.gam()), etc. to get an idea about what your model "thinks" the world
>> looks like and whether that matches your own expectations and
>> matches/fits the picture presented by the data.
>>
>> Best,
>> Phillip
>>
>>
>> On 20/2/19 10:01 am, Louise Baruël Johansen wrote:
>>> I have a question on how to model interaction terms including smooths in a GAMM model (using the mgcv and nlme packages in R).
>>>
>>> We have collected longitudinal behavioral and brain imaging data from ~100 subjects across ~6 time points, and I would like to model main effects of age, sex, brain as well as two-way interaction terms (and maybe three-way interaction terms), while correcting for education level and taking random effects into account.
>>>
>>> Is using the ti() setup the way to do this:
>>>
>>> M = gamm(behav ~ ti(age) + sex + education + ti(age, by = sex) + brain + ti(brain, by = age), random = list(subjectID = ~1+age), data = data)
>>>
>>>
>>> All help will be appreciated. 
>>>
>>> Thanks, Louise
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>