[R-sig-ME] A question on setting up a generalized additive mixed effect model

Phillip Alday phillip.alday at mpi.nl
Wed Aug 9 16:56:52 CEST 2017

Hi Leon,

I agree that it makes sense to have both a by-subject intercept and
slope, but you will need lots of data to estimate that in a GAMM because
the smoothers "eat" a lot of data. Based on your model structure and
error message, I'm guessing that you have around 20 subjects or fewer.
Both for computational and inferential purposes, you need closer to 40
or preferably 100+ subjects for this study. See e.g. Button et al. 2013
(Nature Neuroscience) or Button's more recent paper in eNeuro. I know
that's a lot for scanner data, especially of infants. :(
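
To make the "coefficients more than the data" error concrete, here is a
rough coefficient count in R. It is only a sketch: it assumes mgcv's
default basis dimension (k = 10) for s(age) and the 30 subjects from your
message, and the variable names are purely illustrative.

```r
# Rough coefficient count for
#   brainVolume ~ s(age) + s(subjIndexF, bs="re") + s(subjIndexF, age, bs="re")
# Assumptions: 30 subjects, default k = 10 for s(age).
n_subj <- 30
k_age  <- 10

p_fixed      <- 1          # overall intercept
p_smooth     <- k_age - 1  # s(age) loses one coefficient to the centering constraint
p_rand_int   <- n_subj     # one random intercept per subject
p_rand_slope <- n_subj     # one random age slope per subject

p_total <- p_fixed + p_smooth + p_rand_int + p_rand_slope
p_total                    # 70

# With one to three scans per subject, the number of rows lies in this range
n_obs_range <- c(min = n_subj * 1, max = n_subj * 3)
n_obs_range

# Average scans per subject needed just to match the coefficient count
scans_needed <- p_total / n_subj
scans_needed
```

Under these assumptions the model has 70 coefficients, so it needs at
least 70 scans (more than 2.3 per subject on average) before the fit is
even attempted, and far more than that for stable estimates.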

Two other quick notes:

1. Don't name your fitted model "gam", as this then shadows the function
"gam()" and can lead to a world of pain full of subtle bugs and weird
error messages.

2. You can also fit your model with a more mixed-model-like syntax (e.g.
(1|subjIndexF)) via the package gamm4. It turns out that you can express
random effects as smoothers (the mgcv approach) or smoothers as random
effects (the gamm4 approach). Depending on your exact model structure,
one or the other may be faster. See


for more info.
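
A minimal sketch of both notes, run on simulated stand-in data rather than
your real scans. The data-generating numbers (40 subjects, three scans
each, the effect sizes) are made up so that the model has more rows than
coefficients; the object names `fit` and `fit4` are just suggestions, and
the gamm4 part only runs if that package is installed.

```r
library(mgcv)

set.seed(1)

# Hypothetical stand-in data: 40 subjects, 3 scans each (120 rows)
n_subj <- 40
mydata <- data.frame(
  subjIndexF = factor(rep(seq_len(n_subj), each = 3)),
  age        = runif(n_subj * 3, min = 0, max = 1)
)
subj_id <- as.integer(mydata$subjIndexF)
b0 <- rnorm(n_subj, sd = 20)  # simulated by-subject intercept deviations
b1 <- rnorm(n_subj, sd = 10)  # simulated by-subject slope deviations
mydata$brainVolume <- 400 + 300 * sqrt(mydata$age) +
  b0[subj_id] + b1[subj_id] * mydata$age +
  rnorm(nrow(mydata), sd = 10)

# mgcv approach: random effects written as smooths.
# The result is stored in `fit`, not `gam`, so gam() is not shadowed.
fit <- gam(brainVolume ~ s(age) +
             s(subjIndexF, bs = "re") +       # random intercepts
             s(subjIndexF, age, bs = "re"),   # random age slopes
           method = "REML", data = mydata)

# gamm4 approach: the same random effects in lme4-style syntax
if (requireNamespace("gamm4", quietly = TRUE)) {
  fit4 <- gamm4::gamm4(brainVolume ~ s(age),
                       random = ~ (1 | subjIndexF) + (0 + age | subjIndexF),
                       data = mydata)
}
```

The two "re" smooths in mgcv are independent of each other, which is why
the gamm4 call uses separate (1 | subjIndexF) and (0 + age | subjIndexF)
terms instead of a correlated (1 + age | subjIndexF).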


On 04/04/2017 12:18 AM, Leon Lee wrote:
> Dear R experts
> I am new to R & generalized additive models and wonder whether I could get
> some help from you all. The question I have is as follows:
> I have 30 subjects with each subject being scanned one to three times in
> the first year of life.
> The brain volume (brainVolume) from each scan was measured.
> The scan time was randomly distributed from birth to 1 year, indexed by
> subjIndexF. i.e., first three scans are from the same subject, the fourth
> is from the second, subjIndexF=1,1,1,2...
> Each subject has chronological age (age) from birth to 1 year old.
> Now, I want to look at how predictors, such as subject's age will explain
> the changes in brain volume. I also want to model both random slope and
> intercept for random effects within each subject in the model. My model
> ends up like this:
> gam=gam(brainVolume~ s(age) + s(subjIndexF, bs="re") +  s(subjIndexF, age,
> bs="re"), method="REML", data=mydata)
> In which, s(subjIndexF, bs="re") is for modeling random intercepts and
> s(subjIndexF, age, bs="re") is for modeling different slopes. When I tried
> to run the model, I was given a "coefficients more than the data" error. So
> my questions are as follows:
> (1) Does this model make sense, especially the part dealing with the
> repeated measures within subjects as random effects?
> (2) If it does, what can I do to reduce the required parameters? The model
> runs if I only model random intercepts without interaction term, but a more
> realistic scenario would be each subject has random slope for smooths as
> well.
> Your help will be greatly appreciated!
> I set up the model following the suggestions in these
> two links:
> http://www.sfs.uni-tuebingen.de/~jvanrij/Tutorial/GAMM.html
> http://r.789695.n4.nabble.com/Random-effects-in-package-mgcv-td4720162.html
