[R] [Rd] Formulas in gam function of mgcv package

Simon Wood s.wood at bath.ac.uk
Wed Aug 26 10:56:13 CEST 2009

```> > I am trying to understand the relationships between:
> >
> > y~s(x1)+s(x2)+s(x3)+s(x4)
> >
> > and
> >
> > y~s(x1,x2,x3,x4)
> >
> > Does the latter contain the former? what about the smoothers of all
> > interaction terms?
The first says that you want a model
E(y) = f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) (1)
where the f_j are smooth functions. The additive decomposition is quite a
strong assumption, since it assumes that the effect of x_j is not dependent
on x_k unless j=k. The second model is just
E(y) = f(x_1,x_2,x_3,x_4)                                         (2)
where f is a smooth function. This looks very general, but actually `s' terms
assume isotropic smoothness, which is also quite a strong assumption.

Now if I simply state that f and the f_j are `smooth functions', and leave it
at that, then (2) would of course contain (1), but to actually estimate the
models I need to state, mathematically, what I mean by `smooth'. Once I've
done that I've pretty much determined the function spaces in which f and the
f_j will lie, and in general (2) will no longer strictly contain (1). mgcv's
`s' terms use a thin plate spline measure of smoothness for multivariate
smooths, and this means that (1) will not be strictly nested within (2),
since e.g. a 4D thin plate spline cannot generally represent exactly what
the sum of 4 1D splines can represent.

If you want to achieve exact nesting then using tensor product smooths with
something like

y~te(x1)+te(x2)+te(x3)+te(x4)     (3)

y~te(x1,x2,x3,x4)                 (4)

will do the trick (because the function space for (4) is built up from the
function spaces used in (3)).

As to where all the 2 and 3 way interactions have gone in (4)... it's just
like ANOVA - if you put in a 4 way interaction then the lower order
interactions are not identifiable, unless you choose to add constraints to
make them so. `mgcv' will allow you to add main effects and interactions, and
will handle the constraints automatically, but if this sort of functional
ANOVA is a major component of what you want to do, then it is probably worth
checking out the gss package and Chong Gu's book on smoothing spline ANOVA.

best,
Simon

--
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283

```
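The four model formulas discussed in the message above can be tried out with a sketch like the following. Everything beyond the formulas themselves is an assumption made for illustration: the simulated data frame `dat`, the object names `m1` to `m4`, and the basis dimensions `k`, which are kept small so that the 4D tensor product stays cheap to fit. The final model uses `ti()`, which was added to mgcv after this message was written; it is shown only as one possible way of setting up the main-effects-plus-interaction decomposition mentioned in the last paragraph.

```r
library(mgcv)

set.seed(1)
n <- 500

# Hypothetical simulated data: four covariates and a purely additive truth.
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + dat$x2^2 + 0.5 * dat$x3 + rnorm(n, sd = 0.3)

# (1) additive model: a sum of four 1D smooths
m1 <- gam(y ~ s(x1) + s(x2) + s(x3) + s(x4), data = dat)

# (2) one isotropic 4D thin plate regression spline
#     (k = 50 is an arbitrary choice to keep the fit quick)
m2 <- gam(y ~ s(x1, x2, x3, x4, k = 50), data = dat)

# (3) additive model built from 1D tensor product terms
m3 <- gam(y ~ te(x1, k = 4) + te(x2, k = 4) + te(x3, k = 4) + te(x4, k = 4),
          data = dat)

# (4) full 4D tensor product using the same marginal bases as (3),
#     so the function space of (3) sits inside that of (4)
m4 <- gam(y ~ te(x1, x2, x3, x4, k = 4), data = dat)

# Compare the fits; the number of coefficients shows how much larger (4) is.
print(AIC(m1, m2, m3, m4))
print(sapply(list(m1 = m1, m2 = m2, m3 = m3, m4 = m4),
             function(m) length(coef(m))))

# Functional-ANOVA style decomposition for two covariates:
# main effects plus an interaction with the main effects excluded.
m5 <- gam(y ~ ti(x1) + ti(x2) + ti(x1, x2), data = dat)
print(summary(m5))
```

Because the smoothing penalties differ between the models, matching function spaces in (3) and (4) does not by itself make the penalized fits directly comparable, so the AIC table should be read as a rough guide only.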