# [R] [Rd] Formulas in gam function of mgcv package

Wed Aug 26 14:53:53 CEST 2009

```Dear Simon,

thanks again.

Concerning the whole 36 variables .... well, I have run a principal components
analysis, and I am only using part of them (I am running one test with the PCs
that cover 95% of the variance and then another with 99%). :) .... so I will
possibly end up with s(x1,....,x8). I wonder if using isotropic smoothers on
principal components is a good idea .... the variance diminishes from component
to component, so theoretically the wiggliness of the smoother should also
diminish .... what do you think? Am I saying something stupid?

If that is the case, and if I want to include some interactions, then I have to
include the interaction terms manually .... like s(x1,x2). Is that right?

Sorry for the avalanche of questions, but I am trying to understand the
principles underlying the working of gam in mgcv. It looks very powerful,
particularly for exploring dependencies.

I have run te() instead of s(), but the predictive power seems to be lower than
with s() in this particular situation. At the same time, does te() include the
interactions? I did not fully understand your previous point on the interaction
terms in te(): is te(x1,....,xn) built as an expansion from the t(x1), .... ,t(xn)?
Then all the interaction terms should be included ....

Finally, is it possible to incorporate both s() and te() terms in the formula?

Machine learning: I am not too well versed in the area. Did you mean
regression trees or maximum entropy models?

Best,

On Wednesday 26 August 2009 10:27:08 Simon Wood wrote:
> This will not work...
>
> > 2) y~s(x1, .... ,x36)
>
> Estimating a 36 dimensional function reasonably well would require a
> tremendous quantity of data, but in any case the 36 dimensional TPS
> smoothness measure will involve such high order derivatives that it will no
> longer be practically useful: in fact you will not have enough data to
> estimate the unpenalized coefficients of the smoother (and if you did R
> would run out of memory first).
>
> In such a high dimensional situation, I think that GAMs are really only
> useful if you have some prior knowledge of which variables are likely to
> interact (and it's not too many of them). If there's no prior information
> saying roughly what sort of smooth additive structure might be useful then,
> I'm not sure that GAMs are the right way to go, and some sort of machine
> learning approach might be better.
>
> Then again, the real problem with
> y~s(x1, .... ,x36)
> is that the data just won't contain enough information to estimate s, if
> all you can say is that s is smooth, but this also means that it's very
> unlikely that you really need to estimate s(x1, .... ,x36) in order to
> predict well. In that case, starting from
> y ~ s(x1) + .... + s(x36)
> and building the model up might result in something that does a reasonable
> predictive job.
>
> On the subject of tensor product smoothing vs isotropic smoothing.
> Isotropic smooths are really only reasonable if you think that the smooth
> should display approximately the same amount of wiggliness in all
> directions. If this is not the case then tensor product smoothing is a
> better bet. Centering and scaling alone is not enough to ensure that
> isotropy is reasonable (although in particular cases it may help, of
> course).
>
> best,
> Simon
>
> > I am trying to build a predictive model. Since the variables are
> > centred and scaled, I think I need an isotropic smooth. I am also
> > interested in having the interactions between the variables included,
> > that is not a purely additive model.
> >
> > It is not clear to me when I should give preference to tensor smooths,
> > possibly because I have not understood well how they work.
> >
> > I am reading Wood (2003) as recommended and I have also read rather
> > extensively Simon N. Wood. Generalized Additive Models: An Introduction,
> > 2006, but still I am stuck. Any additional suggestion or reading
> > recommendation would be greatly appreciated.
> >
> > I have also some difficulties in understanding the values you have chosen
> > for k in the first example (why 60?).
> >
> > Thanks
> >
> > Best,
> >
> > On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
> > > [Note R-Devel is the wrong list for such questions. R-Help is where
> > > this should have been directed - redirected there now]
> > >
> > > On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
> > > > Dear R-experts,
> > > >
> > > > I have a question on the formulas used in the gam function of the
> > > > mgcv package.
> > > >
> > > > I am trying to understand the relationships between:
> > > >
> > > > y~s(x1)+s(x2)+s(x3)+s(x4)
> > > >
> > > > and
> > > >
> > > > y~s(x1,x2,x3,x4)
> > > >
> > > > Does the latter contain the former? what about the smoothers of all
> > > > interaction terms?
> > >
> > > I'm not 100% certain how this scales to smooths of more than 2
> > > variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book Generalized
> > > Additive Models: An Introduction with R (2006, Chapman & Hall/CRC) discuss
> > > this for smooths of 2 variables.
> > >
> > > Strictly y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2) as the bases
> > > used to produce the smoothers in the two models may not be the same in
> > > both models. One option to ensure nestedness is to fit the more
> > > complicated model as something like this:
> > >
> > > ## if simpler model were: y ~ s(x1, k=20) + s(x2, k = 20)
> > > y ~ s(x1, k=20) + s(x2, k = 20) + s(x1, x2, k = 60)
> > >                                   ^^^^^^^^^^^^^^^^^
> > > where the last term (^^^ above) has the same k as used in s(x1, x2)
> > >
> > > Note that these are isotropic smooths; are x1 and x2 measured in the
> > > same units etc.? Tensor product smooths may be more appropriate if not,
> > > and if we specify the bases when fitting models s(x1) + s(x2) *is*
> > > strictly nested in te(x1, x2), eg.
> > >
> > > y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)
> > >
> > > is strictly nested within
> > >
> > > y ~ te(x1, x2, k = 10)
> > > ## is the same as y ~ te(x1, x2, bs = "cr", k = 10)
> > >
> > > [Note that bs = "cr" is the default basis in te() smooths, hence we
> > > don't need to specify it, and k = 10 refers to each individual smooth
> > > in the te().]
> > >
> > > HTH
> > >
> > > G
> > >
> > > > I have (tried to) read the manual pages of gam, formula.gam,
> > > > smooth.terms, linear.functional.terms but could not understand
> > > > properly.
> > > >
> > > > Regards

--