[R-sig-eco] bootstrapping and 2-dimensional smoothers in GAMs

Gavin Simpson gavin.simpson at ucl.ac.uk
Thu Jan 6 10:28:40 CET 2011


On Tue, 2010-12-28 at 13:05 +0100, Saskia Otto wrote:
> Dear all,
> 
> I'm currently doing some GAM analysis and would like to include 2-way  
> interactions between my continuous, explanatory variables. I know that  
> for this, I need to include 2-dimensional smoothers but I'm not sure  
> how to include them.
> For linear models I learned that I should always include the variables  
> that are in the interaction term as main terms, i.e. y = b0 + b1*x +  
> b2*z + b3*xz (usually everybody cites Underwood, 1997)
> 
> How is it with GAMs if I want to include a 2-dimensional smoother for  
> interactions? In some papers I see the same approach, i.e. y = b0 +  
> s1(x) + s2(z) + s3(x,z). But in some papers I see only models like  
> this: y = a + s(x,z), i.e. only the 2-dimensional smoother. Is there a  
> general rule with GAMs?

(Sorry for coming to this late - catching up from the Christmas break)

It depends on what you want to do with the models, and how strict you want
to be about nesting them. Technically, to do model comparison using
likelihood ratio tests or AIC, the models compared should be nested one
within the other; i.e. you can go from the more complex model to the
simpler model by setting one or more coefficients to zero in the more
complex model.

Firstly, s(x, y) produces a 2-d thin plate spline smoother of variables x
and y. This is an isotropic smooth: it assumes the same amount of
smoothing and the same basis in each of x and y. If x and y are not in
the same units, are on different scales, or have different degrees of
smoothness, a tensor product smooth, te(x, y), is the better choice.
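
As a rough illustration of the difference (a sketch on simulated data;
the variables x1, x2, y and the data frame dat are made up here and are
not from the original question), the isotropic smooth estimates a single
smoothing parameter, while the tensor product estimates one per margin:

```r
library(mgcv)

set.seed(1)
n  <- 200
x1 <- runif(n)             # covariate on a 0-1 scale
x2 <- runif(n, 0, 100)     # covariate on a very different scale
y  <- sin(2 * pi * x1) + 0.0002 * (x2 - 50)^2 + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Isotropic thin plate spline: one smoothing parameter for both directions
m_s  <- gam(y ~ s(x1, x2), data = dat)

# Tensor product smooth: a separate smoothing parameter for each margin
m_te <- gam(y ~ te(x1, x2), data = dat)

# Estimated smoothing parameters for each fit
m_s$sp
m_te$sp
```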

With the two models:

mod1 <- gam(y ~ s(x1) + s(x2))
mod2 <- gam(y ~ s(x1, x2))

(Note the intercept is implicit above.) We might want to test if mod2,
the more complex model, is an improvement over mod1. As I mentioned
above, these two models are not strictly nested. Simon Wood's GAM book
suggests that mod2 be written as:

mod2 <- gam(y ~ s(x1) + s(x2) + s(x1, x2))

However, the individual terms (might?) use a different basis when
included on their own to the one used to represent them in the s(x1, x2)
term. So again, the above is better, but still not quite strictly
nested.

Simon Wood, in newer material on his website (newer than his book),
suggests the following models are strictly nested:

mod1 <- gam(y ~ te(x1) + te(x2))
mod2 <- gam(y ~ te(x1) + te(x2) + te(x1, x2))

Note that the default basis here is a cubic regression spline basis,
whereas the default for s() terms is a thin-plate spline basis; you can
change this though, see ?te.
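
A minimal sketch of how such a comparison might be run, assuming
simulated data (x1, x2, y and dat are invented for illustration) and ML
smoothness selection so that the two fits are compared on the same
footing. Recent versions of mgcv also document ti() terms in ?te for
building main effect plus interaction models, which may be worth a look:

```r
library(mgcv)

set.seed(2)
n  <- 300
x1 <- runif(n)
x2 <- runif(n)
# Simulated response with additive effects plus an x1:x2 interaction
y  <- sin(2 * pi * x1) + x2^2 + 2 * x1 * x2 + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Additive model and the model with an extra 2-d tensor product term
mod1 <- gam(y ~ te(x1) + te(x2), data = dat, method = "ML")
mod2 <- gam(y ~ te(x1) + te(x2) + te(x1, x2), data = dat, method = "ML")

# Compare the two fits; both results are approximate for GAMs
aic_tab <- AIC(mod1, mod2)
aic_tab
anova(mod1, mod2, test = "F")
```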

> Also, some of my smoothers have in the models p-values between 0.01  
> and 0.05. Since p-values in this range should not be trusted it is  
> always advised to do bootstrapping to get better p-values. Does  
> anybody know how to do that?

See Simon Wood's book or papers cited in ?gam for details on coverage
properties of confidence intervals and p values. Yes, p values are only
approximate in the GAM case, but they are also approximate in the GLM
case, just less so. So we must take care when close to 0.05.

Bootstrapping is difficult for GAMs. See Slide 5 of:

http://people.bath.ac.uk/sw283/mgcv/tampere/mgcv-advanced.pdf

The idea of parametric bootstrapping for various lambda_i is covered on
page 261 of Simon Wood's GAM book, but as Slide 5 says, this generally
doesn't do too much, and is quite some effort.

Simon has suggested to me in the past that if you worry about p-values,
then REML or ML fitting gives better coverage properties, so set method
= "ML" or method = "REML" in your call to gam().
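
For example (a sketch only; the data below are simulated stand-ins, with
Temp and Sal borrowed as column names and y as a made-up response, not
your daten.sub):

```r
library(mgcv)

set.seed(3)
n   <- 200
dat <- data.frame(Temp = runif(n, 5, 20), Sal = runif(n, 5, 35))
dat$y <- sin(dat$Temp / 3) + rnorm(n, sd = 0.5)

# Same model formula, fitted with REML smoothness selection
m_reml <- gam(y ~ s(Temp) + s(Sal), data = dat, method = "REML")

# Approximate p-values for the smooth terms
smooth_tab <- summary(m_reml)$s.table
smooth_tab
```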

HTH

G

> My model looks like this:
> 
> GAM1 <- gam(cop  ~ s(Temp) + s(Sal) + s(Chl a), data = daten.sub)
> 
> 
> Thanks for your help and a happy new year!!
> 
> Saskia Otto

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
