[R-sig-eco] gam variable selection

Gavin Simpson gavin.simpson at ucl.ac.uk
Tue Sep 27 11:40:27 CEST 2011


On Tue, 2011-09-27 at 08:54 +0200, Marco Helbich wrote:
> Dear list,
> 
> I am studying the influence of several environmental factors (numeric &
> dummies) on species densities (= numeric) using the gam() function with
> a Gaussian family in the mgcv package. As stated in Wood (2006), there
> is no variable selection algorithm.
> 
> Is it an appropriate (iterative) approach to drop the least significant
> predictor (e.g. p > 0.05), refit the model, compare the GCV/AIC
> score, and so forth? Should I first focus on the smooth terms or the
> fixed effects? Or is such a distinction not important at all?
> 
> Perhaps someone has more experience with GAMs and can give me a helping
> hand? Thanks in advance!

You could do that, but I would be sceptical of the results.

Marra and Wood (2011, Computational Statistics & Data Analysis 55:
2372-2387) compare various approaches to variable selection in GAMs.
IIRC, they concluded that adding an extra penalty term to each smooth
during smoothness selection gave the best results. The extra penalty
lets a term be shrunk out of the model entirely. This can be activated
in mgcv::gam() with the `select = TRUE` argument.
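
As a minimal sketch of what that looks like in practice, the code below
fits a Gaussian GAM with `select = TRUE` to simulated data. The data,
the variable names (x1, x2, x3, d) and the choice of method = "REML" are
assumptions made purely for illustration; they are not from the original
question. Terms whose effective degrees of freedom (edf) are shrunk
close to zero are candidates for having been penalised out of the model.

```r
library(mgcv)

set.seed(1)
n  <- 400
x1 <- runif(n)           # has a smooth effect on y
x2 <- runif(n)           # has a smooth effect on y
x3 <- runif(n)           # pure noise: no effect on y (by construction)
d  <- rbinom(n, 1, 0.5)  # dummy variable entering as a parametric term
y  <- sin(2 * pi * x1) + 2 * x2^2 + 0.5 * d + rnorm(n, sd = 0.3)
dat <- data.frame(y, x1, x2, x3, d)

# select = TRUE adds an extra penalty to each smooth so that the whole
# term, not just its wiggly part, can be shrunk towards zero
m <- gam(y ~ s(x1) + s(x2) + s(x3) + d,
         data = dat, family = gaussian(),
         method = "REML", select = TRUE)

# Inspect the effective degrees of freedom of each smooth term
edf_by_term <- summary(m)$s.table[, "edf"]
print(edf_by_term)
```

Note that the extra penalty only applies to the smooth terms; parametric
terms such as the dummy variable here are not shrunk by `select = TRUE`.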

HTH

G

> Best
> Marco

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


