[R] Comparing GAM objects using ANOVA

Simon Wood snw at mcs.st-and.ac.uk
Fri Nov 15 12:22:44 CET 2002


> 
> Is it possible to compare two GAM objects created with the gam() function
> from the mgcv package? I use a slightly modified version of
> anova.glm() named anova.gam(), modified from John Fox (2002). It often
> gives me some aberrant results, especially with the "F" test. I use a
> quasibinomial model, and the scale (dispersion) is estimated and used in the
> calculation of the F value. Has anyone already tried this, or does
> anyone know whether all this is theoretically possible?
I'm slightly uncomfortable about trying to do this if you are selecting
the degree of smoothing by generalized cross validation (GCV) or unbiased
risk estimation (UBRE), which are the mgcv defaults. Both of these are
basically mean square error criteria, and it seems to me more logically
consistent to compare models by comparing their GCV or UBRE scores as
appropriate. If you want to select models by hypothesis testing then it's
possibly better to set up your GAMs using pure regression splines, rather
than the penalized regression splines that are the mgcv default. mgcv
allows you to do this. For example:

mod.1<-gam(y~s(v,u,k=21,fx=TRUE)+s(x,k=6,fx=TRUE))

fits a model involving a 20 df smooth of u and v and a 5 df smooth of x
(k is the basis dimension of the smooth, but you lose a df through the GAM
identifiability constraints). The fitted model here is an unpenalized
GLM, so standard distributional results for GLMs hold. The basis used
maintains nestedness of models, thereby allowing use of analysis of
deviance/variance. For example:

mod.0<-gam(y~s(v,u,k=11,fx=TRUE)+s(x,k=4,fx=TRUE))

is strictly nested within mod.1. The nesting is achieved by using a
carefully chosen "optimal" basis for each smooth, based on optimal low
rank approximation of thin plate splines. Details will be out early next
year in JRSSB, but I can send you a pre-print if you are interested.
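
To make the comparison concrete, here is a minimal sketch of fitting mod.0 and mod.1 as above and comparing them by analysis of variance. The simulated data, the data frame name `dat`, and the use of a Gaussian response are assumptions for illustration only, and the multi-model anova() call assumes a current mgcv release.

```r
library(mgcv)

## Simulated data: purely illustrative, not from the original question.
set.seed(1)
n <- 400
dat <- data.frame(u = runif(n), v = runif(n), x = runif(n))
dat$y <- sin(2 * pi * dat$x) +
  exp(-((dat$u - 0.5)^2 + (dat$v - 0.5)^2) / 0.1) +
  rnorm(n, sd = 0.3)

## fx = TRUE switches off the penalty, so each smooth is a pure
## regression spline with k - 1 degrees of freedom.
mod.1 <- gam(y ~ s(v, u, k = 21, fx = TRUE) + s(x, k = 6, fx = TRUE), data = dat)
mod.0 <- gam(y ~ s(v, u, k = 11, fx = TRUE) + s(x, k = 4, fx = TRUE), data = dat)

## Both fits are unpenalized, so the usual F test between nested
## models is available.
cmp <- anova(mod.0, mod.1, test = "F")
print(cmp)
```

For a quasibinomial response like the one in the question, the same calls should apply with family = quasibinomial added to each gam() call, with test = "F" again using the estimated dispersion.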

If you really must mix hypothesis testing with MSE model selection then
I'd be inclined to use the very approximate p-values for terms reported by
summary.gam() - but please read the warnings in the help file first!
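
For the penalized case, both routes mentioned (comparing GCV/UBRE scores, and reading the approximate p-values) might look roughly like this. The data and the model names `m.full` and `m.red` are hypothetical, and the sketch assumes a current mgcv release in which the fitted object stores its score in `gcv.ubre` and `method = "GCV.Cp"` requests GCV/UBRE smoothness selection.

```r
library(mgcv)

## Simulated data: z is pure noise, x carries the signal (illustrative only).
set.seed(2)
n <- 300
dat <- data.frame(x = runif(n), z = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

## Penalized regression splines, smoothness chosen by GCV.
m.full <- gam(y ~ s(x) + s(z), data = dat, method = "GCV.Cp")
m.red  <- gam(y ~ s(x), data = dat, method = "GCV.Cp")

## Route 1: compare the minimized GCV scores (lower is better).
scores <- c(full = unname(m.full$gcv.ubre), reduced = unname(m.red$gcv.ubre))
print(scores)

## Route 2: the very approximate per-term p-values from summary.gam().
s.tab <- summary(m.full)$s.table
print(s.tab)
```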

Simon
  ______________________________________________________________________
> Simon Wood  snw at st-and.ac.uk  http://www.ruwpa.st-and.ac.uk/simon.html
> CREEM, The Observatory, Buchanan Gardens, St Andrews, Fife KY16 9LZ UK
> Direct telephone: (0)1334 461844          Indirect fax: (0)1334 463748 

