[R] AIC for comparing GLM(M) with (GAM(M)

Wed Jan 27 11:35:14 CET 2010

This is tricky.

gamm4 is basing AIC on the (approximate) marginal likelihood with the smooth 
terms treated as random effects and integrated out. The appropriate degrees 
of freedom are then the number of smoothing parameters + the number of fixed 
effects. This is completely conventional AIC.

gam is basing AIC on a likelihood in which  the smooths are treated as if they 
were fixed effects but with degrees of freedom reduced to allow for the fact 
that smoothing penalties have been applied. This is an approximate AIC as 
suggested in Hastie and Tibshirani's  1990 GAM book p158 (there's a 
derivation in Wood, 2008 --- see ref list in ?gam). 

At the computational level it's hard to argue that one is right and the other 
wrong. The gamm4 version is just the AIC for a mixed model. For the gam 
version, one can almost always find an unpenalized model with approximately 
the degrees of freedom of the penalized fit and very nearly the same fitted 
values, resulting therefore, in a very similar AIC. (see example code at end)

So the models differ only in how we choose to view the smooths. For `gam' they 
are viewed as penalized fixed effects. For `gamm4' they are viewed as random 
effects *but we don't really believe this model*. Nobody believes that if we 
gathered a second replicate of the data that the best fit functions would be 
some random draw from the assumed marginal distribution of the smooths. We 
believe that the best fit functions would look much the same as they did for 
the original replicate. So the justification for using the mixed modelling 
approach is really Bayesian: the random effect distributions for the smooths 
are *really* priors, it's just that we can do all the computations using 
mixed model technology. But if we are really being Bayesian then AIC becomes 
a bit problematic....

.... The upshot of this is that the AIC produced by gamm4 ought to be ok for 
comparing different models fitted by `gamm4'. But you would probably want to 
be careful about using it for any other purpose *unless you believe that your 
smooths are really random effects* i.e. unless  you'd expect to get a 
different smooth each time your data were replicated (albeit with the same 
degree of smoothness each time).  

So with the functions as they are at present this makes AIC comparison of gam 
and gamm4 models a bit awkward. You can either calculate an AIC for `gam' 
that is equivalent to the `gamm4' version, by using the reported marginal 
likelihood (ML score) from the  `gam(...., method="ML")'  fit and the degrees 
of freedom given by the number of smoothing parameters + the number of 
unpenalized coefficients, or you can calculate the `gam' style AIC for the 
`gamm4' fit by using the fitted values from that fit, together with e.g. 
binomial()$aic and the effective degrees of freedom reported for the fit. 

best,
Simon

## penalized versus unpenalized AIC....
set.seed(0)
dat <- gamSim(1,n=400,dist="normal",scale=2)
b<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
b

Family: gaussian
Link function: identity

Formula:
y ~ s(x0) + s(x1) + s(x2) + s(x3)

Estimated degrees of freedom:
5.1727 2.3571 8.5174 1.0000  total = 18.04718

GCV score: 4.610979

b2<-gam(y~s(x0,k=6,fx=T)+s(x1,k=3,fx=T)+s(x2,k=10,fx=T)+x3,data=dat)
AIC(b)
[1] 1747.687
AIC(b2)
[1] 1748.820
cor(fitted(b),fitted(b2))
[1] 0.9988816

On Tuesday 26 January 2010 11:16, Andrea Meyer wrote:
> Hello
>
> I'm analyzing a dichotomous dependent variable (dv) with more than 100
> measurements (within-subjects variable: hours24) per subject and more
> than 100 subjects. The high number of measurements allows me to model
> more complex temporal trends.
> I would like to compare different models using GLM, GLMM, GAM and
> GAMM, basically do demonstrate the added value of GAMs/GAMMs relative
> to GLMs/GLMMs, by fitting splines. GLMMs/GAMMs are used to possibly
> improve fits from GLMs/GAMs by accounting for serial dependence.
> My idea is to use AIC to compare the different models. I’ve noticed
> that when setting up two seemingly identical models using the two
> functions gam (of the package mgcv) and gamm4 (of the package with
>
> same name), the AIC turns out to be different:
>  > gam.0<-gam(dv ~
>
> s(hours24,fx=F,k=-1,bs=“cc“),method="ML",data=sdata, family=binomial)
>
>  > gamm.0<-gamm4(dv ~
>
> s(hours24,fx=F,k=-1,bs=“cc“),method="ML",data=sdata, family=binomial)
>
> Fit indices using the commands as shown are:
>  > logLik(gam.0)[1];deviance(gam.0);AIC(gam.0)
>  > logLik(gamm.0$mer);deviance(gamm.0$mer);attributes(summary(gamm.
>
> 0$mer))$AICtab[1]
>
> gam.0: logLik=1149.6, deviance=2299.3, AIC=2316.0
> gamm.0: logLik=1169.0, deviance=2338.0, AIC=2342.0
>
> The differences between the two AIC values seem to be based on two
> factors. First, gam uses the effective degrees of freedom
>
>  > sum(gam.0$edf)
>
> [1] 8.372517
> whereas gamm4 uses the value 2. Second the two log-likelihood values
> already differ, probably because different estimation methods are used
> but here is were my understanding ends. In any case from gamm4 I can
> get the same value for the deviance as for gam by referring to the
>
> deviance slot:
>  > gamm.0$mer at deviance["disc"], which returns the value 2299.3, which
>
> is the deviance without compensation for the null deviance.
>
> My questions are:
> - Is my suggested method of comparing fits among GLM, GLMM, GAM and
> GAMM using AIC legitimate? Of course I will do additional model
> plotting using residuals etc. as well but it seems important to me to
> have a more direct method of comparing these models (I’m aware of the
> fact that AIC is a rough estimate when it comes to generalized mixed
> models).
> - If so, how can I compute the AICs using gam and gamm4 such that they
> can be compared A) with each other and B) with AICs obtained from GLM/
> GLMM?
>
> Any suggestions are welcome
>
> Andrea
>
>
>
>
>
>
>
>
>
> 	[[alternative HTML version deleted]]

-- 
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283