[R] Why there is no p-value from likelihood ratio test using anova in GAM model fitting?

Tue Apr 28 17:48:32 CEST 2009

The issue isn't really about which order you supply the models to `anova'. The 
problem is that there is no meaningful test to perform with these two models, 
because the `larger' model has actually been estimated as having a *larger* 
deviance than the `smaller' model, so there is never going to be any basis 
for preferring the larger model. 

The problem arises in part because these tests are only ever approximate: 
estimation is by penalized likelihood, but the estimates are treated as being 
approximately MLEs. In addition, the test is conditional on the smoothing 
parameters, which have themselves been estimated (although this is no worse 
than doing any other sort of model selection prior to testing). 

The cleanest solution is perhaps to compare models by (generalized) AIC or 
similar, but if testing is important in this problem then there are a couple 
of alternatives:

1. You can force your model 2 (the one with SES) to use the same smoothing 
parameters as model 1: extract the `sp' vector from the model 1 object and 
feed it into the `sp' argument of `gam' when fitting model 2. This restores 
proper nesting of null and alternative, and improves the test approximations. 

2. You could use the single argument version of `anova' to test whether the 
`SES' term is significantly different from zero. 
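
As a rough sketch of the options above (not the original poster's code or
data): the snippet below simulates a stand-in data set, because the real data
are not shown in this thread. The response `y', the Gaussian family, the
sample size and all simulated values are assumptions made for illustration;
only the model formulae follow the thread.

```r
library(mgcv)

set.seed(1)
n <- 400
# Simulated stand-in data: the covariate names mirror the thread, the values
# are invented, and SES has no true effect on the response.
dat <- data.frame(FLBS = runif(n), byear = runif(n), SES = rnorm(n))
dat$y <- sin(2 * pi * dat$FLBS) + dat$byear^2 + rnorm(n, sd = 0.3)

# Model 1 (without SES) and model 2 (with SES), each with its own
# automatically selected smoothing parameters.
m1 <- gam(y ~ s(FLBS) + s(byear) + s(FLBS, byear), data = dat)
m2 <- gam(y ~ s(FLBS) + SES + s(byear) + s(FLBS, byear), data = dat)

# Comparison by AIC rather than by a test.
AIC(m1, m2)

# Alternative 1: refit model 2 with the smoothing parameters of model 1,
# then compare the two fits. SES is a parametric term, so both models have
# the same number of smoothing parameters.
m2_fixed <- gam(y ~ s(FLBS) + SES + s(byear) + s(FLBS, byear),
                data = dat, sp = m1$sp)
anova(m1, m2_fixed, test = "F")

# Alternative 2: single-argument anova on model 2, which reports a test
# for the parametric SES term alongside the smooth terms.
anova(m2)
```

With your own data you would replace `dat' and `y' by the real data frame and
response, and add whatever `family' argument the original fits used.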

However, the message from the fits you have already done is that `SES' is 
doing nothing that can't better be done by the other covariates. 

best,
Simon

On Tuesday 28 April 2009 15:46, willow1980 wrote:
> Hi, Simon,
> I am using mgcv:gam and the version number is mgcv_1.5-2. I also exchanged
> the order of the two models in anova, but this did not help either.
>
> From the differences in DF (0.77246) and deviance (-0.02), these two models
> seem not to be significantly different. Is that right?
> Thank you anyway!
>
> Simon Wood-4 wrote:
> > The simpler model has the lower deviance (marginally), so there is
> > nothing to test here. This can happen with maximum penalized likelihood
> > estimators, even though the models are nested (especially if the
> > smoothing parameters are selected automatically). Are you using gam:gam
> > or mgcv:gam (and which version numbers)?
> >
> > best,
> > Simon
> >
> > On Tuesday 28 April 2009 12:38, willow1980 wrote:
> >> Hello, everybody,
> >> This is the first time I have posted a question, because I really cannot
> >> find an answer in books, on websites or from my colleagues. Thank you in
> >> advance for your help!
> >> I am running a likelihood ratio test to find out whether the simpler
> >> model differs significantly from the more complicated one. However, when
> >> I run the LRT to compare them, the test does not return an F value or a
> >> p-value. What is the reason? How can I get this information?
> >>
> >> ####################################################
> >> Analysis of Deviance Table
> >>
> >> Model 1: sum_surv15 ~ s(FLBS) + s(byear) + s(FLBS, byear)
> >> Model 2: sum_surv15 ~ s(FLBS) + SES + s(byear) + s(FLBS, byear)
> >>    Resid. Df Resid. Dev         Df Deviance F Pr(>F)
> >> 1 1202.21094     601.27
> >> 2 1201.43848     601.29    0.77246    -0.02
> >> ####################################################
> >> Thank you very much!
> >>
> >> Jianghua Liu, University of Sheffield
> >
> > --
> > Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> > +44 1225 386603  www.maths.bath.ac.uk/~sw283

-- 
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283