[R] check.k function in mgcv packages

Simon Wood s.wood at bath.ac.uk
Thu Jun 21 18:17:08 CEST 2012


The point is that you are checking the basis dimension used in the first 
model, b, where the basis dimension for s(x2) was set to k = 6. All the 
other model fits are there to check that first one. On checking the 
residuals from model b you detect a pattern with respect to x2, with an 
estimated degrees of freedom of about 9, which is bigger than the maximum 
possible for s(x2) in model b (k - 1 = 5). So model b is probably using 
too small a basis dimension for s(x2).
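
As a rough sketch of one way to follow this up (not part of the help page
example: the choice of k = 20 for the refit and the object names rb and b2
are illustrative assumptions), you could refit with a larger basis for
s(x2) only and compare the per-term EDFs of the two fits:

```r
library(mgcv)
set.seed(1)
dat <- gamSim(1, n = 400, scale = 2)

## original fit: each smooth can use at most k - 1 = 5 degrees of freedom
b <- gam(y ~ s(x0, k = 6) + s(x1, k = 6) + s(x2, k = 6) + s(x3, k = 6),
         data = dat)

## residual check for x2, as in the quoted example
rsd <- residuals(b)
rb <- gam(rsd ~ s(x2, k = 40, bs = "cs"), gamma = 1.4, data = dat)
sum(rb$edf) - 1    ## EDF of the residual smooth (intercept excluded)

## refit with a larger basis for s(x2) only (k = 20 is an arbitrary choice)
b2 <- gam(y ~ s(x0, k = 6) + s(x1, k = 6) + s(x2, k = 20) + s(x3, k = 6),
          data = dat)

summary(b)$edf     ## per-term EDFs with k = 6 throughout
summary(b2)$edf    ## per-term EDFs after enlarging the basis for s(x2)
AIC(b, b2)         ## rough comparison of the two fits
```

If the EDF for s(x2) in b2 comes out well above 5 and the AIC favours b2,
that would support the conclusion that k = 6 was too restrictive for that
term.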

best,
Simon

On 06/21/2012 02:07 AM, ywh123 wrote:
> Hi everyone,
> I am studying generalized additive models and am using the package 'mgcv'
> developed by Professor Wood.
> However, I cannot understand the example given on the choose.k help page.
> For example,
>
>
> library(mgcv)
> set.seed(1)
> dat<- gamSim(1,n=400,scale=2)
>
> ## fit a GAM with quite low `k'
> b<-gam(y~s(x0,k=6)+s(x1,k=6)+s(x2,k=6)+s(x3,k=6),data=dat)
> plot(b,pages=1,residuals=TRUE) ## hint of a problem in s(x2)
>
> ## the following suggests a problem with s(x2)
> gam.check(b)
>
> ## Another approach (see below for more obvious method)....
> ## check for residual pattern, removable by increasing `k'
> ## typically `k', below, should be substantially larger than
> ## the original `k', but certainly less than n/2.
> ## Note use of cheap "cs" shrinkage smoothers, and gamma=1.4
> ## to reduce chance of overfitting...
> rsd<- residuals(b)
> gam(rsd~s(x0,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
> gam(rsd~s(x1,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
> gam(rsd~s(x2,k=40,bs="cs"),gamma=1.4,data=dat) ## `k' too low
> gam(rsd~s(x3,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
>
> Why is the model not good for x2?
>
>> gam(rsd~s(x2,k=40,bs="cs"),gamma=1.4,data=dat) ## `k' too low
> Family: gaussian
> Link function: identity
>
> Formula:
> rsd ~ s(x2, k = 40, bs = "cs")
>
> Estimated degrees of freedom:
> 9.0093  total = 10.00926
>
> GCV score: 4.494652
>
> From the results, we can see that the EDF is much less than k-1, so
> according to
> "If the effective degrees of freedom for a model term are estimated to be
> much less than k-1 then this is unlikely to be very worthwhile", I think
> the results are reasonable.
>
> Why?
>
> Thanks in advance
> wanhai


