[R] check.k function in mgcv packages

ywh123 523541975 at qq.com
Thu Jun 21 03:07:22 CEST 2012


Hi everyone,
I am studying generalized additive models using the 'mgcv' package
developed by Professor Wood.
However, I cannot understand the example given in the help page for choose.k.
The example is:


library(mgcv)
set.seed(1) 
dat <- gamSim(1,n=400,scale=2)

## fit a GAM with quite low `k'
b<-gam(y~s(x0,k=6)+s(x1,k=6)+s(x2,k=6)+s(x3,k=6),data=dat)
plot(b,pages=1,residuals=TRUE) ## hint of a problem in s(x2)

## the following suggests a problem with s(x2)
gam.check(b)

## Another approach (see below for more obvious method)....
## check for residual pattern, removable by increasing `k'
## typically `k', below, should be substantially larger than 
## the original `k', but certainly less than n/2.
## Note use of cheap "cs" shrinkage smoothers, and gamma=1.4
## to reduce chance of overfitting...
rsd <- residuals(b)
gam(rsd~s(x0,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
gam(rsd~s(x1,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
gam(rsd~s(x2,k=40,bs="cs"),gamma=1.4,data=dat) ## `k' too low
gam(rsd~s(x3,k=40,bs="cs"),gamma=1.4,data=dat) ## fine

Why is the model not good for x2? This is the output I get:

> gam(rsd~s(x2,k=40,bs="cs"),gamma=1.4,data=dat) ## `k' too low

Family: gaussian 
Link function: identity 

Formula:
rsd ~ s(x2, k = 40, bs = "cs")

Estimated degrees of freedom:
9.0093  total = 10.00926 

GCV score: 4.494652

From these results, the EDF (about 9) is much less than k-1 = 39. The help
page says
"If the effective degrees of freedom for a model term are estimated to be
much less than k-1 then this is unlikely to be very worthwhile",
so I would have concluded that the results are reasonable.

Why, then, is `k' marked as too low for s(x2)?
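
To make the comparison concrete, the EDF values can be pulled out of the
fitted objects rather than read off the printed output. The following is only
a sketch of how that might be done: the names edf_orig and edf_rsd and the
loop over x0..x3 are illustrative and not part of the help page example, and
the residuals are stored in dat so that the formula can find them.

```r
library(mgcv)
set.seed(1)
dat <- gamSim(1, n = 400, scale = 2)

## original fit with low k; each smooth can use at most k - 1 = 5 EDF
b <- gam(y ~ s(x0, k = 6) + s(x1, k = 6) + s(x2, k = 6) + s(x3, k = 6),
         data = dat)
edf_orig <- summary(b)$edf          # one EDF value per smooth term
names(edf_orig) <- c("x0", "x1", "x2", "x3")

## residual fits with large k; keep the residuals inside the data frame
dat$rsd <- residuals(b)
edf_rsd <- sapply(c("x0", "x1", "x2", "x3"), function(v) {
  f <- as.formula(paste0("rsd ~ s(", v, ", k = 40, bs = \"cs\")"))
  fit <- gam(f, gamma = 1.4, data = dat)
  sum(fit$edf) - 1                  # total EDF minus the intercept
})

## side-by-side view: EDF used by the original smooth (ceiling 5)
## versus EDF picked up by a smooth of the residuals (ceiling 39)
print(round(cbind(original = edf_orig, residual = edf_rsd), 2))
```

With this table, the "original" column shows how close each smooth in b is to
its ceiling of k-1 = 5, and the "residual" column shows how much structure a
smooth of the residuals still finds for each covariate.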

Thanks in advance
wanhai
