[R] Smooth terms significance in GAM models
Simon Wood
sw283 at maths.bath.ac.uk
Fri Sep 30 15:40:42 CEST 2005
> I'm using the gam() function from package mgcv with the default option
> (edf estimated by GCV).
>
> >G=gam(y ~ s(x0, k = 5) + s(x1) + s(x2, k = 3))
> >SG=summary(G)
> Formula:
> y ~ +s(x0, k = 5) + s(x1) + s(x2, k = 3)
>
> Parametric coefficients:
>               Estimate  std. err.  t ratio    Pr(>|t|)
> (Intercept)  3.462e+07  1.965e+05    176.2  < 2.22e-16
>
> Approximate significance of smooth terms:
>          edf  chi.sq     p-value
> s(x0)  2.858  70.629  1.3129e-07
> s(x1)  8.922  390.39  2.6545e-13
> s(x2)  1.571   141.6  1.8150e-11
>
> R-sq.(adj) = 0.955 Deviance explained = 97%
> GCV score = 2.4081e+12 Scale est. = 1.5441e+12 n = 40
> --------------------------------------
>
> I know I can estimate the significance of smooth terms with chi.sq &
> p.value.
>
> With GCV, the p-values are obtained by comparing the statistic to an F
> distribution, aren't they?
> help(summary.gam) says "use at your own risk!". Does that mean I should
> only estimate the significance of smooth terms by chi.sq? Is there a way
> to link both pieces of information (p.value and chi.sq)?
No. Using F as the reference distribution is always the more conservative
choice, so using chi.sq will be even worse. The p values are *very approximate* since
they are based on pretending that a penalized fit is equivalent to an
unpenalized fit with the same effective degrees of freedom, and neglect
the uncertainty associated with smoothing parameter estimation... they
provide a reasonable `rough guide' to significance, but are by no means
exact.
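
As a rough illustration of treating these p values as a guide rather than
an exact test, here is a minimal sketch. The data are simulated and the
names (dat, G0, cmp) are made up for the example; the anova() comparison
is one possible cross-check, not something recommended in this reply, and
its p value is approximate for the same reasons.

```r
library(mgcv)

# Simulated data (an assumption for illustration; not the poster's data)
set.seed(1)
n  <- 200
x0 <- runif(n); x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x0) + exp(2 * x1) + 0.5 * x2 + rnorm(n)
dat <- data.frame(y, x0, x1, x2)

# Same model structure as in the question
G  <- gam(y ~ s(x0, k = 5) + s(x1) + s(x2, k = 3), data = dat)
SG <- summary(G)
SG$s.table   # edf, test statistic and approximate p value per smooth

# Cross-check for s(x0): refit without it and compare the two fits.
# This comparison is also only approximate for penalized models.
G0  <- gam(y ~ s(x1) + s(x2, k = 3), data = dat)
cmp <- anova(G0, G, test = "F")
cmp
```

If the summary p value and the model comparison disagree badly, that is a
hint that the term is borderline and neither number should be taken
literally.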
> Last question: using GAM with the defaults, should I look at R-sq rather
> than Deviance explained, or both?
In this case deviance explained is just the unadjusted r^2... I'd look at
the r^2, which is adjusted (to take into account the degrees of freedom
`used up' when estimating the model).
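
To make the link between the two numbers concrete, the sketch below redoes
the adjustment by hand from the values printed in the quoted summary. It
assumes the usual adjusted r^2 formula with the total effective degrees of
freedom (intercept plus the smooth edf) in place of a parameter count; the
exact calculation inside mgcv may differ slightly.

```r
# Values copied from the summary output quoted above
n        <- 40
dev.expl <- 0.97                     # printed as "97%", so already rounded
edf      <- c(2.858, 8.922, 1.571)   # s(x0), s(x1), s(x2)

total.df <- 1 + sum(edf)             # intercept plus smooth edf
resid.df <- n - total.df             # residual degrees of freedom

# Adjusted r^2: scale the unexplained fraction by (n - 1) / resid.df
r.sq.adj <- 1 - (1 - dev.expl) * (n - 1) / resid.df
round(r.sq.adj, 3)
```

This gives about 0.954, close to the reported R-sq.(adj) of 0.955; the
small gap is consistent with the deviance explained having been rounded
to 97% in the printout.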
best,
Simon