[R] GAM interactions, by example

Simon Wood s.wood at bath.ac.uk
Wed May 30 15:25:45 CEST 2012


Geraldine,

They really are the same fit, try...

range(fitted(b)-fitted(b1))
  [1] -3.333782e-10  4.173699e-10

... for example.

The edf differences are just down to differences in how identifiability 
constraints are handled in the two cases. For b1 the smooths of x2 do 
not have centring constraints applied (see ?gam.models) so each such 
smooth has one more degree of freedom than in b, where centring 
constraints are applied. If you look at the total edf for the two 
versions they are identical.
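
For anyone wanting to check this directly, here is a minimal sketch. It assumes the gamSim(4) data from the ?gam.models example; the seed and the names fit_diff and total_edf are illustrative choices, not part of the help page. It compares the fitted values, the total edf, and the per-smooth edf of the two fits.

```r
library(mgcv)

set.seed(1)                          # arbitrary seed, for reproducibility only
dat <- gamSim(4, verbose = FALSE)    # factor 'by' variable example data

## factor 'by' version: centred smooths plus parametric factor effects
b <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat)

## numeric 'by' version: uncentred smooths of x2, no intercept
b1 <- gam(y ~ s(x2, by = as.numeric(fac == 1)) +
              s(x2, by = as.numeric(fac == 2)) +
              s(x2, by = as.numeric(fac == 3)) + s(x0) - 1, data = dat)

## largest discrepancies between the two sets of fitted values
fit_diff <- range(fitted(b) - fitted(b1))
print(fit_diff)

## total edf: sum over all coefficients, parametric terms included
total_edf <- c(b = sum(b$edf), b1 = sum(b1$edf))
print(total_edf)

## per-smooth edf as reported by summary()
print(summary(b)$edf)
print(summary(b1)$edf)
```

In b the level means are carried by the intercept and the fac coefficients, so they count towards the total but not towards any s(x2) row of the summary table; in b1 they are absorbed into the uncentred smooths.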

The explained deviance difference is down to what is counted as the 
"null deviance" when you have a `-1' in the model formula (it's the same 
behaviour as with r^2 in lm, for example)... arguably this behaviour is 
not so sensible, but it is consistent with other modelling functions...
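
The lm analogue can be sketched as follows. The data are simulated purely for illustration (the group means 2, 5 and 9 and the seed are arbitrary). The two models have the same fitted values, but summary() measures r^2 against the sum of squares about the mean when there is an intercept, and against the raw sum of squares of y when there is not.

```r
set.seed(1)                               # arbitrary seed
f <- factor(rep(1:3, each = 20))          # three-level factor
y <- c(2, 5, 9)[as.integer(f)] + rnorm(60)

m  <- lm(y ~ f)        # intercept + contrasts
m0 <- lm(y ~ f - 1)    # one mean per level, no intercept

## same fit...
print(range(fitted(m) - fitted(m0)))

## ...but different r^2, because the "null" model differs
r2_int   <- summary(m)$r.squared
r2_noint <- summary(m0)$r.squared
print(c(with_intercept = r2_int, without_intercept = r2_noint))

## by hand: same residual sum of squares, different baseline
rss <- sum(resid(m)^2)
r2_centred   <- 1 - rss / sum((y - mean(y))^2)
r2_uncentred <- 1 - rss / sum(y^2)
print(c(centred = r2_centred, uncentred = r2_uncentred))
```

For the gam fits, the corresponding quantities should be visible in the null.deviance and deviance components of b and b1.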

best,
Simon



On 29/05/12 10:29, Mabille, Geraldine wrote:
> Dear all,
> I'm using the mgcv library by Simon Wood to fit gam models with interactions, and I have been reading (and running) the "factor 'by' variable example" given on the gam.models help page (see below for the output from the first two models, b and b1).
> The example explains that the b and b1 fits are the same: "note that the preceding fit (here b) is the same as (b1)...."
> I agree that they "look" the same, but when I compare the results from the two models (summary(b) and summary(b1)) they look quite different (the edf and the deviance explained, for example).
> Are those two models (b and b1) really testing the same things? If yes, why are the results so different between them?
> Thanks a lot if anyone can help with that...
> Geraldine
>
>
> dat <- gamSim(4)
>
> ## fit model...
> b <- gam(y ~ fac+s(x2,by=fac)+s(x0),data=dat)
> plot(b,pages=1)
> summary(b)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> y ~ fac + s(x2, by = fac) + s(x0)
>
> Parametric coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1.1784     0.1985   5.937 6.59e-09 ***
> fac2         -1.2148     0.2807  -4.329 1.92e-05 ***
> fac3          2.2012     0.2436   9.034  < 2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Approximate significance of smooth terms:
>               edf Ref.df      F  p-value
> s(x2):fac1 5.364  6.472  2.285   0.0312 *
> s(x2):fac2 4.523  5.547 11.396 4.59e-11 ***
> s(x2):fac3 8.024  8.741 43.456  < 2e-16 ***
> s(x0)      1.000  1.000  0.237   0.6269
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> R-sq.(adj) =  0.634   Deviance explained = 65.3%
> GCV score = 4.0288  Scale est. = 3.8082    n = 400
>
> ## note that the preceding fit is the same as....
> b1 <- gam(y ~ s(x2,by=as.numeric(fac==1))+s(x2,by=as.numeric(fac==2))+
>              s(x2,by=as.numeric(fac==3))+s(x0)-1,data=dat)
> ## ... the `-1' is because the intercept is confounded with the
> ## *uncentred* smooths here.
> plot(b1,pages=1)
> summary(b1)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> y ~ s(x2, by = as.numeric(fac == 1)) + s(x2, by = as.numeric(fac ==
>      2)) + s(x2, by = as.numeric(fac == 3)) + s(x0) - 1
>
> Approximate significance of smooth terms:
>                               edf Ref.df       F  p-value
> s(x2):as.numeric(fac == 1) 6.341  7.447   6.214 3.38e-07 ***
> s(x2):as.numeric(fac == 2) 3.393  3.961  14.727 4.07e-11 ***
> s(x2):as.numeric(fac == 3) 9.015  9.737 104.760  < 2e-16 ***
> s(x0)                      1.000  1.000   0.266    0.606
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> R-sq.(adj) =  0.631   Deviance explained =   75%
> GCV score = 4.0345  Scale est. = 3.8353    n = 400
>


-- 
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283


