[R] question regarding GAM from a novice (in GAM as well as in R)

Mon Jul 14 12:14:26 CEST 2003

> 
> Can someone explain what some of the terms in this model do?
> 
> c<-gam(depvar~var1+var2+s(var3)+s(var4, by=var5)+s(var6, var7)+s(var8,3),
> data=xdataset ) 
> 
> I do not use the terms including var4- var8 in my model, just want to know
> what they do. 
> 
> +s(var4, by=var5)
- var5 is a variable multiplying this smooth of var4, i.e. the model is
something like:

depvar_i = ... + f(var4_i)*var5_i + ... + e_i

where f is a smooth function and e_i is the error term. Models like this
are sometimes called variable (or varying) coefficient models (see Hastie
and Tibshirani, 1993, JRSSB 55, 757-796).
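
A minimal sketch of such a by-variable smooth on simulated data may make
this concrete. Everything here apart from gam(), s() and predict() is an
assumption made for illustration: the data are simulated, and the "true"
coefficient function f is hypothetical.

```r
library(mgcv)  # recommended package, shipped with standard R installations

set.seed(1)
n    <- 400
var4 <- runif(n)
var5 <- rnorm(n)                       # numeric 'by' variable
f    <- function(x) sin(2 * pi * x)    # hypothetical true coefficient function
depvar <- f(var4) * var5 + rnorm(n, sd = 0.2)

# The smooth of var4 is multiplied by var5 in the linear predictor
fit <- gam(depvar ~ s(var4, by = var5))

# With type = "terms" the by-smooth column is f_hat(var4) * var5, so
# predicting at var5 = 1 returns the estimated coefficient function itself.
grid  <- data.frame(var4 = seq(0.05, 0.95, length.out = 50), var5 = 1)
f_hat <- predict(fit, newdata = grid, type = "terms")[, 1]

# Largest gap between the estimate and the simulated truth on the grid
max_err <- max(abs(f_hat - f(grid$var4)))
```

Plotting f_hat against grid$var4 should show roughly one period of a sine
curve, i.e. the coefficient of var5 changing smoothly with var4.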

> +s(var6, var7)
- A smooth function of two variables: var6 and var7 (you can, in
principle, have smooths of any number of variables).

> +s(var8,3)
- The old form of s(var8,k=3,bs="cr"): it uses a cubic regression spline
basis with 3 knots to represent the smooth function of var8. Note that the
default k for a 1-d smooth is 10, and the default basis is "tp", a thin
plate regression spline. The default basis is usually slightly better, and
admits smooths of several variables, but the "cr" basis is much quicker
computationally.
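
As a rough illustration of the difference in basis size, the sketch below
fits the same simulated data with the explicit "cr" form and with the
defaults. The data and object names are made up for the example; only the
s() arguments come from the discussion above.

```r
library(mgcv)

set.seed(2)
n      <- 300
var8   <- runif(n)
depvar <- sin(2 * pi * var8) + rnorm(n, sd = 0.3)  # simulated response

fit_cr <- gam(depvar ~ s(var8, k = 3, bs = "cr"))  # 3-knot cubic regression spline
fit_tp <- gam(depvar ~ s(var8))                    # defaults: bs = "tp", k = 10

# Each smooth loses one coefficient to its identifiability constraint,
# so the fits carry an intercept plus k - 1 smooth coefficients.
n_coef_cr <- length(coef(fit_cr))
n_coef_tp <- length(coef(fit_tp))

# Effective degrees of freedom actually used by each fit
edf_cr <- sum(fit_cr$edf)
edf_tp <- sum(fit_tp$edf)
```

With k = 3 the smooth can be at most a very gentle curve, so it is worth
checking that such a small basis is flexible enough for the data at hand.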

> Furthermore, the results become rather different when I change the model to:
> 
> c<-gam(depvar~var1+var2-1+s(var3)+s(var4, by=var5)+s(var6, var7)+s(var8,3),
> data=xdataset ) 

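- *iff* var1 and var2 are not factors, then this is a model with no
intercept term, so the fitted values lose the freely estimated constant
that otherwise sets their overall level. (If either is a factor, the "-1"
only reparameterises the factor levels and the fit is unchanged.) Hence
the big change!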
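
The effect can be checked on simulated data along the following lines.
This is only a sketch: the data, the factor fac and all object names are
invented for the example, and the simulated response has a true constant
of 5 so that dropping the intercept matters.

```r
library(mgcv)

set.seed(3)
n      <- 200
var1   <- runif(n)
var3   <- runif(n)
depvar <- 5 + 2 * var1 + sin(2 * pi * var3) + rnorm(n, sd = 0.3)

# Numeric covariate: "-1" really removes the constant from the model
with_int <- gam(depvar ~ var1 + s(var3))
no_int   <- gam(depvar ~ var1 - 1 + s(var3))
has_int  <- "(Intercept)" %in% names(coef(no_int))

# Factor covariate: "-1" just gives one coefficient per level instead
fac       <- factor(sample(c("a", "b"), n, replace = TRUE))
f_with    <- gam(depvar ~ fac + s(var3))
f_without <- gam(depvar ~ fac - 1 + s(var3))

# Largest difference in fitted values between each pair of fits
gap_numeric <- max(abs(fitted(with_int) - fitted(no_int)))
gap_factor  <- max(abs(fitted(f_with) - fitted(f_without)))
```

Comparing gap_numeric with gap_factor shows whether dropping the
intercept has changed the fit itself or merely its parameterisation.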

Simon
_____________________________________________________________________
> Simon Wood simon at stats.gla.ac.uk        www.stats.gla.ac.uk/~simon/
>>  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
>>>   Direct telephone: (0)141 330 4530          Fax: (0)141 330 4814