[R] interactions and GAM

Simon Wood s.wood at bath.ac.uk
Thu Mar 1 13:51:16 CET 2007


Don't know if this helps, but... 
gam in package mgcv will let you set up smooths that interact with factors 
using the `by' variable mechanism. See ?gam.models, particularly the last 
example. Prediction is supported.

Simon

-- 
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283 

On Tuesday 27 February 2007 20:10, Beaulaton Laurent wrote:
> Dear R-users,
>
> I have 1 remark and 1 question on the inclusion of interactions in the gam
> function from the gam package.
>
> I need to fit quantitative predictors in interactions with factors. You can
> see an example of what I need in fig 9.13 p265  from Hastie and Tibshirani
> book (1990). It's clearly stated that in ?gam  "Interactions with
> nonparametric smooth terms are not fully supported". I have found a trick
> in a former http://www.math.yorku.ca/Who/Faculty/Monette/S-news/2284.html,
> using NAs and na.gam.replace argument, but some points are still unclear
> for me.
>
> First the prediction of new data (using predict function) is not so easy
> (see script below), and need a close reading from section 7.3.2 of the
> Chambers and Hastie (1992).
>
> Second I need to have the same intercept for all levels of factor and this
> not achievable with this trick. My question is : why not replacing NA by 0
> (or any other particular value) ?
>
> Here is a quite long (sorry for that) script with a generated dataset to
> better undestand my question. in this script the model to fit is (in a
> GLM-like writing) : y~s(x2):x1 the generated dataset follows this model and
> y(x2=0)=10 whatever x1.
>
> ########################
> #start of script
> ########################
>
> #data construction  (with deliberately very small noise)
> data1=data.frame(x1=rep(NA,27),x2=NA,y=NA)
>
> data1$x1=factor(c(rep(1,11),rep(2,11),rep(3,5)))
> data1$x2=c(rep(0:10,2),0:4)
>
> data1[data1$x1==1,"y"]=data1[data1$x1==1,"x2"]^4*5+rnorm(11)+10000
> data1[data1$x1==2,"y"]=data1[data1$x1==2,"x2"]^4*(-3)+rnorm(11)+10000
> data1[data1$x1==3,"y"]=10000*data1[data1$x1==3,"x2"]+rnorm(5)+10000
>
> library(lattice)
> xyplot(data1$y~data1$x2,groups=data1$x1)
>
> #creation of dummy variables for interactions
> data1$x2_1=ifelse(data1$x1=="1",data1$x2,NA)
> data1$x2_2=ifelse(data1$x1=="2",data1$x2,NA)
> data1$x2_3=ifelse(data1$x1=="3",data1$x2,NA)
>
> #model fitting
> library(gam)
> model=gam(y~s(x2_1)+s(x2_2)+s(x2_3)+x1,data=data1,na=na.gam.replace)
>
> #prediction fit well data :
> summary(model)
> plot(data1$x2,data1$y)
> points(data1$x2,model$fitted.value,col="red",pch="+")
>
> #trying to see prediction:
> predict(model) #does work
> predict(model,newdata=data1) #produce NA
>
> #trying to replace NA in data1 by mean, to mimic na.gam.replace:
> Ndata=data1
> Ndata$x2_1=ifelse(data1$x1=="1",data1$x2,mean(data1$x2_1,na.rm=TRUE))
> Ndata$x2_2=ifelse(data1$x1=="2",data1$x2,mean(data1$x2_2,na.rm=TRUE))
> Ndata$x2_3=ifelse(data1$x1=="3",data1$x2,mean(data1$x2_3,na.rm=TRUE))
>
> predict(model,Ndata)-predict(model) #as you can see there is a systematic
> biais
>
> #correct way to predict (=returned 0 for terms with NA value):
> p=predict(model,data1,type="term")
> rowSums(cbind(p,attr(p,"constant")),na.rm=TRUE)-predict(model)
>
> #alternative solution, 0 instead of NA
> data1$v1=ifelse(data1$x1=="1",data1$x2,0)
> data1$v2=ifelse(data1$x1=="2",data1$x2,0)
> data1$v3=ifelse(data1$x1=="3",data1$x2,0)
>
> model1=gam(y~s(v1)+s(v2)+s(v3),data=data1)
> summary(model1)
> points(data1$x2,predict(model1,data1),col="green",pch="X")
> #no particular problem with predict function
>
> #what's happened in x2=0 ?
> predict(model)[data1$x2==0]
> predict(model1)[data1$x2==0]
>
> ########################
> #end of script
> ########################
>
> thanks in advance
> best regards
> Laurent Beaulaton
>
> ---------------------------------------------
> Laurent Beaulaton
> ###############################
> # NEW !!!!                                       #
> #  http://www.laurent-beaulaton.fr/    #
> # Tel + 33 (0)5 57 89 27 17             #
> ###############################
> ---------------------------------------------
> Cemagref (French Institute of Agricultural and Environmental Engineering
> Research ) Unité "Ecosystèmes estuariens et poissons migrateurs
> amphihalins"
> (anciennement Unité "Ressources aquatiques continentales")
> 50 avenue de Verdun
> F 33612 Cestas Cedex
>
> Tel + 33 (0)5 57 89 27 17
> Fax + 33 (0)5 57 89 08 01
> mailto:laurent.beaulaton at bordeaux.cemagref.fr
>
> http://www.laurent-beaulaton.fr/
> http://www.bordeaux.cemagref.fr/rabx/
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented, minimal,
> self-contained, reproducible code.



More information about the R-help mailing list