[R] predicted values in mgcv gam

Wed Mar 8 11:49:28 CET 2006

Hi Denis,

Your first plot is of f(x) against x where \sum_i f(x_i) = 0 (x_i are
observed x's).

Your second plot is of \exp(\alpha + f(x)) against x where
\sum_i f(x_i)=0, and \alpha is an intercept parameter.

So the zero line on the first plot, corresponds to the \exp(\alpha) line
on the second plot (which is not the same as the mean of the response
data).

If you replace
> abline(h=mean.y,lty=5,col=grey(0.35))
by
> abline(h=coef(gam2)[1],lty=5,col=grey(0.35))
then everything should work.

best,
Simon

On Sun, 5 Mar 2006, Denis Chabot wrote:

> Hi,
>
> In fitting GAMs to assess environmental preferences, I use the part
> of the fit where the lower confidence interval is above zero as my
> criterion for positive association between the environmental variable
> and species abundance. However I like to plot this on the original
> scale of species abundance. To do so I extract the fit and SE using
> predict.gam.
>
> Lately I compared more carefully the plots I obtain in this way and
> those obtained with plot.gam and noticed differences which I do not
> understand.
>
> To avoid sending a large dataset I took an example from gam Help to
> illustrate this.
>
> Was I wrong to believe that the fit and its confidence band should
> behave the same way on both scales?
>
> Thanks in advance,
>
> Denis Chabot
> #######################
> library(mgcv)
> set.seed(0)
> n<-400
> sig<-2
> x0 <- runif(n, 0, 1)
> x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1)
> x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)
> f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
> f <- f0(x0) + f1(x1) + f2(x2)
> g<-exp(f/4)
> y<-rpois(rep(1,n),g)
> mean.y <- mean(y)
>
> gam2 <- gam(y~ s(x2), poisson)
>
> # to plot on the response scale
> val.for.pred <- data.frame(x2=seq(min(x2), max(x2), length.out=100))
> pred.2.resp <- predict.gam(gam2, val.for.pred ,type="response",
> se.fit=TRUE)
> lower.band <- pred.2.resp$fit - 2*pred.2.resp$se.fit
> upper.band <- pred.2.resp$fit + 2*pred.2.resp$se.fit
> pred.2.resp <- data.frame(val.for.pred, pred.2.resp, lower.band,
> upper.band)
>
> # same thing on term scale
> pred.2.term <- predict.gam(gam2, val.for.pred ,type="terms",
> se.fit=TRUE)
> lower.band <- pred.2.term$fit - 2*pred.2.term$se.fit
> upper.band <- pred.2.term$fit + 2*pred.2.term$se.fit
> pred.2.term <- data.frame(val.for.pred, pred.2.term, lower.band,
> upper.band)
>
> # it is easier to compare two plots instead of looking at these two
> data.frames
>
> plot(gam2, residuals=T, pch=1, cex=0.7)
> abline(h=0)
>
> plot(y~x2, col=grey(0.5))
> lines(fit~x2, col="blue", data=pred.2.resp)
> lines(lower.band~x2, col="red", lty=2, data=pred.2.resp)
> lines(upper.band~x2, col="red", lty=2, data=pred.2.resp)
> abline(h=mean.y,lty=5,col=grey(0.35))
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>