[R] mgcv, should include an intercept for the 'by' varying coefficient model, which is unconstrained
Xing Zhao
zhaoxing at uw.edu
Tue Mar 18 00:39:02 CET 2014
Dear Dr. Wood and other mgcv experts,
?gam.models says that a smooth with a numeric "by" variable is generally not
subject to an identifiability constraint. Using the example from ?gam.models,
I fitted the model with and without an intercept and found that the two fits
differ (code below).
I think the problem might become serious when several varying
coefficient terms are specified in one model, such as gam(y ~
s(x0,by=x1) + s(x0,by=x2) + s(x0,by=x3),data=dat). In this case, none of
the three terms is constrained, as they generally will not meet
the three conditions for a constraint to be applied.
I can still fit it as gam(y ~ s(x0,by=x1) + s(x0,by=x2) +
s(x0,by=x3),data=dat), but is it safe? Is there a recommended way to
specify the model?
Thank you for your help
Best,
Xing
require(mgcv)
set.seed(10)
## simulate data from y = f(x2)*x1 + error
dat <- gamSim(3,n=400)
b<-gam(y ~ s(x2,by=x1),data=dat)
b1<-gam(y ~ s(x2,by=x1)-1,data=dat)
> range(fitted(b)-fitted(b1))
[1] -0.13027648 0.08117196
> summary(dat$f-fitted(b))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5265  0.2628  1.2290  1.7710  2.6280  8.8580
> summary(dat$f-fitted(b1))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.4618  0.2785  1.2250  1.7390  2.5370  8.7310
> summary(dat$y-fitted(b))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-6.23500 -1.32700 -0.06752  0.00000  1.54900  7.01800
> summary(dat$y-fitted(b1))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-6.26700 -1.40300 -0.09908 -0.03199  1.51900  6.96700
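For the three-term case, the same comparison could be sketched as follows. This is only an illustration under stated assumptions: the data are simulated here, and the three coefficient functions, the noise level, and the object names (m_int, m_noint, n_coef) are made up for the example; they do not come from ?gam.models. The sketch fits the model with and without an intercept, counts the coefficients of each smooth as one way to see whether a constraint was absorbed, and compares the two fits.

```r
library(mgcv)
set.seed(1)

## Hypothetical data: y = f1(x0)*x1 + f2(x0)*x2 + f3(x0)*x3 + error
## (f1, f2, f3 and the noise sd are assumptions for illustration only)
n <- 400
dat <- data.frame(x0 = runif(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- with(dat, sin(2 * pi * x0) * x1 + exp(x0) * x2 + x0^2 * x3 +
                   rnorm(n, sd = 0.5))

## Same formula as in the question, with and without an intercept
m_int   <- gam(y ~ s(x0, by = x1) + s(x0, by = x2) + s(x0, by = x3),
               data = dat, method = "REML")
m_noint <- gam(y ~ s(x0, by = x1) + s(x0, by = x2) + s(x0, by = x3) - 1,
               data = dat, method = "REML")

## Number of coefficients per smooth; a value equal to the basis dimension k
## suggests that no identifiability constraint was absorbed into that term
n_coef <- sapply(m_int$smooth, function(sm) sm$last.para - sm$first.para + 1)
print(n_coef)

## Estimated intercept with its standard error, and the difference in fits
print(summary(m_int)$p.table)
print(range(fitted(m_int) - fitted(m_noint)))
print(AIC(m_int, m_noint))
```

If the intercept estimate is small relative to its standard error and the fitted values barely differ, the choice matters little for that data set; this does not by itself settle which specification is preferable in general.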