[R] gam()
Henric Nilsson
henric.nilsson at statisticon.se
Wed Jun 4 17:01:05 CEST 2003
Dear all,
I've now spent a couple of days trying to learn R and, in particular, the
gam() function, and I now have a few questions and reflections regarding
the latter. Maybe these things are already implemented in some way that I'm
not yet aware of, or perhaps the R community has decided that they are not
wanted. Of course, my lack of complete theoretical understanding of what
mgcv really does may also show...
1. When fitting models where a factor interacts with a smooth term, say
y~a+s(x,by=a.1)+s(x,by=a.2), I noticed that the rug in the plot of each of
the smooth terms is identical. I expected the rug in the plot of e.g.
s(x,by=a.1) to only include those x for which a.1=1 to be able to judge if
observations of x where a.1=1 are sparse in any region. Also, it would be
really nice if the "by=..." were included in the output of plot.gam() and
in the "Approximate significance of smooth terms:" part of summary.gam().
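To make point 1 concrete, here is a minimal sketch of the kind of plot I have
in mind. The simulated data and all variable names are made up for
illustration, and it assumes an mgcv version in which "by" accepts a factor
and plot.gam() has the "select" and "rug" arguments; with 0/1 dummies such as
a.1 the same idea would apply. The default rug is switched off and a rug of
only the x values observed at the relevant level is drawn instead.

```r
library(mgcv)

set.seed(1)
n <- 200
a <- factor(rep(c("one", "two"), each = n / 2))
# Simulated data (hypothetical): x is never observed above 0.5 at level "one"
x <- c(runif(n / 2, 0, 0.5), runif(n / 2, 0, 1))
y <- ifelse(a == "one", sin(2 * pi * x), x^2) + rnorm(n, sd = 0.3)

# One smooth of x per level of a (assumes "by" may be a factor)
fit <- gam(y ~ a + s(x, by = a))

op <- par(mfrow = c(1, 2))
for (i in seq_along(levels(a))) {
  lev <- levels(a)[i]
  # Suppress the default rug, which uses every observed x
  plot(fit, select = i, rug = FALSE, main = paste("s(x), a =", lev))
  # Draw a rug of only the x values observed at this level
  rug(x[a == lev])
}
par(op)
```

In the left panel the rug then stops at 0.5, which is exactly the sparseness
I would like to be able to see.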
2. John Fox has modified anova.glm() into anova.gam()
(http://www.socsci.mcmaster.ca/jfox/Books/Companion/nonparametric-regression.txt)
for comparison of two or more fitted models based on the difference between
residual deviances. Indiscriminate use of such a procedure shouldn't
perhaps be encouraged, but I think that many users expect it to be part of
the mgcv package since this model selection idea is covered in several
texts and also implemented in S-plus (and may be OK for truly nested
models). And even if it's been decided that this functionality is not
wanted in mgcv, perhaps another function comparing several models by the
GCV/UBRE score and other useful statistics can be implemented?
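As a rough sketch of the kind of comparison table I mean in point 2 (not John
Fox's anova.gam(), just a hand-rolled illustration): the data are simulated,
the model names are made up, and I am assuming that the fitted object stores
its GCV/UBRE score in the component "gcv.ubre". The chi-squared test on the
deviance difference is only approximate and only sensible for nested models.

```r
library(mgcv)

set.seed(2)
n <- 300
x1 <- runif(n)
x2 <- runif(n)
# Simulated Poisson response (hypothetical): x2 has no real effect
y <- rpois(n, exp(0.5 + sin(2 * pi * x1)))

m0 <- gam(y ~ s(x1), family = poisson)
m1 <- gam(y ~ s(x1) + s(x2), family = poisson)

# Side-by-side summary of the fitted models
models <- list(m0 = m0, m1 = m1)
comparison <- data.frame(
  resid.df = sapply(models, df.residual),
  deviance = sapply(models, deviance),
  score    = sapply(models, function(m) m$gcv.ubre),  # GCV/UBRE score
  AIC      = sapply(models, AIC)
)
print(comparison)

# Approximate test based on the difference in residual deviance
dev.diff <- deviance(m0) - deviance(m1)
df.diff  <- df.residual(m0) - df.residual(m1)
p.value  <- if (df.diff > 0) {
  pchisq(dev.diff, df = df.diff, lower.tail = FALSE)
} else {
  NA  # the effective degrees of freedom did not increase
}
print(p.value)
```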
3. Some authors [1, 2] suggest pointwise estimation of odds ratios and
corresponding confidence intervals based on the smooth terms in a GAM.
Maybe something for mgcv?
[1] Figueiras, A. & Cadarso-Suárez C. (2001) "Application of Nonparametric
Models for calculating Odds Ratios and Their Confidence Intervals for
Continuous Exposures", American Journal of Epidemiology, 154(3), 264-275.
[2] Saez, M., Cadarso-Suárez C. & Figueiras, A. (2003) "np.OR: an S-Plus
function for pointwise nonparametric estimation of odds-ratios of
continuous predictors", Computer Methods and Programs in Biomedicine, 71,
175-179.
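To show roughly what I mean in point 3, here is a sketch of one way such
pointwise odds ratios could be computed from a fitted logistic GAM. It is not
necessarily the method of [1] or [2]. The data are simulated, the reference
value x.ref is arbitrary, and I am assuming that predict() accepts
type="lpmatrix" and that vcov() works on the fitted object. The odds ratio at
x relative to x.ref is exp(f(x) - f(x.ref)), and its standard error follows
from the covariance matrix of the coefficients.

```r
library(mgcv)

set.seed(3)
n <- 500
x <- runif(n, 0, 10)
# Simulated binary outcome (hypothetical)
y <- rbinom(n, 1, plogis(-1 + 0.3 * sin(x) + 0.1 * x))

fit <- gam(y ~ s(x), family = binomial)

x.ref <- 5  # arbitrary reference exposure, chosen for illustration
grid  <- data.frame(x = seq(0, 10, length.out = 51))

# Matrices mapping the coefficients to the linear predictor
Xp   <- predict(fit, newdata = grid, type = "lpmatrix")
Xref <- predict(fit, newdata = data.frame(x = x.ref), type = "lpmatrix")

# Each row of D is X(x) - X(x.ref), so D %*% coef gives the log odds ratio
D      <- sweep(Xp, 2, Xref[1, ])
log.or <- drop(D %*% coef(fit))
se     <- sqrt(rowSums((D %*% vcov(fit)) * D))

or <- data.frame(
  x     = grid$x,
  OR    = exp(log.or),
  lower = exp(log.or - 1.96 * se),  # approximate pointwise 95% limits
  upper = exp(log.or + 1.96 * se)
)
head(or)
```

By construction the odds ratio equals 1 at x.ref, and the interval collapses
to a point there.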
4. For each purely parametric covariate a t-test is produced; I'd like to
have something like S-plus' anova.gam() to get an overall test. (Perhaps
with the addition of a choice between Type I and Type III tests, but I
guess that may be controversial). Is it possible?
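As an illustration of the kind of overall test I am asking about in point 4,
here is a hand-rolled Wald test for all coefficients of one factor. The data
and the factor name "grp" are made up, I am assuming coef() and vcov() work
on the fitted object, and the chi-squared reference distribution ignores the
fact that the scale parameter is estimated, so this is only approximate.

```r
library(mgcv)

set.seed(4)
n <- 300
grp <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
x <- runif(n)
# Simulated response (hypothetical) with a clear effect of grp
shift <- unname(c(a = 0, b = 0.5, c = 1)[as.character(grp)])
y <- shift + sin(2 * pi * x) + rnorm(n, sd = 0.5)

fit <- gam(y ~ grp + s(x))

# Coefficients belonging to the factor (treatment contrasts: grpb, grpc)
idx <- grep("^grp", names(coef(fit)))
b <- coef(fit)[idx]
V <- vcov(fit)[idx, idx]

# Wald statistic b' V^{-1} b, tested jointly rather than one t-test each
W <- drop(t(b) %*% solve(V, b))
p.value <- pchisq(W, df = length(b), lower.tail = FALSE)
print(c(W = W, df = length(b), p = p.value))
```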
//Henric
---------------------------------------------------------------------------------------
Henric Nilsson, Statistician
Statisticon AB, Östra Ågatan 31, SE-753 22 UPPSALA
Phone (Direct): +46 (0)18 18 22 37
Mobile: +46 (0)70 211 68 36
Fax: +46 (0)18 18 22 33
<http://www.statisticon.se>