[R] Safe prediction does not work for bivariate polynomial terms?
John Fox
jfox at mcmaster.ca
Sat Jan 25 16:18:25 CET 2014
Dear Brian and Xing,
Sandy Weisberg and I coincidentally encountered this issue yesterday, and I intended to send a message about it to the r-help list. We thought initially that the functions we were working on were bugged, but eventually discovered that polym() wasn't saving basis information for safe prediction with data-dependent polynomial bases when we examined the objects that it produced.
I think that the inability of polym() to participate in safe predictions is a kind of trap for users, which could be avoided by mentioning the issue more directly in the help page for poly() and polym(). It's true that at present the help page states that only in the univariate case does poly() return an object that includes "the centering and normalization constants used in constructing the orthogonal polynomials." So a more careful reading of the help would have saved us some trouble.
Although it would be desirable for polym() to implement safe predictions, it might not be worth the effort, and a reasonable work-around is to set raw=TRUE. Though this will make the regression coefficients of the polynomial less numerically stable, it should not, I think, adversely affect predictions as long as the x's are within the configuration of the data to which the model is fit.
Best,
John
--------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox
On Fri, 24 Jan 2014 13:26:04 +0000
Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On 24/01/2014 02:09, Xing Zhao wrote:
> > Hi everyone,
> >
> > R documents says the safe prediction is implemented, when basis
> > functions are used, such as poly(), bs(), ns()
> >
> > This works for univariate basis, but fails in my bivariate polynomial setting.
> > Can anyone explain the reason?
>
> Because there is a makepredictcall() method for class "poly", and bivariate polynomials are not of that class (see ?poly). The documentation only says safe prediction is available for 'polynomial' fits, and 'bivariate polynomial' is not conventionally included in 'polynomial'.
>
> Your call is really to polym(), not to poly(), and it may be better to call polym() explicitly to remind yourself.
>
> >
> >
> > The following is a small example.
> >
> > set.seed(731)
> > x<-runif(300)
> > y<-runif(300)
> >
> > f <- function(x, y) { 0.5+2*x+13*x^2-14*y^2+12*x*y+y }
> >
> > z <- f(x,y)+rnorm(length(x))*0.2
> >
> > # default orthogonal polynomials basis
> > mod <- lm (z ~ poly(x,y,degree = 2))
> >
> > # raw polynomials basis
> > mod1 <- lm (z ~ poly(x,y,degree = 2, raw = T))
> >
> > # data points to evaluate, just the first five data
> > new <- data.frame(x=x[1:5],y= y[1:5])
> >
> > z[1:5]
> > [1] 9.796620 10.397851 1.280832 4.028284 4.811709
> >
> > # two predicted values differ dramatically, and the orthogonal
> > polynomials basis fails
> > predict(mod, new)
> > 1 2 3 4 5
> > 121.46776 40.85002 18.67273 31.82417 20.81673
> > predict(mod1, new)
> > 1 2 3 4 5
> > 9.981091 10.418628 1.223148 4.031664 4.837099
> >
> > # However, the fitted.values are the same
> > mod$fitted.values[1:5]
> > 1 2 3 4 5
> > 9.981091 10.418628 1.223148 4.031664 4.837099
> > mod1$fitted.values[1:5]
> > 1 2 3 4 5
> > 9.981091 10.418628 1.223148 4.031664 4.837099
> >
> >
> > Thanks in advance
> > Xing
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list