[R] problem with predict() for gam() models
William M. Grove
William.M.Grove-1 at tc.umn.edu
Sat Jun 7 04:16:33 CEST 2003
I run the following code in R 1.6.2 on Windows:
xxx <- rnorm(100)
yyy <- .5 * rnorm(100) + sqrt(1-.5^2) * rnorm(100)
ord <- order(xxx)
xxx <- xxx[ord] # for
yyy <- yyy[ord] # convenience in reading printout
rm(ord)
reg.gam <- gam(yyy ~ s(xxx, k=8))
f <- function(x, reg.gam, target.y) {
cat("inside f() called by optimize():\n")
cat("arg x=", x, "\n")
cat("arg target.y=", target.y, "\n", sep="")
pred.y <- predict(reg.gam, newdata=data.frame(constant=1, xxx=x))
cat("predicted y=\n")
print(pred.y)
sq.diff <- (pred.y - target.y)^2
cat("returned value=\n")
print(sq.diff)
return( sq.diff )
}
temp <- optimize(f, interval=c(xxx[1], xxx[100]), reg.gam=reg.gam,
target.y=mean(range(yyy)))
as a try-out of some code. I need to interpolate in the output of a gam()
model, to find a certain x-value corresponding to a chosen y-value. I
thought it would be better to interpolate using predict() from the gam()
model, rather than use approx() since I don't expect linearity, or anything
that's necessarily very close to it, in the part of the regression where
I'm going to need to interpolate. Since the regression itself is on
splines, I didn't think I needed to call spline interpolation; I thought
In my real-life application the regression isn't linear, like the one I
constructed here (it looks somwhat, but not quite enough, like a logistic
regression), but I thought it would simplify my example to construct do
things this way.
Alas, with these data (or my real, nonlinear, data) I get an error from
optimize() (either with these fake data, or with my real data) because f()
delivers a vector-valued result. The debugging print statements inside f()
show that sometimes (but not nearly always) even though a scalar argument x
is submitted and put into the newdata to predict(), I get a vector result
pred.y[]. Note: most of the time, I get a scalar valued result, but often
enough I get a vector output from f(). I don't at this time see what
differentiates arguments that yield scalar result for f(), from ones that
yield a vector value for f(). x the argument to f() is always scalar.
The problem occurs whether I use newdata=data.frame(xxx=x) (i.e., minus the
"constant=1" part, or newdata as shown, as an argument to predict().
Can anyone help me understand why this is happening, and how to ensure that
the result of f() is scalar, so I get the interpolation I want, instead of
having optimize() blow up?
Thanks in advance for any advice. Regards,
More information about the R-help
mailing list