[R] strange behavior of loess() & predict()
Leo Gürtler
leog at anicca-vijja.de
Tue Dec 6 18:09:11 CET 2005
Dear all,
I tried local regression on the following data. They are a subset of a
larger dataset for which loess() causes no problem.
However, the plot shows extreme values, and inspecting the fit reveals
fitted values of up to 20000, although the original data lie between 1.8 and 4.0:
> summary(cbind(x,y))
       x               y
 Min.   :1.800   Min.   :2.000
 1st Qu.:2.550   1st Qu.:2.750
 Median :2.800   Median :3.000
 Mean   :2.779   Mean   :3.093
 3rd Qu.:3.050   3rd Qu.:3.450
 Max.   :4.000   Max.   :4.000
As you can see below, the two calls differ only in how X is passed:
predict(mod, data.frame(x=X), se=TRUE) # strange values
predict(mod, x=X, se=TRUE) # plausible values
What is the difference between calling predict() with
data.frame(x=X) and with "just" x=X?
Here are the data and the R code; the problem can be reproduced.
<--- snip --->
# data
x <- c(3.4,2.8,2.6,2.2,2.0,2.8,2.6,2.6,2.8,4.0,2.4,2.8,3.0,3.6,3.2,2.8,3.2,2.4,2.2,1.8,2.8,2.0,3.6,2.6,2.8,3.2,3.0,2.6)
y <- c(3.0,2.6,2.8,2.6,3.0,4.0,3.6,2.4,3.0,4.0,2.4,3.4,3.0,3.2,2.8,3.4,3.4,3.8,3.8,3.6,3.2,2.4,3.8,3.0,3.0,2.0,2.6,2.8)
par(mfrow=c(2,1))
# normal plot
plot(x,y)
lines(lowess(x,y))
# loess part
mod <- loess(y ~ x, span=.5, degree=1)
X <- seq(min(x), max(x), length.out=50)
fit <- predict(mod, data.frame(x=X), se=TRUE)
zv <- qnorm((1 + .95)/2)
lower <- fit$fit - zv*fit$se.fit
upper <- fit$fit + zv*fit$se.fit
plot(x, y, ylim=range(y, lower, upper))
lines(X, fit$fit)
# strange values in fit
fit
# here is the difference!!
predict(mod, data.frame(x=X), se=TRUE)
predict(mod, x=X, se=TRUE)
<--- end of snip --->
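The difference between the two calls can be checked directly. The sketch below assumes the documented arguments of predict.loess(), which begin with object, newdata and se and end with "..."; the names p_new and p_old are introduced only for this illustration. Under that assumption, x=X matches none of the named arguments, so newdata stays NULL and the predictions are made at the original observations rather than at X.

```r
# Data copied from the snippet above.
x <- c(3.4,2.8,2.6,2.2,2.0,2.8,2.6,2.6,2.8,4.0,2.4,2.8,3.0,3.6,3.2,2.8,3.2,2.4,
       2.2,1.8,2.8,2.0,3.6,2.6,2.8,3.2,3.0,2.6)
y <- c(3.0,2.6,2.8,2.6,3.0,4.0,3.6,2.4,3.0,4.0,2.4,3.4,3.0,3.2,2.8,3.4,3.4,3.8,
       3.8,3.6,3.2,2.4,3.8,3.0,3.0,2.0,2.6,2.8)

mod <- loess(y ~ x, span = 0.5, degree = 1)
X <- seq(min(x), max(x), length.out = 50)

# The data frame is matched positionally to `newdata`,
# so the fit is evaluated at the 50 grid points in X.
p_new <- predict(mod, data.frame(x = X), se = TRUE)

# `x = X` is not a named argument of predict.loess() and ends up in `...`;
# `newdata` remains NULL, so the fit is evaluated at the original x values.
p_old <- predict(mod, x = X, se = TRUE)

length(p_new$fit)  # one value per element of X
length(p_old$fit)  # one value per original observation
```

If the two lengths differ in this way, the "plausible" values are simply the fit at the observed data, and only the first call actually predicts on the grid X.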
I assume there is a reason for this, but I do not understand it.
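One possible contributing factor, offered as a guess rather than a diagnosis, is the number of tied x values. The loess() help page says the local fit uses the proportion span of the points, weighted by the tricube function (1 - (dist/maxdist)^3)^3. The sketch below recomputes those weights by hand at x0 = 2.8. The rounding used for the neighbourhood size q and the names x0, d, h and w are assumptions made for illustration, not loess() internals.

```r
# x values copied from the snippet above.
x <- c(3.4,2.8,2.6,2.2,2.0,2.8,2.6,2.6,2.8,4.0,2.4,2.8,3.0,3.6,3.2,2.8,3.2,2.4,
       2.2,1.8,2.8,2.0,3.6,2.6,2.8,3.2,3.0,2.6)

table(x)                        # how often each distinct x value occurs

span <- 0.5
q  <- floor(length(x) * span)   # assumed neighbourhood size (rounding is a guess)
x0 <- 2.8                       # the most frequent x value
d  <- abs(x - x0)               # distance of every observation from x0
h  <- sort(d)[q]                # distance to the q-th nearest observation
w  <- (1 - pmin(d / h, 1)^3)^3  # tricube weights; zero at and beyond distance h

sum(w > 1e-8)                   # observations with non-negligible weight
unique(x[w > 1e-8])             # the distinct x values they cover
```

If the observations with non-negligible weight all share a single x value, a local line (degree=1) has no determined slope at that point, which might be related to the extreme values seen when the fit is evaluated between the observed x values.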
Thanks,
best regards
leo gürtler