[R] strange behavior of loess() & predict()

Tue Dec 6 18:09:11 CET 2005

Dear all,

I tried local regression with the following data. These data are part 
of a bigger dataset for which loess() causes no problem.
However, the plot shows extreme values, and inspecting the fit 
reveals very extreme fitted values (up to 20000!), although the original data are

 > summary(cbind(x,y))
       x               y       
 Min.   :1.800   Min.   :2.000 
 1st Qu.:2.550   1st Qu.:2.750 
 Median :2.800   Median :3.000 
 Mean   :2.779   Mean   :3.093 
 3rd Qu.:3.050   3rd Qu.:3.450 
 Max.   :4.000   Max.   :4.000 
 >

As you can see below, the difference lies in these two lines:

predict(mod, data.frame(x=X), se=TRUE)   # strange values
predict(mod, x=X, se=TRUE)                     # plausible values

What is the difference between calling predict() with
data.frame(x=X) and with "just" x=X?

Here are the data and the R code. The problem can be reproduced.

<--- snip --->

# data
x <- c(3.4,2.8,2.6,2.2,2.0,2.8,2.6,2.6,2.8,4.0,2.4,2.8,3.0,3.6,3.2,2.8,3.2,2.4,2.2,1.8,2.8,2.0,3.6,2.6,2.8,3.2,3.0,2.6)
y <- c(3.0,2.6,2.8,2.6,3.0,4.0,3.6,2.4,3.0,4.0,2.4,3.4,3.0,3.2,2.8,3.4,3.4,3.8,3.8,3.6,3.2,2.4,3.8,3.0,3.0,2.0,2.6,2.8)

par(mfrow=c(2,1))

# normal plot
plot(x,y)
lines(lowess(x,y))

# loess part
mod <- loess(y ~ x, span=.5, degree=1)
X <- seq(min(x), max(x), length.out=50)
fit <- predict(mod, data.frame(x=X), se=TRUE)
zv <- qnorm((1 + .95)/2)   # two-sided 95% normal quantile
lower <- fit$fit - zv*fit$se.fit
upper <- fit$fit + zv*fit$se.fit
plot(x, y, ylim=range(y, lower, upper))
lines(X, fit$fit)

# strange values in fit
fit

# here is the difference!!
predict(mod, data.frame(x=X), se=TRUE)
predict(mod, x=X, se=TRUE)

<--- end of snip --->
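
One thing that may be worth checking is how the two calls match their arguments. As far as I can tell, predict() for a loess fit takes the new data in an argument called newdata, so in the second call x=X would not be matched to it and would end up in "..." instead. The sketch below illustrates this on the built-in cars data (an assumption for illustration, not the data above); mod2, grid, p_grid and p_dots are made-up names. It would only account for why the two calls differ, not for the extreme values in the first one.

```r
# Illustrative fit on the built-in 'cars' data (50 rows), not the data of this post
mod2 <- loess(dist ~ speed, data = cars)
grid <- seq(min(cars$speed), max(cars$speed), length.out = 20)

# New data passed as 'newdata': one prediction per grid point
p_grid <- predict(mod2, newdata = data.frame(speed = grid))

# 'speed = grid' does not match 'newdata'; it falls into '...' and is ignored,
# so predict() returns the fitted values at the original observations
p_dots <- predict(mod2, speed = grid)

length(p_grid)                     # 20
length(p_dots)                     # 50, i.e. nrow(cars)
all.equal(p_dots, fitted(mod2))    # TRUE
```

If this carries over to the example above, predict(mod, x=X, se=TRUE) would give 28 values at the observed x rather than 50 values on the grid X.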

I assume there is a reason for this, but I do not understand it.
Thanks,

Best regards

leo gürtler