[R] Unexpected behavior of predict and interval="confidence"

Kevin Middleton kmm at mid-town.org
Sun Oct 29 18:54:40 CET 2006


Based on some recent r-help discussions, I have been trying out  
plotting confidence intervals using predict and matplot. Matplot  
appeared to not be plotting the linear regression when using the  
default column names generated by read.table (V1, V2, etc). On  
further inspection, the error seemed to be with predict and vector  
names (V1, V2) rather than with matplot. I was using some textbook  
data sets that were read with read.table, so my data.frame columns  
were named by default. The problem seems to be related to the name of  
the vector only (though I may be wrong).

The example below, based on that in ?predict.lm, better describes the  
problem. Using predict with V2~V1 & y~x produces identical output  
(when x<-V1 and y<-V2). However, using predict with  
interval="confidence" results in different output from the same data.  
That with y~x is correct, and V2~V1 is "incorrect".

This may be related to a previous post on r-help:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/56982.html

I can't figure out why there would be a difference in predict when  
seemingly only the vector name changes. Granted, I could just read  
the data.frame in as "x" and "y".

Thanks
Kevin

######################

set.seed(10)

# For example purposes, plot side by side
par(mfrow=c(1,2))

V1 <- rnorm(15)
V2 <- V1 + rnorm(15)

new <- data.frame(x = seq(min(V1), max(V1), length.out = length(V1)))

pred.w.clim <- predict(lm(V2 ~ V1), new, interval="confidence")
matplot(new$x, pred.w.clim,
         lty=c(1,2,2), type="l",
         col=c("black", "red", "red"),
         ylab="predicted y")
points(V1,V2)


# Create x & y equal to V1 & V2
x <- V1
y <- V2

pred.w.clim2 <- predict(lm(y ~ x), new, interval="confidence")
matplot(new$x, pred.w.clim2,
         lty=c(1,2,2), type="l",
         col=c("black", "red", "red"),
         ylab="predicted y")
points(x,y)

# Test if V1=x, V2=y
all.equal(x,V1)
all.equal(y,V2)

# Same output with predict
predict(lm(V2 ~ V1))
predict(lm(y ~ x))
all.equal(predict(lm(V2 ~ V1)), predict(lm(y ~ x)))

# Different output with interval="confidence"
pred.w.clim
pred.w.clim2
all.equal(pred.w.clim, pred.w.clim2)
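
A possible explanation, offered as a sketch rather than a confirmed diagnosis: predict.lm matches the columns of newdata to the model terms by name. The data frame "new" above has a column called x, so it matches y ~ x but not V2 ~ V1. For lm(V2 ~ V1), V1 is then presumably taken from the workspace, and because "new" has the same number of rows as V1, no warning is raised. The names fit, grid, new_x, new_V1, p_x, p_V1 and p_orig below are introduced only for this illustration.

```r
set.seed(10)
V1 <- rnorm(15)
V2 <- V1 + rnorm(15)
fit <- lm(V2 ~ V1)

# Evenly spaced grid over the range of V1, as in the example above
grid <- seq(min(V1), max(V1), length.out = length(V1))

# Column named "x": no model term is called x, so V1 is looked up
# outside newdata and the intervals are computed at the original V1 values
new_x <- data.frame(x = grid)
p_x <- predict(fit, new_x, interval = "confidence")

# Column named "V1": matches the model term, so the grid is used
new_V1 <- data.frame(V1 = grid)
p_V1 <- predict(fit, new_V1, interval = "confidence")

# Intervals at the original data points, with no newdata at all
p_orig <- predict(fit, interval = "confidence")

# Compare values only, ignoring row names
all.equal(p_x, p_orig, check.attributes = FALSE)
all.equal(p_x, p_V1, check.attributes = FALSE)
```

If this reading is right, naming the newdata column after the predictor in the formula (V1 here) should make the V2 ~ V1 plot agree with the y ~ x plot.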


