[Rd] Problems with predict.lm: incorrect SE estimate (PR#7772)

msa at biostat.mgh.harvard.edu msa at biostat.mgh.harvard.edu
Mon Apr 4 18:01:05 CEST 2005


Full_Name: Marek Ancukiewicz
Version: 2.01
OS: Linux
Submission from: (NULL) (132.183.12.87)


It seems that the the standard error of prediction of the linear regression,
caclulated with predict.lm is incorrect. Consider the following example where
the standard error is first calculated with predict.lm, then using delta
method. and finally, using the formula rms*sqrt(1+1/n+(xp-x0)^2/Sxx). 

Marek Ancukiewicz

> n <- 10
> x <- 1:n
> y <- x
> y[c(2,4,6)] <- y[c(2,4,6)] + 0.1
> y[c(3,5,7)] <- y[c(3,5,7)] - 0.1
> a <- lm(y~x)
> rms <- sqrt(sum(a$residuals^2)/(n-2))
> s <- covmat(a)*rms^2
> xp <- 3
> xm <- mean(x)
> summary(a)

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.10909 -0.07500  0.01091  0.06955  0.10182 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.020000   0.058621   0.341    0.742    
x           0.996364   0.009448 105.463  7.3e-14 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

Residual standard error: 0.08581 on 8 degrees of freedom
Multiple R-Squared: 0.9993,     Adjusted R-squared: 0.9992 
F-statistic: 1.112e+04 on 1 and 8 DF,  p-value: 7.3e-14 

> print(predict(a,new=data.frame(x=xp),se.fit=T))
$fit
[1] 3.009091

$se.fit
[1] 0.0359752

$df
[1] 8

$residual.scale
[1] 0.08581163

> print(se.delta.method <- sqrt(s[1,1]+xp^2*s[2,2]+2*xp*s[1,2] + rms^2))
[1] 0.09304758
> print(se.ss.formula <- rms*sqrt(1+1/n+(xp-xm)^2/sum((x-xm)^2)))
[1] 0.09304758



More information about the R-devel mailing list