[R] predict.lm - standard error of predicted means?
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Jul 20 18:09:03 CEST 2005
kehler at mathstat.dal.ca writes:
> Simple question.
>
> For a simple linear regression, I obtained the "standard error of
> predicted means", for both a confidence and prediction interval:
>
> x<-1:15
> y<-x + rnorm(n=15)
> model<-lm(y~x)
> predict.lm(model,newdata=data.frame(x=c(10,20)),se.fit=TRUE,interval="confidence")$se.fit
>         1         2
> 0.2708064 0.7254615
>
> predict.lm(model,newdata=data.frame(x=c(10,20)),se.fit=TRUE,interval="prediction")$se.fit
>         1         2
> 0.2708064 0.7254615
>
>
> I was surprised to find that the standard errors returned were in fact the
> standard errors of the sampling distribution of Y_hat:
>
> sqrt(MSE * (1/n + (x - x_bar)^2/SS_x)),
>
> not the standard errors of Y_new (predicted value):
>
> sqrt(MSE * (1 + 1/n + (x - x_bar)^2/SS_x)).
>
> Is there a reason this quantity is called the "standard error of predicted
> means" if it doesn't relate to the prediction distribution?
Yes. Yhat is the predicted mean, and se.fit is its (estimated) standard
deviation. It does not change its meaning just because you ask for a
different kind of interval.
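
To make this concrete, here is a minimal sketch, not a definitive recipe. It
reuses the model from the question; the seed and the names new, conf, pred,
se_mean and se_new are my own choices for illustration. It checks that se.fit
is the same for both interval types, that it matches the hand formula for the
standard error of Yhat, and that the standard error for a new observation can
be rebuilt from se.fit and residual.scale.

```r
# se.fit is the standard error of the fitted mean, whatever `interval` is.
set.seed(1)                       # assumed seed, only for reproducibility
x <- 1:15
y <- x + rnorm(n = 15)
model <- lm(y ~ x)
new <- data.frame(x = c(10, 20))

conf <- predict(model, newdata = new, se.fit = TRUE, interval = "confidence")
pred <- predict(model, newdata = new, se.fit = TRUE, interval = "prediction")

# Same se.fit from both calls
same_se <- isTRUE(all.equal(conf$se.fit, pred$se.fit))

# Hand computation: sqrt(MSE * (1/n + (x - x_bar)^2 / SS_x))
n    <- length(x)
mse  <- sum(residuals(model)^2) / (n - 2)
ss_x <- sum((x - mean(x))^2)
se_mean <- sqrt(mse * (1 / n + (new$x - mean(x))^2 / ss_x))
matches_formula <- isTRUE(all.equal(unname(conf$se.fit), se_mean))

# Standard error for a single new observation:
# sqrt(MSE * (1 + 1/n + ...)) = sqrt(residual.scale^2 + se.fit^2)
se_new <- sqrt(conf$residual.scale^2 + conf$se.fit^2)

# The prediction interval half-width is built from se_new, not from se.fit
half_width <- qt(0.975, conf$df) * se_new
matches_interval <- isTRUE(all.equal(
  unname(pred$fit[, "upr"] - pred$fit[, "fit"]), unname(half_width)))

print(c(same_se = same_se, matches_formula = matches_formula,
        matches_interval = matches_interval))
```

So if you want the second quantity in the question, it is available as
sqrt(se.fit^2 + residual.scale^2) from either call.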
> Turning to Neter et al.'s Applied Linear Statistical Models, I note that
> if we have multiple observations, then the standard error of the mean of
> the predicted value:
>
> sqrt(MSE(1/m + 1/n + (x-x_bar)^2/SS_x)),
>
> reverts to the standard error of the sampling distribution of Y-hat, as m,
> the number of samples, gets large. Still, this doesn't explain the result
> for small sample sizes.
Exactly the same considerations apply to the standard errors of, and
about, an estimated mean: sigma*sqrt(1 + 1/n) for a single new
observation vs. sigma*sqrt(1/m + 1/n) for the mean of m new observations
vs. sigma*sqrt(1/n) for the estimated mean itself. The SEM is still the
last of these, even if you are interested in another kind of prediction
limit.
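
As a rough numerical illustration of those three quantities (a sketch only:
the simulated data, the seed, the values of n and m, and all variable names
are assumptions of mine, and the sample standard deviation s stands in for
sigma):

```r
set.seed(2)                        # assumed seed, only for reproducibility
n <- 20                            # size of the observed sample
m <- 5                             # hypothetical number of future observations
y <- rnorm(n, mean = 10, sd = 2)
s <- sd(y)                         # estimate of sigma

se_new_one <- s * sqrt(1 + 1 / n)      # one new observation
se_new_m   <- s * sqrt(1 / m + 1 / n)  # mean of m new observations
sem        <- s * sqrt(1 / n)          # the estimated mean itself (SEM)

# An intercept-only regression reports the SEM as the standard error of
# its coefficient, regardless of what you later want to predict.
fit0   <- lm(y ~ 1)
sem_lm <- summary(fit0)$coefficients["(Intercept)", "Std. Error"]

print(c(se_new_one = se_new_one, se_new_m = se_new_m,
        sem = sem, sem_lm = sem_lm))
```

As m grows, se_new_m shrinks towards sem, but sem itself never depends on m.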
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907