[R] confidence interval of a average...
Robert W. Baer, Ph.D.
rbaer at atsu.edu
Thu Nov 25 03:05:37 CET 2004
> Sorry if this was not clear. This is more of a theoretical question
> rather than an R-coding question. I need to calculate
>
> "The predicted response and 95% prediction interval for a man of average
> height"
>
> So I need to predict the average response, which is easily done by taking
> the mean height and using the regression formula.
>
> However, "average height" has to be calculated from the sample, and thus I
> have confidence in that. Let's say the mean is 163cm, I think that I
> can't take the 163cm value and calculate the CI from just the sd of the
> lung capacity because that would be too narrow; I think covariance must
> come into it somehow, or can I just do a 97.5% CI on the height and take
> those extreme values and do a 97.5% CI on them?
Then you want the prediction interval on the mean VC, which is the tighter
of the two confidence intervals and does not include the extra variability
of VC about its mean. As always with confidence intervals, you are free to
look at either a 95% CI or a 97.5% CI, depending on what kind of statement
you'd like to make about your confidence. I do not understand your comment
about covariance at all.
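To make the difference between the two intervals concrete, here is a minimal sketch that computes both by hand at the mean height and compares them with predict(). The simulated data, the seed, and all object names (fit, se.mean, se.new, and so on) are assumptions made for illustration, not part of your data. It uses the standard simple-regression standard errors, so treat it as a check on the idea rather than a definitive recipe.

```r
# Hypothetical simulated data; seed, sample size, and sd are illustrative
set.seed(1)
height <- rnorm(50, mean = 163, sd = 10)
vc <- 0.03 * height + 0.5 * rnorm(50)
fit <- lm(vc ~ height)

n <- length(height)
s <- summary(fit)$sigma          # residual standard error
tcrit <- qt(0.975, df = n - 2)   # two-sided 95% critical value
x0 <- mean(height)               # height at which we want the intervals
yhat <- unname(coef(fit)[1] + coef(fit)[2] * x0)

# Standard errors at x0; the (x0 - mean)^2 term vanishes at the mean height
sxx <- sum((height - mean(height))^2)
se.mean <- s * sqrt(1 / n + (x0 - mean(height))^2 / sxx)      # mean vc
se.new  <- s * sqrt(1 + 1 / n + (x0 - mean(height))^2 / sxx)  # one new vc

ci.mean <- c(fit = yhat, lwr = yhat - tcrit * se.mean, upr = yhat + tcrit * se.mean)
pi.new  <- c(fit = yhat, lwr = yhat - tcrit * se.new,  upr = yhat + tcrit * se.new)

# Compare the hand calculation with predict()
new <- data.frame(height = x0)
rbind(by.hand = ci.mean, predict = predict(fit, new, interval = "confidence")[1, ])
rbind(by.hand = pi.new,  predict = predict(fit, new, interval = "prediction")[1, ])
```

The extra "1 +" under the square root in se.new is the variability of an individual VC about its mean, which is why the second interval is always the wider one.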
Let me try again with data in your units. Note that CI varies with height
and is smallest at the mean height whether you are talking about CI on the
mean VC or CI on the predicted VC. For comparison, the red lines are the
95% CI on the mean VC from the regression fit and the blue lines are the
95% CI on a "predicted" (new) VC. The simulated data is set to have a mean
height that varies around 163 cm.
# Make simulated data with mean height near 163
# vc approximately in liter values with scatter
height=sort(rnorm(50,mean=163,sd=35))
vc=0.03*height+.5*rnorm(50)
#Plot the simulated data
plot(vc~height,ylab='vital capacity (l)',xlab='Height (cm)')
# Set up data frame with values of height you wish a ci on
# column heading must be same as for lm() fit x variable
# in this case, dataframe contains only mean height
mean.height.fit.ci=data.frame(height=mean(height))
#print out the mean height
mean.height.fit.ci
# fit the regression model
vc.lm=lm(vc~height)
# Draw 95% confidence intervals on mean vc at various heights (red)
# (min at mean(height))
matlines(height,predict(vc.lm,interval="confidence"),lty=c(1,2,2),
col=c('black','red','red'))
# Draw 95% prediction intervals on new vc at various heights (blue)
# (min again at mean(height))
matlines(height,predict(vc.lm,interval="prediction"),lty=c(1,3,3),
col=c('black','blue','blue'))
# Determine 95% CI on mean vc at mean height
predict.lm(vc.lm,mean.height.fit.ci,interval="confidence")
# Determine 97.5% CI on mean vc at mean height
predict.lm(vc.lm,mean.height.fit.ci,interval="confidence", level=0.975)
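If you would rather see numbers than read the minimum off the plot, the following sketch tabulates the width of both intervals over a grid of heights. The data, the seed, the grid, and the names (fit, grid, widths) are assumptions chosen for illustration; only predict() and its interval argument come from the code above.

```r
# Hypothetical simulated data, separate from the example above
set.seed(2)
height <- rnorm(50, mean = 163, sd = 10)
vc <- 0.03 * height + 0.5 * rnorm(50)
fit <- lm(vc ~ height)

# Grid of heights that includes the sample mean exactly
grid <- data.frame(height = sort(c(mean(height), seq(140, 190, by = 5))))
ci <- predict(fit, grid, interval = "confidence")   # interval on mean vc
pr <- predict(fit, grid, interval = "prediction")   # interval on a new vc

widths <- data.frame(height   = grid$height,
                     ci.width = ci[, "upr"] - ci[, "lwr"],
                     pi.width = pr[, "upr"] - pr[, "lwr"])
widths

# Height at which each interval is narrowest, next to the sample mean
widths$height[which.min(widths$ci.width)]
widths$height[which.min(widths$pi.width)]
mean(height)
```

Both widths grow as you move away from the mean height in either direction, and the prediction interval is wider than the confidence interval at every height.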
You might wish to read a little more about regression CIs in a good
statistics book.
HTH,
Rob