[R]: Prediction interval for a Gaussian family log-link model

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Oct 22 12:07:21 CEST 2003


Wayne Jones <JonesW at kssg.com> writes:

> Hi there fellow R-users, 
> 
> Can anyone tell me how to build  a prediction interval for a gaussian
> log-link model for the reponse variable??
> I can find the standard error of the predictions but I cant seem to find the
> prediction interval. Is there a way I can calculate the 
> prediction interval from the standard errors??
> 
> Here's the example: 
> 
> logX<-rnorm(100)
> logY<--2-0.5*logX+rnorm(100,0,0.4)
> Y<-exp(logY)
> my.glm.mod<-glm(Y~logX,family=gaussian(link="log"))
> predict(my.glm.mod,type="response",se.fit=TRUE)

Er, that's not the right model for those data! 

Y <- exp(-2-0.5*logX)+rnorm(100,0,0.4)

would be a log-link gaussian model (but a smaller SD would be more
interesting). Assuming that this is what you intended,

logX<-sort(rnorm(100)) 
Y <- exp(-2-0.5*logX)+rnorm(100,0,0.04)
my.glm.mod<-glm(Y~logX,family=gaussian(link="log"), etastart=pmax(Y,0))
pp <- predict(my.glm.mod,type="response",se.fit=TRUE)
bounds <- pp$fit + outer(sqrt(pp$residual.scale^2+pp$se.fit^2),
                         qt(c(.05,.95),100-2))

plot(logX,Y)
matlines(logX,bounds)


BTW, we seem to have a problem with the gaussian(log) starting values
if the observations are negative...

Notice that this *only* works for gaussian responses and relies on the
approximate normality of the predicted values. In general you have to
compute the convolution of the error distribution with the
distribution of the fitted value (possibly approximate). With discrete
response variables, all bets are off: It is not at all clear what a
prediction interval for binary response means.

If you really intended a lognormal distribution for Y, and a loglinear
relation with logX, just fit a linear model for log(Y) and transform
everything back.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907




More information about the R-help mailing list