[R] standard errors for predict.nls?

Rolf Turner r.turner at auckland.ac.nz
Sun Nov 9 21:02:27 CET 2008


On 7/11/2008, at 11:33 PM, Christoph Scherber wrote:

> Dear all,
>
> I would like to get standard errors (or confidence intervals) for  
> *predicted* values from an nls fit.
>
> I have tried to implement code from p.225 in MASS (bootstrapping a  
> nls fit), but this gives only the
> confidence intervals of the parameter estimates, but not an overall  
> confidence interval for the
> predicted value(s):
>
> puro1<-nls(rate~a*conc/(b+conc), data=Puromycin[1:12,], start=list 
> (a=200, b=1))  #set up nls model
>
> # assume only one predicted value is obtained using predict 
> (puro1,list(conc=0.02)):
>
> st=cbind(Puromycin[1:12,],fit=predict(puro1,list(conc=0.02)))
>
> puro.bf=function(rs,i){
>         st$rate=st$fit+rs[i]
>         coef(nls(rate~a*conc/(b+conc), st, start=coef(puro1)))
> }
>
> rs=scale(resid(puro1),scale=F)
> (puro.boot=boot(rs,puro.bf,R=100))
>
> boot.ci(puro.boot,index=1,type=c("norm"))
> boot.ci$t0
> boot.ci$norm
> ###
>
> How do I have to change the code to get the c.i. for the predicted  
> value?



First:  What you want is ***NOT*** a confidence interval, it is a  
prediction
interval.

There are in effect two components --- a ``confidence part'' and a  
``predict a new
observation'' part.

The ``confidence part'' is actually the stumbling block.

You are fitting a model

	Y = f(x,theta) + E

where the function f() is non-linear in theta.  You are probably  
willing to assume that
the E_i are i.i.d. Gaussian, mean 0, standard deviation sigma.

The nls fit gives you an estmate of theta and of the standard error  
in this estimate.
It also gives you an estimate of sigma.

A 95% confidence level *prediction interval* for a new value of Y at  
a given value of x
is given (if the sample size is fairly large) by

	f(x,theta-hat) +/- 1.96 * sqrt( (SE-fit)^2 + s^2 )

where s^2 is your (readily available) estimate of sigma^2 and SE-fit  
is the estimate
of the standard error of f(x,theta-hat).  The problem is that SE-fit  
is *not* readily
available (even though the standard errors of theta-hat *are*) since f 
() is non-linear
in theta.

As was previously discussed in this thread, the ``usual'' method of  
getting at SE-fit
is the so-called ``delta method'' and as Professor Ripley has pointed  
out this is in
general insufficiently accurate to be really useful.

Bootstrapping has been suggested as an alternative approach to  
getting at SE-fit.

To use boot-strapping for this you must calculate and retain f 
(x,theta-hat*) for each bootstrap
sample and then calculate the standard error of the values of f 
(x,theta-hat*).
(This is probably somewhat naive --- there may be subtleties here  
that I've overlooked
and more sophisticated steps that one can take to address these  
subtleties.  I am
definitely not an expert on bootstrapping, so look elsewhere for  
better advice here.)

If your sample size is ``reasonably large'' then it is ``reasonably  
safe'' to assume
that SE-fit is small (negligible) in comparison with s and can be  
ignored, i.e. set
to 0.  This is the approach that is (almost?) universally adopted in  
constructing prediction
intervals in (ARMA) time series analysis.

	cheers,

		Rolf Turner

######################################################################
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}



More information about the R-help mailing list