[R] Help with predict.lm

Wed Apr 20 12:16:12 CEST 2005

Ted
Thank you for giving me so much of your time to provide an explanation of
the likelihood ratio approach to the calibration problem.  It has certainly
provided me with a new insight into this problem. I will check out the
references you mentioned to get a better understanding of the statistics.
Out of interest I have also tried the function provided by David Lucy back
in March 2002 with a correction to the code and addition of the t-value.
The results are very close to those obtained by the likelihood ratio method,
although the output needs tidying up.

####################
# R function to do classical calibration
# val is a data.frame of the new indep variables to predict dep

calib <- function(indep, dep, val, prob=0.95)
{

 # generate the model y on x and get out predicted values for x
        reg <- lm(dep~indep)
        xpre <- (val - coef(reg)[1])/coef(reg)[2]

        # generate a confidence
        yyat <- ((dep - predict(reg))^2)
        sigyyat <- ((sum(yyat)/(length(dep)-2))^0.5)
        term1 <- sigyyat/coef(reg)[2]
        sigxxbar <- sum((indep - mean(indep))^2)
        denom <- sigxxbar * ((coef(reg)[2])^2)
 t<-qt(p=(prob+(1-prob)/2), df=(length(dep)-2))
        conf <- t*abs((((1+1/length(dep))+(((val -
  mean(dep))^2)/denom))^0.5)*term1)

results<-list(Predicted=xpre, Conf.interval=c(xpre-conf, xpre+conf))
return(results)
}

conc<-seq(100, 280, 20) #  mg/l
abs<-c(1.064, 1.177, 1.303, 1.414, 1.534, 1.642, 1.744, 1.852, 1.936, 2.046)
# absorbance units

new.abs<-data.frame(abs=c(1.251, 1.324, 1.452))

calib(conc, abs, new.abs, prob=0.95)

$Predicted
       abs
1 131.2771
2 144.6649
3 168.1394

$Conf.interval
$Conf.interval$abs
[1] 124.4244 137.9334 161.5477

$Conf.interval$abs
[1] 138.1298 151.3964 174.7310

Thanks
Mike

----- Original Message -----
From: "Ted Harding" <Ted.Harding at nessie.mcc.ac.uk>
To: "Mike White" <mikewhite.diu at tiscali.co.uk>
Cc: <R-help at stat.math.ethz.ch>
Sent: Wednesday, April 20, 2005 8:58 AM
Subject: RE: [R] Help with predict.lm

> Sorry, I was doing this too late last night!
> All stands as before except for the calculation at the end
> which is now corrected as follows:
>
> On 19-Apr-05 Ted Harding wrote:
> [code repeated for reference]
> > The following function implements the above expressions.
> > It is a very crude approach to solving the problem, and
> > I'm sure that a more thoughtful approach would lead more
> > directly to the answer, but at least it gets you there
> > eventually.
> >
> > ===========================================
> >
> > R.calib <- function(x,y,X,Y){
> >   n<-length(x) ; mx<-mean(x) ; my<-mean(y) ;
> >   x<-(x-mx) ; y<-(y-my) ; X<-(X-mx) ; Y<-(Y-my)
> >
> >   ah<-0 ; bh<-(sum(x*y))/(sum(x^2)) ; Xh <- Y/bh
> >           sh2 <- (sum((y-ah-bh*x)^2))/(n+1)
> >
> >   D<-(n+1)*sum(x^2) + n*X^2
> >   at<-(Y*sum(x^2) - X*sum(x*y))/D; bt<-((n+1)*sum(x*y) + n*X*Y)/D
> >   st2<-(sum((y - at - bt*x)^2) + (Y - at - bt*X)^2)/(n+1)
> >
> >   R<-(sh2/st2)
> >
> >   F<-(n-2)*(1-R)/R
> >
> >   x<-(x+mx) ; y<-(y+my) ;
> >   X<-(X+mx) ; Y<-(Y+my) ; Xh<-(Xh+mx) ;
> >   PF<-(pf(F,1,(n-2)))
> >   list(x=x,y=y,X=X,Y=Y,R=R,F=F,PF=PF,
> >        ahat=ah,bhat=bh,sh2=sh2,
> >        atil=at,btil=bt,st2=st2,
> >        Xhat=Xh)
> > }
> >
> > ============================================
> >
> > Now lets take your original data and the first Y-value
> > in your list (namely Y = 1.251), and suppose you want
> > a 95% confidence interval for X. The X-value corresponding
> > to Y which you would get by regressing x (conc) on y (abs)
> > is X = 131.3813 so use this as a "starting value".
> >
> > So now run this function with x<-conc, y<-abs, and these values
> > of X and Y:
> >
> >   R.calib(x,y,131.3813,1.251)
> >
> > You get a long list of stuff, amongst which
> >
> >   $PF
> >   [1] 0.02711878
> >
> > and
> >
> >   $Xhat
> >   [1] 131.2771
> >
> > So now you know that Xhat (the MLE of X for that Y) = 131.2771
> > and the F-ratio probability is 0.027...
> >
> *****> You want to push $PF upwards till it reaches 0.05, so work
> *****> *outwards* in the X-value:
> WRONG!! Till it reaches ***0.95***
>
>   R.calib(x,y,125.0000,1.251)$PF
>   [1] 0.9301972
>
>   ...
>
>   R.calib(x,y,124.3510,1.251)$PF
>   [1] 0.949989
>
>
> > and you're there in that direction. Now go back to the MLE
> > and work out in the other direction:
> >
> >   R.calib(x,y,131.2771,1.251)$PF
> >   [1] 1.987305e-06
>
>   ...
>
>   R.calib(x,y,138.0647,1.251)$PF
>   [1] 0.95
>
> > and again you're there.
> >
> > So now you have the MLE Xhat = 131.2771, and the two
> ****> limits of a 95% confidence interval (131.0847, 131.4693)
> WRONG!!!
> limits of a confidence interval (124.3510, 138.0647)
>
> > for X, corresponding to the given value 1.251 of Y.
>
> As it happens, these seem to correspond very closely to
> what you would get by reading "predict" in reverse:
>
> > plot(x,y)
> > plm<-lm(y~x)
> > min(x)
>   [1] 100
> > max(x)
>   [1] 280
>
> > points((131.2771),(1.251),col="red",pch="+") #The MLE of X
> > lines(c(124.3506,138.0647),c(1.251,1.251),col="red") # The above CI
> > newx<-data.frame(x=(100:280))
> > predlm<-predict(plm,newdata=newx,interval="prediction")
> > lines(newx$x,predlm[,"fit"],col="green")
> > lines(newx$x,predlm[,"lwr"],col="blue")
> > lines(newx$x,predlm[,"upr"],col="blue")
>
> which is what I thought would happen in the first place, given
> the quality of your data.
>
> Sorry for any inconvenience due to the above error.
> Ted.
>
>
>
>
>
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 20-Apr-05                                       Time: 08:58:37
> ------------------------------ XFMail ------------------------------
>