[R] weighting in nls
Berton Gunter
gunter.berton at gene.com
Thu Jan 27 18:52:30 CET 2005
Robert:
So far as I can see:
1. Your specification is correct, (note,however, that weights of the form
sqrt(n) are all that you need, as a constant multiplier makes no
difference). However, as you have not specified a gradient attribute,
numerical derivatives are being used by nls() while PRISM may be using
analytical derivatives if your model is one of those on its standard list,
which I suspect is the case. As a result, the particular data and weights
that you have may be causing the nls algorithm to terminate too early, as
numerical derivatives (differences) often require more iterations. So I
suggest you try increasing the maxiter parameter of the control list, say by
control = nls.control(maxiter=100).
2. Also note that for a fixed b and c, the model is linear in a^3. So this
would allow use of the "plinear" algorithm, which might considerably improve
convergence behavior (as there are then only 2, not 3, nonlinear
parameters). V&R's MASS section on nls() again contains a good explanation
of how to do this.
That the algorithm worked better with unweighted than weighted data means
little, as you are at the mercy of the particulars of the interaction
between optimization algorithm and likelihood surface ... that is, the
particular data including weighting. Also, you are correct afaik: the
standard errors and confidence intervals from nls are incorrect for weighted
fits, though of course they usually aren't that meaningful for unweighted
fits either (I don't think they are even necessarily asymptotically correct,
as they are merely based on the local likehood surface at the converged
value, which does not in any sense take into account the uncertainty of the
fitting algorithm -- however, others may (eagerly) point out the errors of
my ways, here).
As Always, standard discalimers apply, especially the one about my not being
an expert on this... so caveat emptor!
Cheers,
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning
process." - George E. P. Box
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Robert
> Brown FM CEFAS
> Sent: Thursday, January 27, 2005 7:35 AM
> To: Liaw, Andy
> Cc: r-help at stat.math.ethz.ch
> Subject: RE: [R] weighting in nls
>
>
> Hi there,
>
> this is the output from R
>
> >
> solb2wvb<-nls(~sqrt(novervar)*(weight-(a*(1-exp(-b*(age-c))))^
> 3),data=solb2.na.rm,start=list(a=0.85,b=0.45,c=0.48))
> > summary(solb2wvb)
>
> Formula: ~ sqrt(novervar) * (weight - (a * (1 - exp( - b *
> (age - c))))^3)
>
> Parameters:
> Value Std. Error t value
> a 1.087370 0.01193090 91.1392
> b 0.151838 0.00714963 21.2372
> c -1.809770 0.13186000 -13.7250
>
> Residual standard error: 4.41368 on 109 degrees of freedom
>
> The output from Prism is:
>
> von Bertalanffy
> Best-fit values
> A 0.8957
> B 0.2381
> C -1.358
> Std. Error
> A 0.002280
> B 0.002568
> C 0.02919
> 95% Confidence Intervals
> A 0.8912 to 0.9001
> B 0.2331 to 0.2431
> C -1.415 to -1.300
>
> The latter has much better visual fit and reasonable
> residuals. Furthermore theory and practice both lead to the
> expectation that this model should fit the data.
>
> Incidentally, I was under the impression that with a weighted
> nls in R the SE values were not accurate.
>
> Finally I've attached the dataset
>
>
>
>
> -----Original Message-----
> From: Liaw, Andy [mailto:andy_liaw at merck.com]
> Sent: 27 January 2005 15:25
> To: Robert Brown FM CEFAS; r-help at stat.math.ethz.ch
> Subject: RE: [R] weighting in nls
>
>
> Can you show us the difference; i.e., what are the parameter
> estimates and
> associated SEs from the two programs? Even better, can you supply an
> example data set?
>
> [With is `trick' for weighted nls, you need to be careful
> with the output of
> predict().]
>
> Andy
>
> > From: Robert Brown FM CEFAS
> >
> > I'm fitting nonlinear functions to some growth data but I'm
> > getting radically different results in R to another program
> > (Prism). Furthermore the values from the other program give a
> > better fit and seem more realistic. I think there is a
> > problem with the results from the r nls function. The
> > differences only occur with weighted data so I think I'm
> > making a mistake in the weighting. I'm following the
> > procedure outlined on p 244 of MASS (or at least I'm trying to).
> >
> > Thus, I'm using mean data with heteroscedasticity so I'm
> > weighting by n/ variance, where the variance is well known
> > from a large data set. This weighting factor is available as
> > the variable 'novervar'.
> >
> > The function is a von Bertalanffy curve of the form
> > weight~(a*(1-exp(-b*(age-c))))^3. Thus I'm entering the
> > command in the form:
> >
> > solb1wvb<-nls(~sqrt(novervar)*(weight-(a*(1-exp(-b*(age-c))))^
> > 3),data=solb1.na.rm,start=list(a=0.85,b=0.45,c=0.48))
> >
> > Can anyone suggest what I'm doing wrong? I seem to be
> > folowing the instructions in MASS. I tried following the
> > similar instructions on page 450 of the white book but these
> > were a bit cryptic.
> >
> > I'm using R 2.0.0 on a Windows 2000 machine
> >
> > Regards,
> >
> > Robert Brown
> >
> >
> > **************************************************************
> > *********************
> > This email and any attachments are intended for the named
> > re...{{dropped}}
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
> >
>
>
> --------------------------------------------------------------
> ----------------
> Notice: This e-mail message, together with any attachments,
> contains information of Merck & Co., Inc. (One Merck Drive,
> Whitehouse Station, New Jersey, USA 08889), and/or its
> affiliates (which may be known outside the United States as
> Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as
> Banyu) that may be confidential, proprietary copyrighted
> and/or legally privileged. It is intended solely for the use
> of the individual or entity named on this message. If you
> are not the intended recipient, and have received this
> message in error, please notify us immediately by reply
> e-mail and then delete it from your system.
> --------------------------------------------------------------
> ----------------
>
>
> **************************************************************
> *********************
> This email and any attachments are intended for the named
> recipient only. Its unauthorised use, distribution,
> disclosure, storage or copying is not permitted. If you have
> received it in error, please destroy all copies and notify
> the sender. In messages of a non-business nature, the views
> and opinions expressed are the author's own and do not
> necessarily reflect those of the organisation from which it
> is sent. All emails may be subject to monitoring.
> **************************************************************
> *********************
>
>
More information about the R-help
mailing list