[Rd] Using the nls package

Prof Brian D Ripley ripley@stats.ox.ac.uk
Mon, 31 Jul 2000 07:20:32 +0100 (BST)


On Mon, 31 Jul 2000, Telford Tendys wrote:

> On Sat, Jul 29, 2000 at 06:59:53PM -0500, Douglas Bates wrote:
> > The convergence criterion for nls is the relative offset orthogonality
> > convergence criterion as described in 
> > @Article{bate:watt:1981:tech,
> >   journal =      "Technometrics",
> >   volume =       23,
> >   pages =        "179--183",
> >   keywords =     "Regression",
> >   author =       "Douglas M. Bates and Donald G. Watts",
> >   title =        "A Relative Offset Orthogonality Convergence Criterion
> >                  for Nonlinear Least Squares",
> >   year =         1981
> > }
> 
> OK, I'll try to look that one up. You probably have a very good
> convergence criterion for realistic sample data (which will always
> have some error).
> 
> > The problem with putting in a check for an exceptionally small sum of
> > squares is deciding when it is exceptionally small.  There is no
> > absolute scale on the residual sum of squares.
> > 
> > We could perhaps detect an exact zero but even that would not catch
> > all the cases where "exact" values were simulated because generally
> > they will give a small but non-zero sum of squares.
> 
> Hmmm, I see your point here... the size of the residual sum of squares can
> change drastically when you add or delete measurement points, try different
> models, etc., so its value only has meaning relative to other residual
> sums of squares for the same model and the same data points.
> 
> How about testing for progress? If the nls algorithm is giving the same
> numbers on two consecutive cycles then we can be sure it is not going to
> get any better with more iterations so:

How can you be sure?  What are `the same numbers'?  The parameters
unchanged to machine precision, yes, but that's not what you print.


> [1] keep a copy of the parameters from last cycle
> [2] keep a copy of the sum of square residuals from last cycle
> [3] if the sum of squared residuals has changed at all then this cycle
>     is making progress so keep cycling
> [4] if the sum of squared residuals has not changed, check if the
>     parameters have changed; if they have not changed then give up
>     because the next cycle will be the same as this one.
> 
> Maybe [1] and [4] are not necessary and checking for change in the
> sum of square residuals is enough but I am vaguely aware that it might
> be possible to have the parameters change without the sum of square
> residuals changing (I would have to study the algorithm very closely
> to figure out whether that was possible).
> 
> Anyhow, would copying the parameters each cycle be a big overhead?
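The proposed rule can be sketched outside nls; the optimiser below is plain gradient descent on a toy one-parameter problem, purely a stand-in for the iteration inside nls (the data, step size, and names are my own invention):

```r
# Sketch of the proposed stopping rule [1]-[4]; the optimiser is plain
# gradient descent on a toy one-parameter least-squares problem, a
# stand-in for the iteration inside nls.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)                # roughly y = 2 * x
rss  <- function(b) sum((y - b * x)^2)         # residual sum of squares
grad <- function(b) -2 * sum(x * (y - b * x))  # d(rss)/db
b <- 0                                         # starting value
repeat {
  b_old   <- b                                 # [1] keep last parameters
  rss_old <- rss(b)                            # [2] keep last RSS
  b <- b - 0.001 * grad(b)                     # one iteration
  if (rss(b) != rss_old) next                  # [3] RSS moved: keep cycling
  if (b == b_old) break                        # [4] nothing moved: give up
}
b                                              # close to sum(x * y) / sum(x^2)
```

Note the copying overhead per cycle is just one parameter vector and one scalar, which is negligible next to the model evaluation itself.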
> 
> > Since such problems will not occur in practice but only in artificial
> > examples, we have not bothered detecting them.
> 
> Yes, this is understandable but if most people are like me (questionable
> assumption I admit) they try a few artificial examples first to get
> the general idea of how these things work. I guess there is a certain
> illogical assumption that if I try what seems to me to be an obviously
> easy problem and it cannot handle it then something must be wrong and
> trying harder problems is not going to help.

Perhaps you need to read up on the algorithmic theory so you are not so
easily confused.  NLS problems with an exact-fit solution are a different
sub-class with different effective algorithms.
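For the error-bearing case the relative-offset criterion targets, nls behaves well; a minimal sketch (the model, data, and starting values here are invented for illustration):

```r
# Minimal sketch: nls on data with measurement error, the situation the
# relative-offset criterion is designed for (model and values are my own).
set.seed(42)
x <- seq(0, 2, length.out = 50)
y <- 2.5 * exp(1.3 * x) + rnorm(50, sd = 0.1)  # noisy, so no exact fit exists
fit <- nls(y ~ a * exp(b * x), start = list(a = 1, b = 1))
coef(fit)                                      # close to a = 2.5, b = 1.3
```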

> > > Is there any chance of getting this to work? Should I split up the
> > > complex numbers into components and write my equations as equivalent
> > > expressions in reals (ugly but I could do it if I have to)?
> > 
> > That's the best short-term solution.
> 
> OK, I'll go with that.
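One way to carry out that workaround: rewrite a complex model y = A * exp(1i * w * t), with A = A_re + 1i * A_im, as its real and imaginary parts and fit both jointly with nls. All names, data, and starting values below are invented for illustration:

```r
# Sketch of the real-valued workaround for a complex model
# y = A * exp(1i * w * t): stack real and imaginary parts of the data
# and fit both with shared real parameters A_re, A_im, w.
set.seed(1)
t  <- seq(0, 1, length.out = 40)
y  <- (2 + 1i) * exp(3i * t)                   # simulated complex response
yy <- c(Re(y), Im(y)) + rnorm(80, sd = 0.05)   # stacked, with a little noise
tt <- c(t, t)
re <- rep(c(1, 0), each = 40)                  # 1 marks the real-part rows
# (A_re + i*A_im)(cos wt + i*sin wt) has real part A_re*cos - A_im*sin
# and imaginary part A_re*sin + A_im*cos:
fit <- nls(yy ~ re * (A_re * cos(w * tt) - A_im * sin(w * tt)) +
             (1 - re) * (A_re * sin(w * tt) + A_im * cos(w * tt)),
           start = list(A_re = 1.5, A_im = 0.5, w = 2.5))
coef(fit)                                      # near A_re = 2, A_im = 1, w = 3
```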
> 
> > In general, how do you want to
> > define the residual sum of squares for a complex-valued response?
> > Should it be the sum of the modulus of the residuals?
> 
> My thinking was that the `obvious' choice is the sum of the
> SQUARES of the modulus of the residuals. Using the modulus itself
> would be similar to using the absolute value of the residuals in the
> non-complex case (not what we want I believe).

Sum of squares is common, and is used for complex linear regression in those
cases where it makes sense (e.g. in time-series regression, regressing one
FFT-ed series on others).
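That criterion is the sum of squared moduli of the complex residuals; a one-liner in R (residual values made up):

```r
# Sum of squared moduli of complex residuals (values are made up).
r <- c(1 + 2i, -3 + 0.5i, 0.5 - 1i)
sum(Mod(r)^2)                  # 5 + 9.25 + 1.25 = 15.5
# the same number via the Hermitian dot product r . conj(r):
Re(sum(r * Conj(r)))           # also 15.5
```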

> Another way of looking at it would be to think of extending the
> residual from a scalar to a vector -- the natural extension of the
> `square' operation is to square each element in the vector.
> The current R system already uses this as the natural extension
> to multiply and everyone seems happy with that:
> 
> > square <- function( x ) { x * x }
> > square( -3 )
> [1] 9
> > square( -3:4 )
> [1]  9  4  1  0  1  4  9 16
> 
> So once you have squared all the elements it makes sense to sum them
> up, just like you do with the residuals in any case. Then you get
> support for vector square residuals as being the same as the dot product
> of the vector with itself (i.e. the only way to multiply a vector
> with itself to get a scalar).
> 
> Keeping the logic going, a complex number is sort of like a special
> case of a two element vector (yeah I know they aren't exactly the
> same) so squaring the real and squaring the imaginary parts and summing
> them up would seem consistent. But then Pythagoras points out that
> this is the same as the square of the modulus in any case, and so
> the circle is complete.
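The chain of identities in that argument checks out numerically: elementwise squares summed equal the dot product, and for a complex number the squared real and imaginary parts sum to the squared modulus (the example values are mine):

```r
# Elementwise squares summed equal the dot product of a vector with
# itself; for a complex number, Re^2 + Im^2 equals Mod^2 (Pythagoras).
square <- function(x) x * x                    # the function from the message
v <- c(-3, 1, 4)
sum(square(v))                                 # 26
drop(v %*% v)                                  # also 26: the dot product
z <- 3 + 4i
Re(z)^2 + Im(z)^2                              # 25
Mod(z)^2                                       # also 25
```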
> 
> > We could put complex-valued nonlinear least squares on the wishlist
> > but I would have to warn you that there are many other items in front
> > of it.
> 
> wish wish wish...
> 
> Now I think of it, there are two steps in this wish.
> The first is to allow complex numbers in the model
> calculation at all, and thus allow them as residuals.
> 
> The second is to think about whether the parameters should be
> allowed to be complex. I would argue that this second one is not
> a good idea because it is rather easy to convert two real valued 
> model parameters into a single complex number -- this ensures that
> each model parameter is exactly one degree of freedom.
> 
> On the other hand, it is kind of painful to expand out complex
> calculations into their real equivalents, and frustrating too when
> there is a perfectly good complex calculator so close, yet so far away.
> 
> In either case, the documentation should mention that complex
> numbers are not supported.

Why should documentation mention everything the function does not do? There
are *lots* of places in R where numeric quantities must be real.  Linear
regression (lm) for a start.  What nls says is

 formula: a nonlinear model formula including variables and parameters

which is none too explicit, but the reference (Bates & Watts) is,
so I assume you did not read it.

A pretty reasonable assumption is that statistics does not work
in complex numbers unless explicitly stated.  That's why in R

> sqrt(-1)
[1] NaN
Warning message: 
NaNs produced in: sqrt(-1) 
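whereas an explicitly complex input opts in to complex arithmetic:

```r
# R stays in the reals unless the input is explicitly complex.
suppressWarnings(sqrt(-1))     # NaN (with a warning when not suppressed)
sqrt(-1 + 0i)                  # 0+1i: complex input gives a complex result
```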


-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._