[R] Collinearity in nls problem

Fri Mar 3 02:58:07 CET 2006

	  Trying different parameterizations is often wise with nonlinear 
regression.  However, I know of no general rule for finding a good one 
other than to try several and fit a paraboloid to the sums of squares 
surface in a region around the least squares solution:  The best 
parameterization will be fairly close to parabolic.  To do this, I've 
used "expand.grid" to get the points, then chopped off all points with 
sums of squares exceeding the minimum plus some number that should 
represent, say, a joint 99% confidence region.  I also supplement this 
with contour or perspective plots:  Parameterizations whose paraboloid 
fits have the better R^2's usually also have a more elliptical 
appearance in contour plots.  I've done this successfully to find a 
parameterization that will both speed up estimation AND provide 
reasonable accuracy with approximate Wald confidence intervals.
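
	  The grid-and-paraboloid check described above might be sketched 
as follows.  The model y = a*exp(-b*t), the simulated data, the grid 
width of 4 standard errors and the F-based cutoff are all assumptions 
made for illustration; the cutoff is just one common choice for "the 
minimum plus some number".

```r
## Simulated data; the model and parameter values are illustrative only.
set.seed(1)
t <- seq(0, 10, length.out = 25)
y <- 5 * exp(-0.4 * t) + rnorm(length(t), sd = 0.2)

fit <- nls(y ~ a * exp(-b * t), start = list(a = 4, b = 0.3))
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
n <- length(y)
p <- length(est)
S.min <- deviance(fit)   # residual sum of squares at the minimum

## Sums of squares on a grid centred on the estimate (assumed +/- 4 SE)
grid <- expand.grid(
  a = seq(est[["a"]] - 4 * se[["a"]], est[["a"]] + 4 * se[["a"]],
          length.out = 61),
  b = seq(est[["b"]] - 4 * se[["b"]], est[["b"]] + 4 * se[["b"]],
          length.out = 61))
grid$S <- mapply(function(a, b) sum((y - a * exp(-b * t))^2),
                 grid$a, grid$b)

## Chop off points outside an approximate joint 99% region
cutoff <- S.min * (1 + p / (n - p) * qf(0.99, p, n - p))
inside <- grid[grid$S <= cutoff, ]

## Fit a paraboloid (full quadratic in a and b) and record its R^2
quad <- lm(S ~ a + b + I(a^2) + I(b^2) + I(a * b), data = inside)
r2 <- summary(quad)$r.squared
r2
```

Repeating the same steps with the model rewritten in other parameters 
(for example log(a) in place of a) gives one R^2 per parameterization 
to compare.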

	  Even without that, however, we can still get good, joint confidence 
regions in the form of contour plots of the sums of squares surface: 
The validity of these confidence regions is only affected by the 
intrinsic curvature of the problem, and is not affected by the 
parameterization.  Of course, if we select a strange parameterization, 
our confidence regions will not look very elliptical (and our univariate 
confidence intervals may be far from symmetric).
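
	  A joint confidence region of this kind might be computed as in 
the sketch below.  The model, the simulated data and the grid limits are 
assumptions for illustration.  The contour level uses the usual F-based 
form S(theta) <= S.min * (1 + p/(n-p) * F(p, n-p)), here at 95%.

```r
## Simulated data; the model and parameter values are illustrative only.
set.seed(2)
t <- seq(0, 10, length.out = 25)
y <- 5 * exp(-0.4 * t) + rnorm(length(t), sd = 0.2)

fit <- nls(y ~ a * exp(-b * t), start = list(a = 4, b = 0.3))
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
n <- length(y)
p <- length(est)

## Grid around the estimate (assumed +/- 5 SE in each direction)
a.seq <- seq(est[["a"]] - 5 * se[["a"]], est[["a"]] + 5 * se[["a"]],
             length.out = 81)
b.seq <- seq(est[["b"]] - 5 * se[["b"]], est[["b"]] + 5 * se[["b"]],
             length.out = 81)

## Sums of squares surface: rows index a, columns index b
sse <- function(a, b) sum((y - a * exp(-b * t))^2)
S <- outer(a.seq, b.seq, Vectorize(sse))

## Contour of the surface at the joint 95% level
level <- deviance(fit) * (1 + p / (n - p) * qf(0.95, p, n - p))
region <- contourLines(a.seq, b.seq, S, levels = level)
```

Calling contour(a.seq, b.seq, S, levels = level) with the same objects 
draws the region.  Because the level depends only on the sums of 
squares, rewriting the model in other parameters maps the same set of 
fitted curves onto a differently shaped contour.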

	  My favorite reference for this kind of thing is Bates and Watts 
(1988) Nonlinear Regression Analysis and Its Applications (Wiley).

	  hope this helps.
	  spencer graves

Simon Frost wrote:

> Dear R-Help list,
> 
> I have a nonlinear least squares problem, which involves a changepoint;
> at the beginning, the outcome y is constant, and after a delay, t0, y
> follows a biexponential decay. I log-transform the data, to stabilize
> the error variance. At time t < t0, my model is
> 
> log(y_i)=log(exp(a0)+exp(b0))
> 
> at time t >= t0, the model is
> 
> log(y_i)=log(exp(a0-a1*(t_i - t0))+exp(b0-b1*(t_i - t0)))
> 
> I thought that I would have identifiability issues, but this model seems
> to work fine except that the parameter t0 (the delay) is highly
> correlated with the initial decay slope a1 (which makes sense, as the
> longer the delay, the more rapid the drop has to be, conditional on the
> data).
> 
> To get over this problem, I could reparameterize the problem, but it
> isn't clear to me how to do this for the above model. I also thought
> about using a penalized least squares approach, to shrink t0 and a1
> towards 0. I haven't seen much on penalized least squares in a nonlinear
> least squares setting; is this a good way to go? Can I justifiably
> penalize only t0 and a1, or should I also penalize the other parameters?
> 
> Thanks for any help!
> Simon
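
The two-piece model in the quoted message can be written as a single R 
function by replacing (t - t0) with pmax(t - t0, 0), which is zero 
before the changepoint.  The sketch below assumes the second exponent 
is b0 - b1*(t - t0); the function name logmod, the time grid and the 
parameter values are made up for illustration.  It computes the 
approximate parameter correlations implied by the model's Jacobian at 
those values, without fitting any data.

```r
## Mean of log(y); before t0 it reduces to log(exp(a0) + exp(b0))
logmod <- function(t, t0, a0, a1, b0, b1) {
  d <- pmax(t - t0, 0)
  log(exp(a0 - a1 * d) + exp(b0 - b1 * d))
}

## Hypothetical design points and parameter values
t <- seq(0, 20, by = 0.5)
theta <- c(t0 = 2.2, a0 = 10, a1 = 1, b0 = 6, b1 = 0.1)

f <- function(th) {
  logmod(t, th[["t0"]], th[["a0"]], th[["a1"]], th[["b0"]], th[["b1"]])
}

## Forward-difference Jacobian: one column per parameter
J <- sapply(names(theta), function(nm) {
  h <- 1e-6
  up <- theta
  up[nm] <- up[nm] + h
  (f(up) - f(theta)) / h
})

## Approximate correlation matrix of the estimates, from (J'J)^{-1}
corr <- cov2cor(solve(crossprod(J)))
dimnames(corr) <- list(names(theta), names(theta))
corr["t0", "a1"]
```

The same function can be passed to nls in a formula such as 
logy ~ logmod(t, t0, a0, a1, b0, b1), where logy is a hypothetical name 
for the log-transformed response.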