# [R] Finding starting values for the parameters using nls() or nls2()

dave fournier davef at otter-rsch.com
Mon Oct 24 19:49:30 CEST 2016

```I've spent quite a bit of time trying to convince people on various
lists that the solution to these kinds of
problems lies in the stable parameterization of the model.  I write the
solutions in AD Model Builder because it
is easy.  But R people are generally stuck in R (or mired) so the
message somehow always gets lost.

I thought I would bite the bullet and figure out how to do this in R.
It turns out that one can fit this model
with only 6 function evaluations using nls2.   I apologize in advance
for my execrable R code. But it does do
the job.

The data were fit using 3 calls to nls2. For the first call only the
parameter th was estimated; this converged in 4 function evaluations. For
the second call all three of the parameters were estimated (though two of
them, c and d, are different from b0 and b1). This converged to the
solution in 4 function evaluations.

The final call to nls2 uses the converged values to calculate b0,b1 and
theta and starts from there.
As you can see the model is already converged. This final call to nls2
is used to get the standard error
estimates for the parameters.

>  nls.m1
Nonlinear regression model
model: Retention ~ expFct1(Area, y1, yn, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")
th
0.9862
residual sum-of-squares: 730

Number of iterations to convergence: 2
Achieved convergence tolerance: 1.616e-06

>  nls.m2
Nonlinear regression model
model: Retention ~ expFct2(Area, y1, yn, c, d, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")
c      d     th
0.9716 1.1010 0.9118
residual sum-of-squares: 686.8

Number of iterations to convergence: 4
Achieved convergence tolerance: 2.398e-06

>  nls.m
Nonlinear regression model
model: Retention ~ expFct(Area, b0, b1, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")
b0         b1         th
5.2104452 -0.0004672  0.9118040
residual sum-of-squares: 686.8

Number of iterations to convergence: 0
Achieved convergence tolerance: 1.384e-06

Parameters:
Estimate Std. Error t value Pr(>|t|)
b0  5.2104452  0.4999594  10.422 2.29e-07 ***
b1 -0.0004672  0.0013527  -0.345   0.7358
th  0.9118040  0.3583575   2.544   0.0257 *

So what is the stable parameterization for this model? To simplify, let
x be the independent variable and y the dependent variable, and write t
instead of th. So the model is

y = exp(b0*exp(b1*x^t))    (1)

Now it would be nice to reorder the x's so that they are monotone
increasing or decreasing, but in this case the
first x is almost the largest and the last x is almost the smallest
(slight reach) so they will do. Call them x1 and xn.
The new parameters of the model are the predicted y's for x1 and xn.
Call them y1 and yn.  Note that y1 and yn
are parameters, not observations.


y1 = exp(b0*exp(b1*x1^t))    (2)
yn = exp(b0*exp(b1*xn^t))

One can solve for b1 and b0 in terms of y1 and yn:

b1 = log(log(y1)/log(yn))/(x1^t - xn^t)    (3)
b0 = log(y1)*exp(-b1*x1^t)
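
# A quick round-trip check of (2) and (3), using the fitted values reported
# above purely as illustrative inputs: generate y1 and yn from (2), then
# recover b0 and b1 from (3).
b0.in <- 5.2104452; b1.in <- -0.0004672; th.in <- 0.9118040
x1 <- 521.5; xn <- 2629.2
y1.in <- exp(b0.in*exp(b1.in*x1^th.in))
yn.in <- exp(b0.in*exp(b1.in*xn^th.in))
b1.out <- log(log(y1.in)/log(yn.in))/(x1^th.in - xn^th.in)
b0.out <- log(y1.in)*exp(-b1.out*x1^th.in)
stopifnot(isTRUE(all.equal(c(b0.out, b1.out), c(b0.in, b1.in))))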

To use this we fit the model in two stages. In the first stage we use the
obvious estimates y[1] for y1 and y[n] for yn, fix these values, and
estimate t using the obvious starting value of 1 for t.

In the second stage we set

y1=c*y[1]

yn=d*y[n]

and estimate c, d, and t, using the obvious starting values of 1 for c
and d. That's it; this converges, as stated, in 6 function evaluations.
This technique works for many curves, such as the 3-, 4-, and 5-parameter
logistic, the double exponential, and the von Bertalanffy (where Schnute
and I first discovered it in the early 1980s, before anyone was born).

One caveat is that usually not all values of y1 and yn are permissible.
For example, in this case if b0 > 0 then y1 and yn are both > 1. To make
this routine bulletproof one needs to impose these kinds of constraints,
but that was not needed here.
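
# A sketch (my assumption, not something from the original fit) of how such
# a constraint could be imposed: nls()'s "port" algorithm accepts box
# bounds, and nls2() passes extra arguments through to nls(). With
# y1 = c*y[1] and yn = d*y[n], requiring y1 > 1 and yn > 1 means
# c > 1/y[1] and d > 1/y[n]:
#
#   nls2(Retention ~ expFct2(Area, y1, yn, c, d, th), data = cl,
#        start = list(c = 1, d = 1, th = 1),
#        algorithm = "port",
#        lower = c(1/y1, 1/yn, 0))
#
# (lower is taken in the order of the start list: c, d, th.)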

Here is my  R code for this model.

library("nls2")
cl = data.frame(
  Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05,
           1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),
  Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
                9.72, 97.92, 71.44, 44.52, 24.44, 15.26))

y1 <- cl$Retention[1]
yn <- cl$Retention[length(cl$Retention)]

# Stage 1: y1 and yn are fixed at the observed endpoint values; only th is free.
expFct1 <- function(x, y1, yn, th)
{
  x1 = x[1]
  xn = x[length(x)]
  b1 = log(log(y1)/log(yn))/(x1^th - xn^th)   # equation (3)
  b0 = log(y1)*exp(-b1*x1^th)
  exp(b0*exp(b1*x^th))
}

# Stage 2: the endpoint values are scaled by the free parameters c and d.
expFct2 <- function(x, y_1, y_n, c, d, th)
{
  x1 = x[1]
  xn = x[length(x)]
  y1 = c*y_1
  yn = d*y_n
  b1 = log(log(y1)/log(yn))/(x1^th - xn^th)
  b0 = log(y1)*exp(-b1*x1^th)
  exp(b0*exp(b1*x^th))
}

# Final stage: the original parameterization (1), fitted only to obtain
# standard errors.
expFct <- function(x, b0, b1, th)
{
  exp(b0*exp(b1*x^th))
}
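
# Sanity check of the reparameterization: the curve from expFct1 passes
# exactly through y1 at the first Area value and through yn at the last
# one for *any* value of th -- this is what makes y1 and yn stable,
# well-determined parameters.
stopifnot(isTRUE(all.equal(expFct1(cl$Area, y1, yn, 1.0)[1], y1)))
stopifnot(isTRUE(all.equal(expFct1(cl$Area, y1, yn, 0.5)[length(cl$Area)], yn)))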

nls.m1<- nls2(Retention ~ expFct1(Area, y1, yn, th), data = cl,
start = list(th = 1))

th_start=coef(nls.m1)

nls.m2<- nls2(Retention ~ expFct2(Area, y1, yn, c, d,  th), data = cl,
start = list(c=1, d=1, th = th_start))

x = cl$Area
x1 = x[1]
xn = x[length(x)]
# Convert the converged stage-2 fit back to b0, b1, th via equation (3).
th_start = coef(nls.m2)[3]
cy1 = coef(nls.m2)[1]*y1    # c*y[1]
dyn = coef(nls.m2)[2]*yn    # d*y[n]
b1_start = log(log(cy1)/log(dyn))/(x1^th_start - xn^th_start)
b0_start = log(cy1)*exp(-b1_start*x1^th_start)

th_start
b1_start
b0_start
nls.m<- nls2(Retention ~ expFct(Area, b0, b1, th), data = cl,
start = list(b0 = b0_start, b1 = b1_start, th = th_start))

nls.m1
nls.m2
nls.m

```