[R] nls & fitting

Mon May 22 09:57:26 CEST 2006

Lorenzo Isella wrote:
> Dear All,
> I may look ridiculous, but I am puzzled at the behavior of the nls with
> a fitting I am currently dealing with.
> My data are:
> 
>        x         N
> 1   346.4102 145.428256
> 2   447.2136 169.530634
> 3   570.0877 144.081627
> 4   721.1103 106.363316
> 5   894.4272 130.390552
> 6  1264.9111  36.727069
> 7  1788.8544  52.848587
> 8  2449.4897  25.128742
> 9  3464.1016   7.531766
> 10 4472.1360   8.827367
> 11 6123.7244   6.600603
> 12 8660.2540   4.083339
> 
> I would like to fit N as a function of x according to a function
> depending on 9 parameters (A1,A2,A3,mu1,mu2,mu3,myvar1,myvar2,myvar3),
> namely
> N ~
> (log(10)*A1/sqrt(2*pi)/log(myvar1)*exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1))
> 
> +log(10)*A2/sqrt(2*pi)/log(myvar2)*exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2))
> 
> +log(10)*A3/sqrt(2*pi)/log(myvar3)*exp(-((log(x/mu3))^2)/2/log(myvar3)/log(myvar3)))
> 
> (i.e. N is to be seen as a sum of three "bells" whose parameters I need
> to determine).
> 
> 
> So I tried:
> out<-nls(N ~ 
> (log(10)*A1/sqrt(2*pi)/log(myvar1)*exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1))
> 
> +log(10)*A2/sqrt(2*pi)/log(myvar2)*exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2))
> 
> +log(10)*A3/sqrt(2*pi)/log(myvar3)*exp(-((log(x/mu3))^2)/2/log(myvar3)/log(myvar3)))
>  ,start=list(A1 = 85,
> A2=23,A3=4,mu1=430,mu2=1670,mu3=4900,myvar1=1.59,myvar2=1.5,myvar3=1.5  )
> ,algorithm = "port"
> ,control=list(maxiter=20000,tol=10000)
> ,lower=c(A1=0.1,A2=0.1,A3=0.1,mu1=0.1,mu2=0.1,mu3=0.1,myvar1=0.1,myvar2=0.1,myvar3=0.1)
> )
> 
> getting the error message:
> Error in nls(N ~ (log(10) * A1/sqrt(2 * pi)/log(myvar1) *
> exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1)) +  :
>         Convergence failure: singular convergence (7)
> 
> 
> I tried to adjust tol & maxiter, but unsuccessfully.
> If I try fitting N with only two "bells", then nls works:
> 
> out<-nls(N ~ 
> (log(10)*A1/sqrt(2*pi)/log(myvar1)*exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1))
> 
> +log(10)*A2/sqrt(2*pi)/log(myvar2)*exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2))
>         )
>  ,start=list(A1 = 85, A2=23,mu1=430,mu2=1670,myvar1=1.59,myvar2=1.5  )
> ,algorithm = "port"
> ,control=list(maxiter=20000,tol=10000)
> ,lower=c(A1=0.1,A2=0.1,mu1=0.1,mu2=0.1,myvar1=0.1,myvar2=0.1)
> )
> 
>  out
> Nonlinear regression model
>   model:  N ~ (log(10) * A1/sqrt(2 * pi)/log(myvar1) *
> exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1)) +      log(10) *
> A2/sqrt(2 * pi)/log(myvar2) *
> exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2)))
>    data:  parent.frame()
>         A1         A2        mu1        mu2     myvar1     myvar2
>  84.920085  40.889968 409.656404 933.081936   1.811560   2.389215
>  residual sum-of-squares:  2394.876
> 
> Any idea about how to get nls working with the whole model?
> I had better luck with the nls.lm package, but it does not allow to
> introduce any constrain on my fitting parameters.
> I was also suggested to try other packages like optim to do the same
> fitting, but I am a bit unsure about how to set up the problem.
> Any suggestions? BTW, I am working with R Version 2.2.1
> 
> Lorenzo
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

apart from the fact that fitting 9 parameters to 12 points quite 
genereally will not yield satisfactory results (at least estimates will 
have huge uncertainties), total failure in your case seems obviouus if 
you plot your data: it's not even obvious where the three peaks (means 
of the gaussians) should be: all below x=2000 or is there one peak at 
about x=4500 and one of the 'peaks' below x=2000 is spurious? if you 
can't decide, nls has problems. moreover: how should reliable estimates 
of the standard deviations (width) of the gaussian result if the peaks 
essentially consist of exactly one point?

in short: I believe, you either  need much more data points or you must 
put in substantial a priori information (e.g. either means or standard 
deviations of the gaussians).