[R] Non-linear curve fitting (nls): starting point and quality of fit
Ben Bolker
bbolker at gmail.com
Mon Jun 4 21:31:13 CEST 2012
Nerak <nerak.t <at> hotmail.com> writes:
>
> Hi all,
>
> Like a lot of people, I noticed that I get different results when I use nls
> in R compared to the exponential fit in Excel. A bit annoying, because often
> the R^2 is higher in Excel, but from reading the different topics on this
> forum I understand that using R is better than Excel?
>
> (I don't really understand how the difference occurs, but I understand that
> the fitting is done differently: in Excel a single value can make the
> difference, while R looks at the whole function? I read this: "Fitting a
> function is an approximation, trying to find a minimum. Think of a frozen
> mountain lake surrounded by mountains. Excel's Solver will report the
> highest tip of a snowflake on the lake, if it finds it. nls will find out
> that the lake is essentially flat compared to the surroundings and tell you
> this fact in unkind words." )
Snarky, but I like it.
Two alternatives to nls are (1) Gabor Grothendieck's nls2 package:
nls2 is an R package that adds the "brute-force" algorithm and
multiple starting values to the R nls function. nls2 is free
software licensed under the GPL and available from CRAN. It
provides a function, nls2, which is a superset of the R nls
function which it, in turn, calls.
Or John Nash's nlmrt package https://r-forge.r-project.org/R/?group_id=395 :
nlmrt provides tools for working with nonlinear least squares
problems using a calling structure similar to, but much
simpler than, that of the nls() function. Moreover, where
nls() specifically does NOT deal with small or zero residual
problems, nlmrt is quite happy to solve them. It also attempts
to be more robust in finding solutions, thereby avoiding
singular gradient messages that arise in the Gauss-Newton
method within nls(). The Marquardt-Nash approach in nlmrt
generally works more reliably to get a solution, though this
may be one of a set of possibilities, and may also be
statistically unsatisfactory.
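As a minimal sketch of the nls2 brute-force approach described above: the data below are simulated and the grid bounds are made up for illustration; the idea is to score many candidate starts and then refine the best one with plain nls().

```r
## Hedged sketch: brute-force start-value search with the nls2 package.
## The data and the (a, b) grid bounds here are invented for illustration.
library(nls2)

set.seed(1)
x <- 1:20
y <- 5 * exp(-0.3 * x) + rnorm(20, sd = 0.05)
dat <- data.frame(x, y)

## Each row of start_grid is a candidate start; "brute-force" just
## evaluates the residual sum of squares at every row.
start_grid <- expand.grid(a = seq(0.1, 10,  length.out = 10),
                          b = seq(0.01, 1,  length.out = 10))
fit_grid <- nls2(y ~ a * exp(-b * x), data = dat,
                 start = start_grid, algorithm = "brute-force")

## Refine the best grid point with ordinary nls()
fit <- nls(y ~ a * exp(-b * x), data = dat,
           start = as.list(coef(fit_grid)))
coef(fit)
```

This two-step pattern (coarse grid, then local refinement) is one common way to sidestep the starting-value sensitivity discussed in this thread.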
> I have several questions about nls:
>
> 1. The nls method doesn't give an R^2. But I want to determine the quality
> of the fit. To understand how to use nls I read "Technical note: Curve
> fitting with the R environment for Statistical Computing". In that document
> they suggested this to calculate R^2:
>
> RSS.p <- sum(residuals(fit)^2)
> TSS <- sum((y - mean(y))^2)
> r.squared <- 1 - (RSS.p / TSS)
> LIST.rsq <- r.squared
>
> (with fit being my nls result: formula y ~ exp.f(x, a, b), where
> exp.f(x, a, b) is a*exp(-b*x))
>
> While I was reading on the internet to find a possible reason why I get
> different results using R and excel, I also read lots of different things
> about the "R^2 problem" in nls.
>
> Is the method I'm using now ok, or would someone suggest using something
> else?
You could use the residual sum of squares as the quality of the fit:
(i.e. RSS.p above). If you want a _unitless_ metric of the quality
of the fit, I'm not sure what you should do.
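To make the suggestion concrete, here is a hedged sketch on simulated data (the model y = a*exp(-b*x) follows the thread): it computes the residual sum of squares, the residual standard error (an absolute measure, in the units of y), and the pseudo-R^2 from the technical note the poster cites, which should be interpreted with care for nonlinear models.

```r
## Sketch: goodness-of-fit measures for an nls fit (simulated data).
set.seed(1)
x <- 1:20
y <- 5 * exp(-0.3 * x) + rnorm(20, sd = 0.05)

fit <- nls(y ~ a * exp(-b * x), start = list(a = 2, b = 0.2))

RSS <- sum(residuals(fit)^2)      # residual sum of squares
sigma_hat <- summary(fit)$sigma   # residual standard error, same units as y

## The "pseudo-R^2" from the technical note; not a true R^2 for nls:
TSS <- sum((y - mean(y))^2)
pseudo_r2 <- 1 - RSS / TSS
```

The residual standard error answers "how far off is the model, typically?" in the original units, which is often more useful than a unitless ratio.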
> 2. Another question I have, like a lot of people, is about the singular
> gradient problem. I didn't know the best way to choose the starting values
> for my coefficients. When a value was too low, I got the singular gradient
> error. Raising the value got rid of it, and changing it further didn't
> change my coefficients or R^2. Is it ok just to raise the starting value of
> one of my coefficients?
[snip]
If you can find a set of starting coefficients that give you
a sensible fit to the data without any convergence warnings, you
shouldn't worry that other sets of starting coefficients that
*don't* work also exist.
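One cheap sanity check of that advice is to confirm that two different working start sets converge to the same estimates; a sketch on simulated data:

```r
## Sketch: the fitted coefficients should not depend on which
## (working) starting values were used. Simulated data for illustration.
set.seed(1)
x <- 1:20
y <- 5 * exp(-0.3 * x) + rnorm(20, sd = 0.05)

fit1 <- nls(y ~ a * exp(-b * x), start = list(a = 2,  b = 0.1))
fit2 <- nls(y ~ a * exp(-b * x), start = list(a = 10, b = 0.5))

## TRUE (up to tolerance) if both starts reach the same optimum
all.equal(coef(fit1), coef(fit2), tolerance = 1e-4)
```

If two starts converge without warnings but give materially different coefficients, that would be a sign of multiple local minima and worth investigating further.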