[R] Non-linear curve fitting (nls): starting point and quality of fit
Nerak
nerak.t at hotmail.com
Mon Jun 4 14:19:01 CEST 2012
Hi all,
Like a lot of people, I've noticed that I get different results when I use nls
in R compared to the exponential fit in Excel. It's a bit annoying because the
R^2 is often higher in Excel, but from reading the different topics on this
forum I gather that using R is better than Excel?
(I don't really understand how the difference arises, but I gather that the
fitting works differently: in Excel a single value can make the difference,
while R looks at the whole function? I read this: "Fitting a
function is an approximation, trying to find a minimum. Think of a frozen
mountain lake surrounded by mountains. Excel's Solver will report the
highest tip of the snowflake on the lake, if it finds it. nls will find out
that the lake is essentially flat compared to the surroundings and tell you
this fact in unkind words." )
I have several questions about nls:
1. The nls method doesn't report an R^2, but I want to determine the quality
of the fit. To learn how to use nls I read "Technical note: Curve
fitting with the R environment for Statistical Computing", and that document
suggests calculating R^2 like this:
RSS.p <- sum(residuals(fit)^2)
TSS <- sum((y - mean(y))^2)
r.squared <- 1 - (RSS.p / TSS)
LIST.rsq <- r.squared
(where fit is my nls result, with formula y ~ exp.f(x, a, b) and exp.f
defined as a*exp(-b*x))
While reading around to find a possible reason why I get different results in
R and Excel, I also came across a lot of discussion of the "R^2 problem" for
nls.
Is the method I'm using now OK, or would you suggest something else?
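To make this concrete, here is a stripped-down version of what I'm doing, but
with made-up data and starting values rather than my real ones:

# made-up data following y = a*exp(-b*x); multiplicative noise keeps y positive
set.seed(42)
x <- seq(0, 10, length.out = 50)
y <- 5 * exp(-0.4 * x) * exp(rnorm(length(x), sd = 0.1))

# fit the exponential decay directly with nls
fit <- nls(y ~ a * exp(-b * x), start = list(a = 4, b = 0.3))

# the pseudo-R^2 from the technical note, as above
RSS.p <- sum(residuals(fit)^2)
TSS <- sum((y - mean(y))^2)
r.squared <- 1 - RSS.p / TSS
r.squared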
2. Another question, which a lot of people seem to share, concerns the
singular gradient problem. I didn't know the best way to choose starting
values for my coefficients. When one was too low I got the singular gradient
error; raising the value got rid of the error, and changing it further didn't
change my coefficients or the R^2. Is it OK simply to raise the starting value
of one of my coefficients like that?
The only things that change are the "Achieved convergence tolerance" and the
number of iterations to convergence; the p-values, residual standard error and
coefficients always come out exactly the same. What does the achieved
convergence tolerance actually mean, and what are its implications? (I suppose
the time to compute it changes.)
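Related to this: would something like the following be a more sensible way to
pick starting values, using a linear fit on the log scale? And is convInfo the
right place to look for the convergence information (if I'm reading the fitted
object right)? This continues the made-up example above; the control values
are just the nls defaults written out.

# log(y) = log(a) - b*x when y = a*exp(-b*x), so a linear fit on the
# log scale gives rough starting values (only works while all y > 0)
lin <- lm(log(y) ~ x)
start.a <- exp(unname(coef(lin)[1]))
start.b <- -unname(coef(lin)[2])

fit2 <- nls(y ~ a * exp(-b * x),
            start = list(a = start.a, b = start.b),
            trace = TRUE,                            # print each iteration
            control = nls.control(maxiter = 50, tol = 1e-05))

fit2$convInfo   # finIter = iterations used, finTol = achieved convergence tolerance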
(The most useful information I found about nls and the singular gradient error
is this, and it's why I started playing with the starting
values:
" if the estimate of the rank that results is less than the number of
columns in the gradient (the number of nonlinear parameters), or less than
the number of rows (the number of observations), nls stops.")
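If I understand that quote correctly, the extreme case is easy to reproduce:
with a starting value of a = 0, the gradient column for b (which is
-a*x*exp(-b*x)) is identically zero, so the gradient matrix cannot have full
column rank and nls stops right away. For example, still with the made-up data:

# with a = 0 the partial derivative w.r.t. b is zero everywhere, so the
# initial gradient matrix is rank-deficient and nls() stops with (I think)
# "singular gradient matrix at initial parameter estimates"
try(nls(y ~ a * exp(-b * x), start = list(a = 0, b = 0.1)))

Is that the kind of rank deficiency the quote is talking about?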
I hope someone can help me with these questions. I would like to understand
what's happening rather than just accept the results I get now :).
Kind regards,
Nerak
--
View this message in context: http://r.789695.n4.nabble.com/Non-linear-curve-fitting-nls-starting-point-and-quality-of-fit-tp4632295.html
Sent from the R help mailing list archive at Nabble.com.