[R] Complicated nls formula giving singular gradient message
Phil Spector
spector at stat.berkeley.edu
Tue Dec 14 00:38:26 CET 2010
Jared -
Actually I didn't realize that nls would accept a formula
until I tried my simple example in reaction to your post :-)
I don't recall nls() ever reporting the cause of the singular
gradient as being bad starting values -- I know I spend a lot
of time in my lectures on non-linear regression emphasizing that
bad starting values are the usual culprit when the dreaded
"singular gradient" message rears its ugly head.
I think your function has a fairly large "flat" region, wherein
changes in some of the parameters don't really effect the residual
sums of squares that much. I think you can visualize it like this:
> therss = function(NS,LogKi,BMax)sum((dat$Y - myfunc(NS,LogKi,BMax))^2)
> tst = expand.grid(NS=seq(.007,.009,by=.0005),
+ LogKi=seq(-9.5,-8.5,by=.05),
+ BMax =seq(1.8e05,2.8e05,by=.1e05))
> tst$rss = apply(tst,1,function(x)therss(x[1],x[2],x[3]))
> library(lattice)
> wireframe(rss~BMax+NS|factor(LogKi),data=tst,as.table=TRUE)
If you look at the panels where LogKi is around -8.9 (the reported
maximum), the residual-sums-of-squares surface is pretty flat.
I think you can also see regions where there isn't much change in
the residual sums of squares in this plot:
> wireframe(rss~LogKi+BMax|factor(NS),data=tst,as.table=TRUE)
I also ran your data through proc nlp in sas (I know there are a lot of
SAS-bashers on this list, but I worked there many years ago and I know
the quality of their software), and got the following results:
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Estimate Function
1 NS 0.006766 -0.121333
2 LogKi -8.966402 -0.000509
3 BMax 237013 1.109368E-11
The message that nlp reported was
NOTE: At least one element of the (projected) gradient is greater than 1e-3.
Finally, I ran the the same model and data using nlfit in matlab, with all
values set to their defaults. It reported the following without warning:
ans =
1.0e+05 *
0.000000086522054 -0.000089870065555 2.371354822440646
which agrees almost exactly with R.
Hope this helps.
- Phil
On Mon, 13 Dec 2010, Jared Blashka wrote:
> Phil,
> This is great! I had no idea nls would accept functions in the formula
> position. My apologies for not including data to reproduce my issue.
>
> dat<-data.frame(X=c(-13.0,-11.0,-10.0,-9.5,-9.0,-8.5,-8.0,-7.5,-7.0,-6.5,-6.
> 0,-5.0,
> -13.0,-11.0,-10.0,-9.5,-9.0,-8.5,-8.0,-7.5,-7.0,-6.5,-6.0,-5.0),
> Y=c(3146,3321,2773,2415,2183,1091,514,191,109,65,54,50,
> 3288,3243,2826,2532,2060,896,517,275,164,106,202,53))
>
> With your suggestion, I've changed the formula in nls to the following
> function:
>
> myfunc<-function(NS,LogKi,BMax)with(dat,{
> KdCPM = KdnM*SpAct*Vol*1000
> R<-NS+1
> S<-(1+10^(X-LogKi))*KdCPM+Hot
> a<-(-1*R)
> b<-R*S+NS*Hot+BMax
> c<--1*Hot*(S*NS+BMax)
> (-1*b+sqrt(b*b-4*a*c))/(2*a)
> })
>
> But to get it to compute without errors, I also had to increase the tolerance
> level: the step factor keeps being reduced below the min factor. Looking at
> the trace of the nls though, I don't see any changes after the 10th iteration
> or so; would increasing the tolerance cause any issue that I'm not thinking
> of?
>
> KdnM <- .8687
> SpAct <- 4884
> Vol <- .125
> Hot <- 10191.0
> nls(Y~myfunc(NS,LogKi,BMax),data=dat,start=list(NS=.01,LogKi=-7,BMax=10*max(
> dat['Y'])),control=nls.control(warnOnly=TRUE,minFactor=1e-5),trace=TRUE)
>
> Also, I've found that if the start value I provide for BMax is too inaccurate
> (ex. max(dat['Y']), nls generates the 'singular gradient' error message,
> which isn't something I'm used to. Usually nls is kind enough to inform me
> that the initial parameter estimates are what caused the problem. Has the
> error message changed in a recent update, or is this a different error
> message than what I'm thinking about?
>
> Thanks again for all your help,
> Jared
>
> On Mon, Dec 13, 2010 at 1:23 PM, Phil Spector <spector at stat.berkeley.edu>
> wrote:
> Jared -
> nls will happily accept a function on the right hand side
> of the ~ -- you don't have to write out the formula in such
> detail.
> What you provided isn't reproducible because you didn't provide
> data, and it's not clear what "Y" in the formula
> represents. Let me provide you with an admittedly simpler
> reproducible example.
>
> Suppose we want to estimate the model
>
> response = v * dose / (k + dose)
>
> where response and dose are variables in a data frame called
> "dat",
> and v and k are the parameters to be estimated.
>
> Here's the data:
>
> dat =
> data.frame(dose=c(0.027,0.044,0.073,0.102,0.175,0.257,0.483,0.670),
>
> + response=c(12.7,16.0,20.4,22.3,26.0,28.2,29.6,31.4))
>
> Normally we would fit such a simple model as
>
> nls(response ~ v*dose / (k +
> dose),data=dat,start=list(v=30,k=.05))
>
> Nonlinear regression model
> model: response ~ v * dose/(k + dose)
> data: dat
> v k 32.94965 0.04568
> residual sum-of-squares: 1.091
>
> Number of iterations to convergence: 4 Achieved convergence
> tolerance: 8.839e-07
>
> Alternatively, I can write a function like this:
>
> myfunc = function(v,k)with(dat,v * dose / (k + dose))
>
>
> and use the following call to nls:
>
> nls(response ~
> myfunc(v,k),data=dat,start=list(v=30,k=.05))
>
> Nonlinear regression model
> model: response ~ myfunc(v, k)
> data: dat
> v k 32.94965 0.04568
> residual sum-of-squares: 1.091
>
> Number of iterations to convergence: 4 Achieved convergence
> tolerance: 8.839e-07
>
> which gets the identical results.
>
> Admittedly this function is trivial, but perhaps in your case
> you could use the segments from prism to construct a function
> for the right-hand side of your nls call.
>
> Hope this helps.
> - Phil Spector
> Statistical Computing
> Facility
> Department of Statistics
> UC Berkeley
> spector at stat.berkeley.edu
>
>
>
>
>
>
> On Mon, 13 Dec 2010, Jared Blashka wrote:
>
> I'm attempting to calculate a regression in R that I normally use
> Prism for,
> because the formula isn't pretty by any means.
>
> Prism presents the formula (which is in the Prism equation
> library as
> Heterologous competition with depletion, if anyone is curious) in
> these
> segments:
>
> KdCPM = KdnM*SpAct*Vol*1000
> R=NS+1
> S=(1+10^(X-LogKi))*KdCPM+Hot
> a=-1*R
> b=R*S+NS*Hot+BMax
> c = -1*Hot*(S*MS+BMax)
> Y = (-1*b+sqrt(b*b-4*a*c))/(2*a)
>
> I'm only trying to solve for NS, LogKi, and BMax. I have
> everything else
> (KdnM, SpAct, Vol, Hot).
>
> I would use the simple formula at the bottom and then backsolve
> for the
> terms I'm looking for, but the simple formula at the bottom takes
> out the X
> term, which is contained within S, which it itself contained in
> both b and
> c.
> So I tried to substitute all the terms back into Y and got the
> following
>
> formula<-as.formula("Y ~
> (-1*(((NS+1)*((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot))+NS*Hot+BMax)+sqrt
> ((((NS+1)*((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot))+NS*Hot+BMax)*(((NS+1
> )*((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot))+NS*Hot+BMax)-4*(-1*(NS+1))*(
> -1*Hot*(((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot)*NS+BMax))))/(2*-1*(NS+1
> ))")
>
> But trying to use that formula gives me the single gradient
> message, which I
> wasn't entirely surprised by.
> fit<-nls(formula=formula,data=data,start=list(NS=.01,LogKi=-7,BMax=33000))
> Error in nls(formula = formula, data = all_no_outliers, start =
> list(NS =
> 0.01, :
> singular gradient
>
> I've never worked with a formula this complicated in R. Is it
> even possible
> to do something like this? Any ideas or points in the right
> direction would
> be greatly appreciated.
>
> Thanks,
> Jared
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible
> code.
>
>
>
>
More information about the R-help
mailing list