[R] Non-linear Regression best-fit line
peter dalgaard
pdalgd at gmail.com
Sat Jun 18 11:05:53 CEST 2011
On Jun 17, 2011, at 23:14 , Sean Bignami wrote:
> I am trying to fit a curve to a cumulative mortality curve (logistic) where y is the cumulative proportion of mortalities, and t is the time in hours (see below). Asym. at 0 and 1
>> y
> [1] 0.00000000 0.04853859 0.08303777 0.15201970 0.40995074 0.46444992 0.62862069 0.95885057 1.00000000
> [10] 1.00000000 1.00000000
>> t
> [1] 0 13 20 24 37 42 48 61 72 86 90
>
> I tried to find starting values for a Gompertz non-linear regression by converting the equation (y~1*exp(-beta*exp(-gamma*t)) to a linear form per Dalgaard "Introductory Statistics with R" pg.279-280. But got this Error message:
>
>> lm(log(0-log(y))~t)
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
> NA/NaN/Inf in foreign function call (arg 4)
>
> I tried to change all by 0 and 1 values to non-zero and non-one values (yy and tt below), and was able to get starting estimates.
>
>> yy<-c(0.000000001,0.04853859, 0.08303777, 0.15201970, 0.40995074, 0.46444992, 0.62862069, 0.95885057, 0.9999999999,0.999999999999,0.999999999999)
>> tt<-c(0.0000000001,13,20,24,37,42,48,61,72,86,90)
>
>> lm(log(0-log(yy))~tt)
>
> Call:
> lm(formula = log(0 - log(yy)) ~ tt)
>
> Coefficients:
> (Intercept) tt
> 9.5029 -0.3681
>
> However, when I plug those values into the nls function, I get an error message about the "getInitial" method
>
>> nlsout<-nls(y~1*exp(-beta*exp(-gamma*t),start=c(beta=exp(9.5),gamma=.368)));summary(nlsout)
> Error in getInitial.default(func, data, mCall = as.list(match.call(func, :
> no 'getInitial' method found for "function" objects
Not really, but getting your parentheses right on the nls call should at least get you closer. There are 3 "(" before "start" but only 1 ")", so the "start" bit is part of the formula, etc.
Trying that, I got
> nlsout<-nls(yy~1*exp(-beta*exp(-gamma*tt)),start=list(beta=exp(9.5),gamma=.368))
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
so you still have some way to go. (I've been using the yy/tt variables because they were easier to paste in from your post. Shouldn't matter much.)
Offhand, I'd say that your procedure for finding starting values is still too sensitive to the zeros and ones. If you do
plot(tt, log(-log(yy)) )
you will see that the last three points (the 1s in your original data) fall way off the linear trend. So how about omitting them rather than putting in an arbitrary value? And get rid of the zero too.
> lm(log(-log(yy))~tt,subset=2:8)
Call:
lm(formula = log(-log(yy)) ~ tt, subset = 2:8)
Coefficients:
(Intercept) tt
2.5870 -0.0807
> nlsout<-nls(yy~1*exp(-beta*exp(-gamma*tt)),start=list(beta=exp(2.5),gamma=.08))
> summary(nlsout)
Formula: yy ~ 1 * exp(-beta * exp(-gamma * tt))
Parameters:
Estimate Std. Error t value Pr(>|t|)
beta 13.14634 4.61767 2.85 0.019 *
gamma 0.07284 0.00869 8.38 1.5e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.0568 on 9 degrees of freedom
Number of iterations to convergence: 10
Achieved convergence tolerance: 4.29e-06
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
More information about the R-help
mailing list