[R] Automatization of non-linear regression
Douglas Bates
bates at stat.wisc.edu
Thu Oct 22 23:43:24 CEST 2009
Alternatively you could install the NRAIA package and collapse the
construction of the plot to a call to plotfit as shown in the
enclosed.
Note that this is a poor fit of a logistic growth model. There are no
data values beyond the inflection point so the Asym parameter (the
asymptote) is poorly determined and the asymptote and the inflection
point on the x axis (the xmid parameter) will be highly correlated.
Furthermore the increasing variability is due to differences between
plants. You can check with
xyplot(severity ~ time, data1, groups = plant, type = c("g", "b"))
so a nonlinear mixed-effects model may be more appropriate. See
Pinheiro and Bates (Springer, 2000).
2009/10/22 Peter Ehlers <ehlers at ucalgary.ca>:
> Joel,
>
> The following should answer most of your questions:
>
> time <- c(seq(0,10),seq(0,10),seq(0,10))
> plant <- c(rep(1,11),rep(2,11),rep(3,11))
> severity <- c(
> 42,51,59,64,76,93,106,125,149,171,199,
> 40,49,58,72,84,103,122,138,162,187,209,
> 41,49,57,71,89,112,146,174,218,250,288)/288
>
> # Suggestion: you don't need cbind() to make a dataframe:
>
> data1 <- data.frame(time, plant, severity)
>
> # Plot severity versus time
> # you can avoid the awkward ..$.. by using with():
>
> with(data1, plot(time, severity, type="n"))
> with(data1, text(time, severity, plant))
> title(main="Graph of severity vs time")
>
> # Now you use either the 'getInitial' approach and
> # transform your parameter estimates to your
> # preferred ones, or you can use SSlogis for your
> # model and transform the parameter estimates
> # afterwards:
>
> # Method 1:
> # --------
> ini <- getInitial(
> severity ~ SSlogis(time, alpha, xmid, scale),
> data = data1
> )
> ini <- unname(ini) ##to get rid of names
>
> # start vector in terms of alpha, beta, gamma:
> para0.st <- c(
> alpha = ini[1],
> beta = ini[2]/ini[3],
> gamma = 1/ini[3]
> )
>
> fit0 <- nls(
> severity~alpha/(1+exp(beta-gamma*time)),
> data1,
> start=para0.st,
> trace=T
> )
>
> # logistic curve on the scatter plot
> co <- coef(fit0) ##get the parameter estimates
> curve(co[1]/(1+exp(co[2]-co[3]*x)), add=TRUE)
>
> # you don't need from, to, since the plot is already
> # set up and you don't want to restrict the curve;
>
> # personally, I prefer defining the function to be plotted:
>
> f <- function(x, abc){
> alpha <- abc[1]
> beta <- abc[2]
> gamma <- abc[3]
> alpha / (1 + exp(beta - gamma * x))
> }
>
> # then you can plot the curve with:
>
> curve(f(x, coef(fit0)), add = TRUE)
>
>
> # Method 2:
> # --------
> fit2 <- nls(severity ~ SSlogis(time, Asym, xmid, scal), data = data1)
> co <- unname(coef(fit2))
> abc <- c(co[1], co[2]/co[3], 1/co[3])
> with(data1, plot(time, severity, type="n"))
> with(data1, text(time, severity, plant))
> title(main="Graph of severity vs time")
> curve(f(x, abc), add = TRUE)
>
>
> Cheers,
> -Peter Ehlers
>
> Joel Fürstenberg-Hägg wrote:
>>
>> Hi everybody,
>>
>>
>> I'm using the method described here to make a linear regression:
>>
>>
>>
>> http://www.apsnet.org/education/advancedplantpath/topics/Rmodules/Doc1/05_Nonlinear_regression.html
>>
>>
>>>
>>> ## Input the data that include the variables time, plant ID, and severity
>>> time <- c(seq(0,10),seq(0,10),seq(0,10))
>>> plant <- c(rep(1,11),rep(2,11),rep(3,11))
>>>
>>> ## Severity represents the number of
>>> ## lesions on the leaf surface, standardized
>>> ## as a proportion of the maximum
>>> severity <- c(
>>
>> + 42,51,59,64,76,93,106,125,149,171,199,
>> + 40,49,58,72,84,103,122,138,162,187,209,
>> + 41,49,57,71,89,112,146,174,218,250,288)/288
>>>
>>> data1 <- data.frame(
>>
>> + cbind(
>> + time,
>> + plant,
>> + severity
>> + )
>> + )
>>>
>>> ## Plot severity versus time ## to see the relationship between## the two
>>> variables for each plant
>>> plot(
>>
>> + data1$time,
>> + data1$severity,
>> + xlab="Time",
>> + ylab="Severity",
>> + type="n"
>> + )
>>>
>>> text(
>>
>> + data1$time,
>> + data1$severity,
>> + data1$plant
>> + )
>>>
>>> title(main="Graph of severity vs time")
>>>
>>> getInitial(
>>
>> + severity ~ SSlogis(time, alpha, xmid, scale),
>> + data = data1
>> + )
>> alpha xmid scale 2.212468 12.506960 4.572391
>>>
>>> ## Using the initial parameters above,
>>> ## fit the data with a logistic curve.
>>> para0.st <- c(
>>
>> + alpha=2.212,
>> + beta=12.507/4.572, # beta in our model is xmid/scale
>> + gamma=1/4.572 # gamma (or r) is 1/scale
>> + )
>>>
>>> fit0 <- nls(
>>
>> + severity~alpha/(1+exp(beta-gamma*time)),
>> + data1,
>> + start=para0.st,
>> + trace=T
>> + )
>> 0.1621433 : 2.2120000 2.7355643 0.2187227 0.1621427 : 2.2124095
>> 2.7352979 0.2187056
>>>
>>> ## Plot to see how the model fits the data; plot the
>>> ## logistic curve on a scatter plot
>>> plot(
>>
>> + data1$time,
>> + data1$severity,
>> + type="n"
>> + )
>>>
>>> text(
>>
>> + data1$time,
>> + data1$severity,
>> + data1$plant
>> + )
>>>
>>> title(main="Graph of severity vs time")
>>>
>>> curve(
>>
>> + 2.21/(1+exp(2.74-0.22*x)),
>> + from=time[1],
>> + to=time[11],
>> + add=TRUE
>> + )
>>
>>
>> As you can see I have to do some work manually, such as setting the
>> numbers to be used for calculation of alpha, beta and gamma. I wonder if you
>> might have an idea how to automatize this? I suppose it should be possible
>> to save the output from getInitial() and reach the elements via index or
>> something, but how?
>>
>>
>> I guess a similar approach could be used for the values of fit0?
>>
>>
>> Or even better, if the variables alpha, beta and gamma could be used right
>> away for instance in curve(), instead of adding the values manually. But
>> just exchanging the values with the varables (alpha instead of 2.21 etc)
>> doesn't seem to work. What is the reason for that? Any solution?
>>
>>
>> A last, general but somewhat related question. If I set variables in a
>> function such as para0.st <- c(alpha=2.212, ...), is it just stored locally,
>> or can it be used globally, I mean, can I use the variable anywhere (for
>> instance in curve()) or just in the function where it was created? I'm
>> asking because I'm used to Java, where the life time of local variabels only
>> extends to the closing braces, while global variables can be reached
>> everywhere.
>>
>>
>> The reason for automatization is that I'll have to repeat the procedure
>> more than a hundred times, while making overview pair waise plots of my
>> data, with both this logaritmic regression and several others (exponential,
>> monomoelcular, logistic, Gompertz and Weibull).
>>
>>
>>
>> Wish you all the best,
>>
>>
>> Joel
>>
>> _________________________________________________________________
>> Nya Windows 7 gör allt lite enklare. Hitta en dator som passar dig!
>> http://windows.microsoft.com/shop
>> [[alternative HTML version deleted]]
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-------------- next part --------------
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
- MKL libraries for acclerated math installed: see help (setMKLthreads)
- ParallelR packages installed: see help (package='foreach')
Type 'revo()' to visit www.revolution-computing.com for the latest
REvolution R news, 'forum()' for the community forum, or 'readme()'
for release notes.
> library(NRAIA)
Loading required package: lattice
> data1 <- data.frame(time = rep(0:10, 3),
+ plant = rep(1:3, each = 11),
+ severity = c(42,51,59,64,76,93,106,125,149,171,199,
+ 40,49,58,72,84,103,122,138,162,187,209,
+ 41,49,57,71,89,112,146,174,218,250,288)/288)
> summary(fit2 <- nls(severity ~ SSlogis(time, Asym, xmid, scal), data1))
Formula: severity ~ SSlogis(time, Asym, xmid, scal)
Parameters:
Estimate Std. Error t value Pr(>|t|)
Asym 2.212 1.993 1.110 0.275813
xmid 12.507 6.929 1.805 0.081100 .
scal 4.572 1.189 3.845 0.000583 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Residual standard error: 0.07352 on 30 degrees of freedom
Number of iterations to convergence: 0
Achieved convergence tolerance: 5.513e-07
> plotfit(fit2)
>
> proc.time()
user system elapsed
0.960 0.070 1.007
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplots.pdf
Type: application/pdf
Size: 7371 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091022/285d4b50/attachment-0002.pdf>
More information about the R-help
mailing list