[R] Odp: finding best fitting model

Fri Apr 3 11:28:34 CEST 2009

Hi

I adopted an idea from Chemical Engineering  where one guy used set of 
functions to check which can fit data well and used Excel for it. So I 
tried if I can do it in R.

It is very rough and needs data to be a two column data frame with x in 
the first column together with list of formulas.

You can use it like

nls.find(modely1, data)

and you can try to give it some starting values but it can arrive to 
results without them. You can also set start=T, which will use default 
method for nls. The result is list of model parameters ordered by residual 
sum of residual sqares. It does not help you to think about your data, it 
does not use any type of diagnostics. 

But you can use it with some care.

regards
Petr

Below is three set of models and the function itself.

"modely1" <- list(
y~1/(a-x),
y~x^2-a*x,
y~a^x,
y~1-a^x,
y~a^(1/x),
y~x^a,
y~1-1/(x^a),
y~a^(1-log(x)),
y~1-exp(-a*x),
y~a*x^2-log(x),
y~a/cosh(x),
y~a*cosh(x/a))

"modely2" <- list(
y~a*x+b,
y~1/(a+b*x),
y~a/(b+x),
y~a+b*log(x),
y~a*exp(b*x),
y~a*exp(b/x),
y~a*(1-exp(b*x)),
y~(a/(a-b))*(exp(-b*x)-exp(-a*x)),
y~1/(1+exp(-a*(x-b))),
y~a*x^b,
y~a*(1+x)^b,
y~(a+x)^b,
y~(a-x)^b,
y~1-1/(1+a*x)^b,
y~a*x^(b*x),
y~a*x^(b/x),
y~x/(a+b*x-(a+b)*x^2),
y~a*cos(x-b),
y~a/(sin(x-b))
)

"modely3" <- list(
y~a+b*x+c*x^2,
y~1/(a+b*x+c*x^2),
y~a+b/x+c/(x^2),
y~a+b*log(x-c),
y~a+b*log(1+c^x),
y~a+b*x^c,
y~a*(x-b)^c,
y~a*(x/b)^c,
y~a*b^x+x^c,
y~a*x^(b*x-c),
y~(a*x^c)/(1+b*x^c),
y~(a*x^c)/(b^c+x^c),
y~a+b*exp(c*x),
y~a-b*exp(-c*x),
y~a*(1-b*exp(c*x)),
y~a*(1-exp(-b*x))^c,
y~a+(b*(exp(c*x)-1))/c,
y~exp(a+b*x+c*x^2),
y~a/(1+exp(-b*(x-c))),
y~a*sin((pi*(x-b))/c)
)

nls.find <- function(formula, data, start=NULL, ...){
ll <- length(formula)
result2 <- vector("list", ll)
result1 <- rep(NA, ll)
names(data) <- c("x","y")

for(i in 1:ll) {

if (is.null(start)) {
n <- length(all.vars(formula[[i]]))-2
start <- as.list(rep(mean(data[,2], na.rm=T),n))
names(start) <- letters[1:n]
fit<-try(nls(formula[[i]], data, start=start, ...))
} else {

if (is.logical(start)) {
fit<-try(nls(formula[[i]], data, ...))
} else {
fit<-try(nls(formula[[i]], data, start, ...)) }
}

if( class(fit)=="try-error") result1[i] <- NA  else result1[i] <- 
sum(resid(fit)^2, na.rm=T)
if( class(fit)=="try-error") result2[[i]] <- NA  else result2[[i]] <- 
coef(fit)
}
ooo<-order(result1)
result <- mapply(c, "sq.resid" = result1, result2, SIMPLIFY=FALSE)
names(result) <- paste(1:ll, as.character(formula), sep="/")
result[ooo]
}

Petr Pikal
petr.pikal at precheza.cz
724008364, 581252140, 581252257

r-help-bounces at r-project.org napsal dne 02.04.2009 18:40:43:

> Hello,
> Is there a function in r to find the best fitting model for a set of 
data?
> I would like to know if my data are related exponentially,linearly or if 
there
> is a logarithmic correlation between my x and y values.
> To get a better imagination I've added the graphics at the end of this 
mail
> as an attachment.
> Thanks,  Ben
> -- 
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.