[R] Optimization in R similar to MS Excel Solver
Berend Hasselman
bhh at xs4all.nl
Sat Oct 20 11:49:00 CEST 2012
I do not know what algorithms the Excel solver function uses.
See inline for how to do what you want in R.
Forgive me if I have misinterpreted your request.
On 19-10-2012, at 16:25, Richard James wrote:
> Dear Colleagues,
> I am attempting to develop an optimization routine for a river suspended
> sediment mixing model. I have Calcium and Magnesium concentrations (%) for
> sediments from 4 potential source areas (Topsoil, Channel Banks, Roads,
> Drains) and I want to work out, based on the suspended sediment calcium and
> magnesium concentrations, what are the optimal contributions from each
> source area to river suspended sediments. The dataset is as follows:
>
> Topsoil Channel Bank
> Roads Drains
> Ca(%) 0.03 0.6
> 0.2 0.35
> Mg(%) 0.0073 0.0102
> 0.0141 0.012
> Contribution 0.25 0.25
> 0.25 0.25
>
> and
>
> Ca in river(%) 0.33
> Mg in river (%) 0.0114
>
Read data (it would have been a lot more helpful if you had provided the result of dput for the data and the target).
basedat <- read.table(text="Topsoil Channel-Bank Roads Drains
Ca(%) 0.03 0.6 0.2 0.35
Mg(%) 0.0073 0.0102 0.0141 0.012
Contribution 0.25 0.25 0.25 0.25", header=TRUE)
basedat
target <- c("Ca in river(%)"=0.33,"Mg in river (%)"=0.0114)
# convert to matrix and vector for later use
bmat <- as.matrix(basedat[1:2,])
pstart <- as.numeric(basedat["Contribution",1:3])
> I want to optimize the contribution values (currently set at 0.25) by
> minimizing the sum of squares of the relative errors of the mixing model,
> calculated as follows:
>
> SSRE =( (x1-((a1*c1)+(a2*c2)+(a3*c3)+(a4*c4))/x1)^2) +
> (x2-((b1*c1)+(b2*c2)+(b3*c3)+(b4*c4))/x2)^2)
>
I do not understand why you are dividing by x1 and x2. It make no sense to me to calculate (xi- (xi_calculated)/xi)^2
Given what your stated purpose then (xi- xi_calculated)^2 or something like (1-x1/xi_calculated)^2 seem more appropriate.
> Where:
> x1 = calcium in river;
> x2 = magnesium in river;
> a1:a4 = Ca in topsoil, channel banks, roads, drains
> b1:b4 = Mg in topsoil, channel banks, roads, drains
> c1:4 = Contribution to be optimized
>
> I can generate a solution very quickly using the MS Excel Solver function,
> however later I want to be able to run a Monte-Carlo simulation on to
> generate confidence intervals based on variance in the source area
> concentrations – hence my desire to use R to develop the mixing model.
>
> So far I have been using the ‘optim’ function, however I’m confused as to
> what form the ‘par’ and ‘fn’ arguments should take from my data above. I am
> also unsure of how to write the model constraints – i.e. total contribution
> should sum to 1, and all values must be non-negative.
>
The 'par' should be a vector of starting values of the parameters for your objective function.
The 'fn' is the function that calculates a scalar given the parameter values.
Your 'par' is a vector with all elements between 0 and 1 and with a sum == 1.
That can't be done with optim but you can simulate the requirements by letting optim work with a three element vector and defining the fourth value as
1-sum(first three params). The requirement that all params must lie between 0 and 1 can be met by making 'fn' return a large value when the requirement is not met.
Some code:
SSRE <- function(parx) {
par<- c(parx,1-sum(parx))
if(all(par > 0 & par < 1)) { # parameters meet requirements
sum((target - (bmat %*% par))^2) # this is a linear algebra version of your objective without the division by xi
# or if you want to divide by target (see above) see below in the benchmark section for comment
# sum(((target - (bmat %*% par))/target)^2)
} else 1e7 # penalty for parameters not satisfying constraints
}
SSRE(pstart)
z <- optim(pstart,SSRE)
z
c(z$par, 1-sum(z$par)) # final contributions
Results:
# > z <- optim(pstart,SSRE)
# > z
# $par
# [1] 0.1492113 0.2880078 0.2950191
#
# $value
# [1] 2.157446e-12
#
# $counts
# function gradient
# 108 NA
#
# $convergence
# [1] 0
#
# $message
# NULL
You can also try this:
z <- optim(pstart,SSRE,method="BFGS")
z
c(z$par, 1-sum(z$par))
z <- nlminb(pstart,SSRE)
z
c(z$par, 1-sum(z$par))
If you intend to do run these optimizations many times I wouldn't use optim without specifying the method argument.
See this benchmark.
> library(rbenchmark)
> benchmark(optim(pstart,SSRE),optim(pstart,SSRE,method="BFGS"), nlminb(pstart,SSRE),
+ replications=1000, columns=c("test","replications","elapsed","relative"))
test replications elapsed relative
3 nlminb(pstart, SSRE) 1000 0.829 1.264
1 optim(pstart, SSRE) 1000 1.647 2.511
2 optim(pstart, SSRE, method = "BFGS") 1000 0.656 1.000
If you use sum(((target - (bmat %*% par))/target)^2) as objective you'll see different timings. nlminb turns out to be the fastest.
good luck,
Berend
> Any help, advise, or suggestions to this problem would be very much
> appreciated.
>
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Optimization-in-R-similar-to-MS-Excel-Solver-tp4646759.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list