[R] Solving an optimization problem: selecting an "optimal" subset
Hans W Borchers
hwborchers at googlemail.com
Mon Feb 1 17:04:37 CET 2010
Dimitri Shvorob <dimitri.shvorob <at> gmail.com> writes:
>
> Given vector of numbers x, I wish to select an n-subset with sum closest
> fixed value s. Can anyone advise me how to approach this, in R?
>
> I have considered Rcplex package, which handles integer/binary
> linear/quadratic optimization problems, but have difficulty setting up
> the quadratic form for [sum(x) - s]^2.
>
> (Dynamic programming over [0, sum(x)]? A genetic algorithm? Can anyone
> contribute a binary GA optimization sample?)
>
Here is a solution completely done with R and the MILP solver Symphony
provided in the 'Rsymphony' package (but 'Rglpk' will work the same). It
finds subsets of size <= k in a set s of size n.
The "absolute deviation" from the target value z is simulated by two
additional continuous (a_p and a_m) and and one binary variable d0.
To force positivity, the common Big-M trick is employed.
Hans Werner
----
library(Rsymphony)
n <- 100; k <- 20 # Find k-subset in set of size n
z <- 2.0 # whose elements sum up to z
set.seed(8232); s <- runif(n) # reference example
M <- sum(s) + z # for the Big-M trick
# Prepare inputs for MILP solver
obj <- c(rep(0, n), 0, 1, 1, 0)
typ <- c(rep("B", n), "B", "C", "C", "B")
mat <- matrix(c( s, -z, -1, 1, 0, # a = a_p + a_m
rep(0, n), 1, 0, 0, 0, # constant term
rep(0, n), 0, 1, 0, -M, # a_p <= M * d0
rep(0, n), 0, 0, 1, M, # a_m <= M * (d0-1)
rep(1, n), 0, 0, 0, 0), # subset size <= k
nrow=5, byrow=T)
dir <- c("==", "==", "<=", "<=", "<=")
rhs <- c(0, 1, 0, M, k)
max <- FALSE
# Apply Sypmphony as our MILP solver
sol <- Rsymphony_solve_LP(obj, mat, dir, rhs, types=typ, max = max)
x <- sol$solution[1:n] # Solution vector describing k-subset
( i <- (1:n)[x==1] ) #=> [1] 18 25 44 81 95
print(sum(s[i]), digits=10) #=> [1] 2.000000205
----
More information about the R-help
mailing list