[R] A combinatorial optimization problem: finding the best permutation of a complex vector
Charles C. Berry
cberry at tajo.ucsd.edu
Thu Nov 12 20:20:27 CET 2009
On Thu, 12 Nov 2009, Ravi Varadhan wrote:
>
> Hi,
>
> I have a complex-valued vector X in C^n. Given another complex-valued
> vector Y in C^n, I want to find a permutation of Y, say, Y*, that
> minimizes ||X - Y*||, the distance between X and Y*.
>
> Note that this problem can be trivially solved for "Real" vectors, since
> real numbers possess the ordering property. Complex numbers, however, do
> not possess this property. Hence the challenge.
>
> The obvious solution is to enumerate all the permutations of Y and pick
> out the one that yields the smallest distance. This, however, is only
> feasible for small n. I would like to be able to solve this for n as
> large as 100 - 1000, in which cases the permutation approach is
> infeasible.
>
> I am looking for algorithms, possibly iterative, that can provide a
> "good" approximate solutions that are not necessarily optimal for
> high-dimensional vectors. I can do random sampling, but this can be very
> inefficient in high-dimensional problems. I am looking for efficient
> algorithms because this step has to be performed in each iteration of an
> "outer" algorithm.
>
> Are there any clever adaptive algorithms out there?
>
I do not know.
But would you settle for a not-so-clever adaptive heuristic?
If so, see below.
> Here is an example illustrating the problem:
>
> require(e1071)
>
> n <- 8
> x <- runif(n) + 1i * runif(n)
> y <- runif(n) + 1i * runif(n)
>
> dist <- function(x, y) sqrt(sum(Mod(x - y)^2))
>
> perms <- permutations(n)
> dim(perms) # [1] 40320 8
> tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
> z <- y[perms[which.min(tmp), ]] # exact solution
> dist(x, z)
>
> # an aproximate random-sampling approach
> nsamp <- 10000
> perms <- t(replicate(nsamp, sample(1:n, size=n, replace=FALSE)))
> tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
> z.app <- y[perms[which.min(tmp), ]] # approximate solution
> dist(x, z.app)
>
The heuristic is to use a stochastic greedy updates. Here is a very simple
one:
swap.samp <-
function(index) {
sub.ind <- sample(seq(along=index),2)
index[sub.ind]<- rev(sub.ind)
index
}
z.app <- y
z.cand <- y
for (i in 1:100) z.cand <-
if( dist(x,z.app) < dist(x,z.cand) ) {
z.app[swap.samp(1:8)]
} else {
z.app <- z.cand
z.cand[swap.samp(1:8)]
}
On your toy example, this usually finds the min(dist(x,z.app)) in < 100
trials.
Note that when
z.diff <- z.app != z.cand
dist(x[ z.diff ],z.app[ z.diff ])^2 - dist(x[ z.diff ],z.cand[ z.diff ])^2
equals
dist(x,z.app)^2 - dist(x,z.cand)^2
so you could vectorize the above to randomly pair up all the points (for n
even) then update the n%/%2 pairs all at once.
For large problems, you might try to preferentially sample pairs of points
with similar values of Mod ( x - z.app ) to begin with. The heuristic
being that in two pairs of points that are both distant, swapping them
might realize a bigger benefit.
--
Another version is to sample k points, randomly permute, then do a greedy
update:
sub.samp <-
function(index,n) {
sub.ind <- sample(seq(along=index),n)
index[sub.ind]<- sample(sub.ind)
index
}
# k is 4 here
for (i in 1:100) z.cand <-
if( dist(x,z.app) < dist(x,z.cand) ){
z.app[sub.samp(1:8,4)]
} else {
z.app <- z.cand
z.cand[sub.samp(1:8,4)]
}
On your toy problem, this takes in the low 100's to find the min.
I think you can do the similar vectorization trick here.
--
I suppose another variant would repeatedly sample k points, then enumerate
distances for all permutations of them, then choose the best one to
update z.app.
Again, in larger problems, one might weight those k points towards higher
values of Mod( x - z.app ) at least to begin with.
HTH,
Chuck
> Thanks for any help.
>
> Best,
> Ravi.
>
>
> ____________________________________________________________________
>
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
>
> Ph. (410) 502-2619
> email: rvaradhan at jhmi.edu
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
More information about the R-help
mailing list