[R] A combinatorial optimization problem: finding the best permutation of a complex vector

Thu Nov 12 20:20:27 CET 2009

On Thu, 12 Nov 2009, Ravi Varadhan wrote:

>
> Hi,
>
> I have a complex-valued vector X in C^n.  Given another complex-valued 
> vector Y in C^n, I want to find a permutation of Y, say, Y*, that 
> minimizes ||X - Y*||, the distance between X and Y*.
>
> Note that this problem can be trivially solved for "Real" vectors, since 
> real numbers possess the ordering property. Complex numbers, however, do 
> not possess this property.  Hence the challenge.
>
> The obvious solution is to enumerate all the permutations of Y and pick 
> out the one that yields the smallest distance.  This, however, is only 
> feasible for small n.  I would like to be able to solve this for n as 
> large as 100 - 1000, in which cases the permutation approach is 
> infeasible.
>
> I am looking for algorithms, possibly iterative, that can provide a 
> "good" approximate solutions that are not necessarily optimal for 
> high-dimensional vectors. I can do random sampling, but this can be very 
> inefficient in high-dimensional problems.  I am looking for efficient 
> algorithms because this step has to be performed in each iteration of an 
> "outer" algorithm.
>
> Are there any clever adaptive algorithms out there?
>

I do not know.

But would you settle for a not-so-clever adaptive heuristic?

If so, see below.

> Here is an example illustrating the problem:
>
> require(e1071)
>
> n <- 8
> x <- runif(n) + 1i * runif(n)
> y <- runif(n) + 1i * runif(n)
>
> dist <- function(x, y) sqrt(sum(Mod(x - y)^2))
>
> perms <- permutations(n)
> dim(perms)  # [1] 40320     8
> tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
> z <- y[perms[which.min(tmp), ]]  # exact solution
> dist(x, z)
>
> # an aproximate random-sampling approach
> nsamp <- 10000
> perms <- t(replicate(nsamp, sample(1:n, size=n, replace=FALSE)))
> tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
> z.app <- y[perms[which.min(tmp), ]]  # approximate solution
> dist(x, z.app)
>

The heuristic is to use a stochastic greedy updates. Here is a very simple 
one:

swap.samp <-
  function(index) {
         sub.ind <- sample(seq(along=index),2)
         index[sub.ind]<- rev(sub.ind)
         index
  }

z.app <- y
z.cand <- y

for (i in 1:100) z.cand <-
 	if( dist(x,z.app) < dist(x,z.cand) ) {

 		z.app[swap.samp(1:8)]

 	} else {
 	z.app <- z.cand
 	z.cand[swap.samp(1:8)]
 	}

On your toy example, this usually finds the min(dist(x,z.app))  in < 100 
trials.

Note that when

 	z.diff <- z.app != z.cand

 	dist(x[ z.diff ],z.app[ z.diff ])^2 - dist(x[ z.diff ],z.cand[ z.diff ])^2

equals

 	dist(x,z.app)^2 - dist(x,z.cand)^2

so you could vectorize the above to randomly pair up all the points (for n 
even) then update the n%/%2 pairs all at once.

For large problems, you might try to preferentially sample pairs of points 
with similar values of Mod ( x - z.app ) to begin with. The heuristic 
being that in two pairs of points that are both distant, swapping them 
might realize a bigger benefit.

--

Another version is to sample k points, randomly permute, then do a greedy 
update:

sub.samp <-
  function(index,n) {
 	sub.ind <- sample(seq(along=index),n)
 	index[sub.ind]<- sample(sub.ind)
 	index
}

# k is 4 here

for (i in 1:100) z.cand <-
 	if( dist(x,z.app) < dist(x,z.cand) ){
 		z.app[sub.samp(1:8,4)]
 	} else {
 		z.app <- z.cand
 		z.cand[sub.samp(1:8,4)]
 	}

On your toy problem, this takes in the low 100's to find the min.

I think you can do the similar vectorization trick here.
--

I suppose another variant would repeatedly sample k points, then enumerate 
distances for all permutations of them, then choose the best one to 
update z.app.

Again, in larger problems, one might weight those k points towards higher 
values of Mod( x - z.app ) at least to begin with.

HTH,

Chuck

> Thanks for any help.
>
> Best,
> Ravi.
>
>
> ____________________________________________________________________
>
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
>
> Ph. (410) 502-2619
> email: rvaradhan at jhmi.edu
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901