[R] permutation cycles

Tue Aug 26 17:55:10 CEST 2008

I'm replying in public to this,
since it relates to an R-help thread of April 15-16, 2008
(and am also adding back 'References:' to one of those postings
 of mine).

>>>>> Mark Segal < mark at biostat ucsf >
>>>>>     on Mon, 25 Aug 2008 11:25:05 -0700 writes:

    > Hi Martin,
    > You posted the following code for obtaining cycles of a permutation:

	## MM: Peter's function is so compact and elegant,
	## now try to optimize for speed() :
	cycLengths2 <- function(p) {
	     n <- length(p)
	     x <- integer(n)
	     ii <- seq_len(n)
	     for (i in ii) {
		 z <- ii[!x][1] # index of first unmarked x[] entry
		 if (is.na(z)) break
		 repeat { ## mark x[] <- i for those in i-th cycle
		     x[z] <- i
		     z <- p[z]
		     if (x[z]) break
		 }
	     }
	     ## Now, table(x) gives the cycle lengths,
	     ## where split(seq_len(n), x) would give the cycles list
	     tabulate(x, i - 1L) ## is quite a bit faster than the equivalent
	     ## table(x)
	}

    > But, while it is the case that
    > split(seq_len(n), x)
    > gives the members of the respective cycle lists, their cycle ordering
    > is destroyed: it is replaced with increasing order.
    > Do you have an efficient way of obtaining (preserved) cycle order?

Yes, indeed, the above code  (a version of an original function
by Peter Dalgaard) does not very easily lend itself to get the
cycles "in order".
Well, as a matter of fact, I've spent too much time achieving
that but the resulting solution was slower than
the function that  Barry Rowlingson had sent me back in April
which exactly solves the problem of finding the 
"cycle decomposition" of permutation vector p[]:

cycles <- function(p) {

    ## From: Barry Rowlingson < at Lancaster ac UK>
    ## To: Martin Maechler < at ETH Zurich>
    ## Subject: Re: [R] sign(<permutation>) in R ?
    ## Date: Tue, 15 Apr 2008 18:15:19 +0100

    ## with minor tweaks by MM
    cycles <- list()
    ii <- seq_along(p)
    while(!all(is.na(p))) {
        start <- ii[!is.na(p)][1]# == which(!is.na(p))[1]
        cycle <- start
        i <- start
        repeat {
            nextV <- p[i]
            p[i] <- NA
            if(nextV == start) # cycle is closed
                break
            ## else
            cycle <- c(cycle,nextV)
            i <- nextV
        }
        cycles <- c(cycles,list(cycle))
    }
    return(cycles)
}

You can check that it is working
(and also find that it's reasonably fast),
e.g., with

> p <- sample(12); rbind(seq_along(p), p)

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
     1    2    3    4    5    6    7    8    9    10    11    12
p   11    4    5    3    6    2    9   12    1     8    10     7

> str(cycles(p))
List of 2
 $ : int [1:7] 1 11 10 8 12 7 9
 $ : int [1:5] 2 4 3 5 6

---
Martin Maechler, ETH Zurich