[R] Quickly generate all possible combinations of 2 groups
Mike Lawrence
Mike.Lawrence at dal.ca
Mon Nov 9 13:42:51 CET 2009
Hi all,
I suspect the answer to this query will be the tongue-in-cheek "use a
quantum computer", but I thought my understanding might be
sufficiently limited that I'm missing a simpler option.
I'm looking for a way to cycle through all possible combinations of 2
groups of data. For 10 data points, that's 2^10 combinations; for 20
data points it's 2^20, etc. combn() from the combinat package is
useful for this purpose, as in:
########
library(combinat)
n=20 #number of data points in the data set
start=proc.time()[3] #start a timer
pb = txtProgressBar(max=n+1,style=3) #initialize a progress bar
for(i in 0:n){ #for each possible size of set1
Set1 = combn(1:n,i) #get set1 combinations
Set2 = rev(combn(1:n,n-i)) #get set2 combinations
setTxtProgressBar(pb,i+1) #increment the progress bar
}
close(pb) #close the progress bar
proc.time()[3]-start #show the time taken (about 40s @ 2.2GHz)
########
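A minimal sketch of one alternative way to walk the same 2^n assignments, assuming it is acceptable to visit them one at a time instead of building whole combn() matrices in memory. Each integer k from 0 to 2^n - 1 is read as a bit pattern, where bit j says whether point j belongs to the first group. The variable names (in_set1, sizes, count) are illustrative only, and n is kept tiny here; this does not make the search any less exponential.

```r
# Sketch: enumerate every 2-group assignment of n points, one at a time.
# Integer k encodes an assignment: bit j of k set => point j is in group 1.
n <- 4                                  # tiny on purpose; the loop runs 2^n times
powers <- 2^(0:(n - 1))                 # 1, 2, 4, 8
count <- 0
sizes <- integer(n + 1)                 # how many assignments have each group-1 size

for (k in 0:(2^n - 1)) {
  in_set1 <- (k %/% powers) %% 2 == 1   # logical membership vector of length n
  set1 <- which(in_set1)                # indices of points in group 1
  set2 <- which(!in_set1)               # the complement forms group 2
  # ... evaluate the partition (set1, set2) here ...
  sizes[length(set1) + 1] <- sizes[length(set1) + 1] + 1L
  count <- count + 1
}

count   # number of assignments visited
sizes   # tally by size of group 1, comparable to choose(n, 0:n)
```

Because only one membership vector exists at a time, memory use stays flat in the number of assignments, though run time still doubles with every added data point.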
However, this obviously ends up being too slow when the number of data
points rises much above 20 (I'll likely be dealing with data sets of
up to 200 points).
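To put rough numbers on that growth, here is a back-of-the-envelope sketch. It assumes the throughput implied by the timing quoted above (2^20 combinations in about 40 seconds) stays constant as n grows, which is optimistic; the variable names are illustrative.

```r
# Sketch: extrapolate run time from the ~40 s observed at n = 20.
n <- c(10, 20, 40, 200)
n_combinations <- 2^n                       # number of 2-group assignments
rate <- 2^20 / 40                           # assumed combinations per second
seconds <- n_combinations / rate
years <- seconds / (60 * 60 * 24 * 365)
data.frame(n, n_combinations, seconds, years)
```

Under that assumption the time is 40 * 2^(n - 20) seconds, so n = 40 already takes more than a year and n = 200 is far out of reach for any exhaustive enumeration.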
In case it's relevant, the motivation behind this problem is that I'm
seeking an alternative to EM or simplex methods for obtaining the MLE
of mixture data. Given a mixture model consisting of 2 distributions,
I should be able to obtain an MLE by finding the partitioning of the
data into 2 groups that yields the highest likelihood. I'm
specifically looking at modelling circular data by a mixture of a
uniform and a von Mises distribution centered on zero, so once I have a given
partition, I can estimate parameters of the model (proportion of
points drawn from the Von Mises and concentration of the Von Mises)
analytically and compute the likelihood of the data given that pair of
parameters. I've coded a variant of this approach that generates
random partitionings of the data, and this seems to do a decent job of
generating something that might be useful as a starting point for a
subsequent EM or simplex search, but I thought I might double check
with the list to see if there's a computationally efficient solution
to the "test all combinations" scheme.
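A minimal sketch of what scoring one partition, and the random-partitioning variant, might look like. Everything here is an assumption made for illustration, not the code referred to above: the function names (A, kappa_hat, partition_loglik), the classification-style log-likelihood, the toy data (rnorm() standing in for von Mises draws), and the use of uniroot() to solve I1(kappa)/I0(kappa) = mean(cos(theta)) for the concentration with the mean fixed at zero.

```r
# Sketch only: names, likelihood form and toy data are assumptions.

# A(kappa) = I1(kappa) / I0(kappa); scaled Bessel functions avoid overflow.
A <- function(kappa) {
  besselI(kappa, 1, expon.scaled = TRUE) / besselI(kappa, 0, expon.scaled = TRUE)
}

# ML estimate of kappa for a von Mises with mean fixed at 0:
# solve A(kappa) = mean(cos(theta)), clamped to [0, upper].
kappa_hat <- function(theta, upper = 500) {
  if (length(theta) == 0) return(0)
  r <- mean(cos(theta))
  if (r <= 0) return(0)                 # likelihood is maximised at kappa = 0
  if (r >= A(upper)) return(upper)      # cap extremely tight clusters
  uniroot(function(k) A(k) - r, c(0, upper))$root
}

# Log-likelihood of the data given a partition (in_vm: TRUE = von Mises point).
partition_loglik <- function(theta, in_vm) {
  n <- length(theta)
  m <- sum(in_vm)
  p <- m / n                            # estimated von Mises proportion
  k <- kappa_hat(theta[in_vm])
  log_i0 <- log(besselI(k, 0, expon.scaled = TRUE)) + k   # log I0(k)
  ll_vm <- sum(k * cos(theta[in_vm]) - log(2 * pi) - log_i0)
  ll_un <- -(n - m) * log(2 * pi)       # uniform density is 1 / (2 * pi)
  ll_mix <- (if (m > 0) m * log(p) else 0) +
            (if (m < n) (n - m) * log(1 - p) else 0)
  list(loglik = ll_vm + ll_un + ll_mix, p = p, kappa = k)
}

# Random search over partitions on toy data.
set.seed(1)
theta <- c(rnorm(30, 0, 0.3), runif(30, -pi, pi))
best <- list(loglik = -Inf)
for (s in 1:2000) {
  in_vm <- runif(length(theta)) < runif(1)   # random partition, random group size
  cand <- partition_loglik(theta, in_vm)
  if (cand$loglik > best$loglik) best <- cand
}
best[c("loglik", "p", "kappa")]
```

The values in best could then serve as starting values for a subsequent EM or simplex search, as described above.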
Cheers,
Mike
--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar
~ Certainty is folly... I think. ~