[Rd] combn(n, k, ...) and all its re-inventions
Martin Maechler
maechler at stat.math.ethz.ch
Tue May 9 11:10:38 CEST 2006
It seems people are reinventing the wheel here:
The goal is to generate all combinations of 1:n of size k.
This (typically) results in a matrix of size k * choose(n,k)
i.e. needs O(n ^ k) space, hence is only applicable to
relatively small k.
Then alternatives have been devised to generate the combinations
"one by one", and I think I remember there has been a
quiz/challenge about 20 years ago, at an S user's conference in
Australia(?), on this topic.
Anyway, AFAIK, the first nice and efficient function for this
has been provided by Scott Chasalow for S (not R) and he made
his S code available at the University of Wageningen as
"combinat" module. Later this became into an R package which is
now maintained by Vince Carey. The source of 'combinat' still
shows
# DATE WRITTEN: 14 April 1994 LAST REVISED: 10 July 1995
# AUTHOR: Scott Chasalow
OTOH, people have since reinvented the wheel quite prolifically:
There's combinations() in gtools {based on Bill Venables' code from R News 1/1},
combs() in CAtools, subsets() in BHH2, and nchoosek() in vsn (bioconductor);
then 'fwd.combn' in package "forward" which states explicitly
that it is Scott's combn() renamed..
I stopped searching for more, and I've made sure all
these 6 functions compute the same thing, at least in the most simple case.
After simply replacing nCm() by choose(), and some other minor
tweaks, I have now a version of combn() that is faster than all
the other implementations {only slightly faster than
combinations()}, and I plan to add this to R's standard package
'utils'.
Hopefully, the reinventing can be stopped by this, once people
can rely on a relatively fast implementation of the
functionality.
One might also consider to include a version of the ``one by
one'' combination generators {as mentioned above} which is
needed for larger k.
Opinions ?
Martin Maechler, ETH Zurich
More information about the R-devel
mailing list