[R] Generating groupings of ordered observations
Charles C. Berry
cberry at tajo.ucsd.edu
Sat Jun 21 20:30:33 CEST 2008
On Sat, 21 Jun 2008, Gavin Simpson wrote:
> Dear List,
>
> I have a problem I'm finding it difficult to make headway with.
>
> Say I have 6 ordered observations, and I want to find all combinations
> of splitting these 6 ordered observations in g groups, where g = 1, ...,
> 6. Groups can only be formed by adjacent observations, so observations 1
> and 4 can't be in a group on their own, only if 1,2,3&4 are all in the
> group.
>
Right. And in the example below there are 32 distinct patterns.
Which arises from sum( choose( 5, 0:5 ) ) different placements of 0:5
split positions.
You can represent the splits as a binary number with n-1 bits: 00000
implies no splits, 10000 implies a split between 1 and 2, 10100 implies
splits between 1 and 2 and between 3 and 4, et cetera.
So, 32 arises as 2^5, too.
Something like this:
> base10 <- seq(0, length=2^(n-1) )
> base2.bits <- outer(0:(n-2), base10, function(y,x) ( x %/% (2^y)) %%2 )
> sapply(apply( base2.bits==1, 2, which ), function(x) rep(1:(1+length(x)), diff(c(0,x,n))))
Getting this in the same column order as your example is left as an
exercise for the reader.
HTH,
Chuck
> For example, with 6 observations, the columns of the matrices below
> represent the groups that can be formed by placing the 6 ordered
> observations into 2-5 groups. Think of the columns of these matrices as
> being an indicator of group membership. We then cbind these matrices
> with the trivial partitions into 1 and 6 groups:
>
> mat2g <- matrix(c(1,1,1,1,1,
> 2,1,1,1,1,
> 2,2,1,1,1,
> 2,2,2,1,1,
> 2,2,2,2,1,
> 2,2,2,2,2),
> nrow = 6, ncol = 5, byrow = TRUE)
>
> mat3g <- matrix(c(1,1,1,1,1,1,1,1,1,1,
> 2,2,2,2,1,1,1,1,1,1,
> 3,2,2,2,2,2,2,1,1,1,
> 3,3,2,2,3,2,2,2,2,1,
> 3,3,3,2,3,3,2,3,2,2,
> 3,3,3,3,3,3,3,3,3,3),
> nrow = 6, ncol = 10, byrow = TRUE)
>
> mat4g <- matrix(c(1,1,1,1,1,1,1,1,1,1,
> 2,2,2,2,2,2,1,1,1,1,
> 3,3,3,2,2,2,2,2,2,1,
> 4,3,3,3,3,2,3,3,2,2,
> 4,4,3,4,3,3,4,3,3,3,
> 4,4,4,4,4,4,4,4,4,4),
> nrow = 6, ncol = 10, byrow = TRUE)
>
> mat5g <- matrix(c(1,1,1,1,1,
> 2,2,2,2,1,
> 3,3,3,2,2,
> 4,4,3,3,3,
> 5,4,4,4,4,
> 5,5,5,5,5),
> nrow = 6, ncol = 5, byrow = TRUE)
>
> cbind(rep(1,6), mat2g, mat3g, mat4g, mat5g, 1:6)
>
> I'd like to be able to do this automagically, for any (reasonable,
> small, say n = 10-20) number of observations, n, and for g = 1, ..., n
> groups.
>
> I can't see the pattern here or a way forward. Can anyone suggest an
> approach?
>
> Thanks in advance,
>
> Gavin
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC, UCL Geography, [f] +44 (0)20 7679 0565
> Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
More information about the R-help
mailing list