[R] function for grouping

Petr Savicky savicky at cs.cas.cz
Tue Jan 24 21:23:49 CET 2012


On Tue, Jan 24, 2012 at 07:05:37PM +0100, Petr Savicky wrote:
> On Tue, Jan 24, 2012 at 05:19:42PM +0000, yan jiao wrote:
> > Dear All,
> > 
> > I'm wondering if there is an R function that could give me all the 
> > combinations of the grouping/cluster result, given the number of groups.
> > e.g.
> > 3 objects: x1 x2 x3, number of groups is 2
> > so the result will be
> > group1: x1,x2; group2: x3
> > group1: x1; group2: x2,x3
> > group1: x1,x3; group2: x2
[...]
> For more groups, a similar approach may be used.
> Eliminating equivalent groupings may be done,
> for example, by requiring that if an element is
> assigned to group i, then each of the groups
> 1, ..., i-1 is assigned to some earlier element.
> 
> So, 1, 2, 3, 2 is OK, but 1, 3, 2, 2 is not.

For example, all groupings of 5 elements into 3 groups
may be computed as follows.

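  # A row is a canonical grouping if its labels first appear in the
  # order 1, 2, ..., k, so all k groups are used and none is relabeled.
  check.row <- function(x, k)
  {
      y <- unique(x)
      length(y) == k && all(y == 1:k)
  }
  
  # Element i may only use the labels 1, ..., min(i, 3).
  gr <- as.matrix(expand.grid(x1=1, x2=1:2, x3=1:3, x4=1:3, x5=1:3))
  ok <- apply(gr, 1, check.row, k=3)
  gr <- gr[ok, ]
  gr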
  
        x1 x2 x3 x4 x5
   [1,]  1  2  3  1  1
   [2,]  1  2  3  2  1
   [3,]  1  2  1  3  1
   [4,]  1  1  2  3  1
   [5,]  1  2  2  3  1
   [6,]  1  2  3  3  1
   [7,]  1  2  3  1  2
   [8,]  1  2  3  2  2
   [9,]  1  2  1  3  2
  [10,]  1  1  2  3  2
  [11,]  1  2  2  3  2
  [12,]  1  2  3  3  2
  [13,]  1  2  1  1  3
  [14,]  1  1  2  1  3
  [15,]  1  2  2  1  3
  [16,]  1  2  3  1  3
  [17,]  1  1  1  2  3
  [18,]  1  2  1  2  3
  [19,]  1  1  2  2  3
  [20,]  1  2  2  2  3
  [21,]  1  2  3  2  3
  [22,]  1  2  1  3  3
  [23,]  1  1  2  3  3
  [24,]  1  2  2  3  3
  [25,]  1  2  3  3  3
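
The same approach may be wrapped into a function of n and k. The following
is only a sketch under the assumptions above; the name all.groupings and its
arguments are hypothetical and are not part of any package. It enumerates
all candidate rows with expand.grid, so it is suitable only for small n.

```r
# Hypothetical helper (not from any package): all groupings of n elements
# into exactly k non-empty groups, one row per grouping.
all.groupings <- function(n, k)
{
    stopifnot(n >= 1, k >= 1, k <= n)
    # Element i may only use the labels 1, ..., min(i, k).
    ranges <- lapply(seq_len(n), function(i) seq_len(min(i, k)))
    names(ranges) <- paste0("x", seq_len(n))
    gr <- as.matrix(expand.grid(ranges))
    # Keep rows whose labels first appear in the order 1, 2, ..., k.
    ok <- apply(gr, 1, function(x) identical(as.integer(unique(x)), seq_len(k)))
    gr[ok, , drop = FALSE]
}

gr3 <- all.groupings(3, 2)
gr3
```

For n = 3 and k = 2, this should give the three groupings listed in the
original question, and all.groupings(5, 3) should have 25 rows like the
table above.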

According to 

  http://en.wikipedia.org/wiki/Partition_of_a_set

  The number of partitions of an n-element set into exactly k nonempty
  parts is the Stirling number of the second kind S(n, k).

and according to the table at

  http://en.wikipedia.org/wiki/Stirling_number_of_the_second_kind

S(5, 3) = 25, so the above table "gr" has the correct number of rows.
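
Instead of reading the value from the table, S(n, k) may also be computed
from the standard recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1) with
S(0, 0) = 1. The function below is a minimal sketch; the name stirling2 is
chosen here for illustration and does not refer to a function in base R.

```r
# Hypothetical helper: Stirling number of the second kind from the recurrence
# S(n, k) = k * S(n-1, k) + S(n-1, k-1), with S(0, 0) = 1.
stirling2 <- function(n, k)
{
    # S[i + 1, j + 1] holds S(i, j); the first row and column are index 0.
    S <- matrix(0, n + 1, k + 1)
    S[1, 1] <- 1
    for (i in seq_len(n)) {
        for (j in seq_len(min(i, k))) {
            S[i + 1, j + 1] <- j * S[i, j + 1] + S[i, j]
        }
    }
    S[n + 1, k + 1]
}

stirling2(5, 3)
```

The result may be compared with nrow(gr) as a check that no grouping is
missing or duplicated.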

Hope this helps.

Petr Savicky.


