[R] Sampling from a population

Ravi Varadhan rvaradha at jhsph.edu
Wed Nov 21 00:51:27 CET 2001


There is a recursive function called "subsets" in the book "S Programming" by Venables and Ripley (V&R) that generates all unique combinations of n objects taken k at a time.  Tim Hesterberg also has a non-recursive function called "combinations" that does the same job.  Note that the recursive function uses a lot more memory.  You can use either function to generate the exhaustive list of unique samples.
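
To make the recursive idea concrete, here is a minimal sketch. It is not the V&R "subsets" code; the name subsets_sketch and its arguments are made up for illustration. It assumes 1 <= k <= length(v) and returns one combination per row: every combination either contains the first element or it does not.

```r
# Hypothetical helper (not the function from the book): all combinations
# of the elements of v taken k at a time, one combination per row.
subsets_sketch <- function(v, k) {
  n <- length(v)
  stopifnot(k >= 1, k <= n)
  if (k == 1) return(matrix(v, ncol = 1))  # each element on its own
  if (k == n) return(matrix(v, nrow = 1))  # the whole set, once
  rbind(
    cbind(v[1], subsets_sketch(v[-1], k - 1)),  # combinations containing v[1]
    subsets_sketch(v[-1], k)                    # combinations without v[1]
  )
}

x <- c(43, 28, 7, 61, 39)
res <- subsets_sketch(x, 3)
nrow(res)  # 10, i.e. choose(5, 3)
```

Each call builds new matrices with cbind and rbind at every level of the recursion, which is one plausible reason a recursive version needs more memory.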

Hope this helps,
Ravi.

Here is the non-recursive function (courtesy: Tim Hesterberg):

combinations <- function(n, k){
  # Compute all n choose k combinations of size k from 1:n
  # Return matrix with k rows and choose(n,k) columns.
  # Avoids recursion.
  # Note: uses "<-" for assignment; current versions of R no longer
  # accept "_" as an assignment operator.
  if(!is.numeric(n) || length(n) != 1 || n%%1) stop("'n' must be an integer")
  if(!is.numeric(k) || length(k) != 1 || k%%1) stop("'k' must be an integer")
  if(k > n || k <= 0) return(numeric(0))
  rowMatrix <- function(n) structure(1:n, dim=c(1,n))
  colMatrix <- function(n) structure(1:n, dim=c(n,1))
  if(k == n) return(colMatrix(n))
  if(k == 1) return(rowMatrix(n))
  L <- vector("list", k)
  # L[[j]] will contain combinations(N, j) for N = 2:n
  L[[1]] <- rowMatrix(2)
  L[[2]] <- colMatrix(2)
  Diff <- n-k
  for(N in seq(3, n, by=1)){
    # loop over j in reverse order, to avoid overwriting
    for(j in seq(min(k, N-1), max(2, N-Diff), by= -1))
      L[[j]] <- cbind(L[[j]], rbind(L[[j-1]], N, deparse.level=0))
    # L[[1]] is still needed while N <= Diff+1; after that, free the
    # entry that no later step will read
    if(N <= Diff+1) L[[1]] <- rowMatrix(N)
    else L[[N-(Diff+1)]] <- numeric(0)
    if(N <= k) L[[N]] <- colMatrix(N)
  }
  L[[k]]
}
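
The function above returns combinations of the indices 1:n, so one more step is needed to turn them into samples of the actual values. Here is a sketch of that step for the population in the original question. To keep the example self-contained, it uses combn() from the utils package that ships with current R as a stand-in. That is an assumption about the reader's setup, not part of the function above. combn() also returns one combination per column, though the columns may come out in a different order.

```r
x <- c(43, 28, 7, 61, 39)   # population from the original question
k <- 3                      # sample size

# Index combinations: k rows, choose(length(x), k) columns
idx <- combn(length(x), k)

# Map indices to population values, keeping one sample per column
samples <- matrix(x[idx], nrow = k)

ncol(samples)  # 10, i.e. choose(5, 3)
```

Each column of samples is one sample of size 3 drawn without replacement. combn(x, k) gives the same matrix in a single call.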


    -----Original Message-----
    From: Andrew Criswell <arc at arcriswell.com>
    To: r-help at stat.math.ethz.ch <r-help at stat.math.ethz.ch>
    Date: Tuesday, November 20, 2001 12:52 AM
    Subject: [R] Sampling from a population
    
    
    Hi ALL:
    
    Suppose you have a population of N <- 5 observations, x <- c(43, 28, 7, 61, 39). From that you can draw a maximum of 10 samples without replacement of size n <- 3. (Command choose(N,n) yields 10). For instance the samples I seek are
    
        43, 61, 7
        39, 7, 28  ...etc
    
    How can I get R to do that for me, to get an exhaustive list of samples of size n drawn without replacement from a population of size N?  The command, sample(x, 3, replace=FALSE), works well for one draw. Is there a package that will handle multiple draws?
    
    Thanks and best wishes,
    ANDREW
    
        