[R] Generating unordered, with replacement, samples
Giovanni Petris
gpetris at uark.edu
Wed Sep 17 21:46:32 CEST 2014
Hi Duncan,
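You are right. The derivation consists of 'throwing' k placeholders ("*" in the example below) into the list of individuals in the population, with each placeholder standing for one more copy of the individual immediately to its left. The first position is excluded from the draw so that every placeholder has an individual to its left. For example, if the population is letters[1:6] and the sample size is 4, the following code generates one such 'sample' uniformly at random.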
> n <- 6; k <- 4
> set.seed(2)
> xxx <- rep("*", n + k)
> ind <- sort(sample(2 : (n+k), k))
> xxx[setdiff(1 : (n+k), ind)] <- letters[seq.int(n)]
> noquote(xxx)
[1] a b * c d * * e f *
This represents the sample (b, d, d, f). What I am still missing is the "all" I need to do that you mention, that is, how to transform the vector xxx into something more readily usable, like c("b", "d", "d", "f"), or even a summary of counts. I guess I am looking for a bit of R trickery here...
Thank you,
Giovanni
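A minimal sketch of one way to decode such a vector, assuming (as in the example above) that each "*" stands for one more copy of the nearest letter to its left. The names pop, is_star, idx, smp and counts are illustrative only, and xxx is typed in by hand so the snippet does not depend on the random number generator:

```r
n <- 6
pop <- letters[seq_len(n)]
xxx <- c("a", "b", "*", "c", "d", "*", "*", "e", "f", "*")

is_star <- xxx == "*"
# cumsum(!is_star) counts the letters seen so far, so at each "*" it gives
# the population index of the nearest letter to its left
idx <- cumsum(!is_star)[is_star]

# the sample as a character vector
smp <- pop[idx]
# how many times each member of the population was drawn
counts <- setNames(tabulate(idx, nbins = n), pop)
```

For the vector above, smp is c("b", "d", "d", "f"). Under the same assumption the indices can also be computed without building xxx at all, as ind - seq_len(k): the i-th sorted placeholder position has i - 1 placeholders before it, and therefore ind[i] - i letters.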
________________________________________
From: Duncan Murdoch [murdoch.duncan at gmail.com]
Sent: Wednesday, September 17, 2014 14:07
To: Giovanni Petris; r-help at R-project.org
Subject: Re: [R] Generating unordered, with replacement, samples
On 17/09/2014 2:25 PM, Giovanni Petris wrote:
> Hello,
>
> In my teaching I am trying to connect some elementary probability with Monte Carlo ideas. In sampling from a finite population, the number of distinct samples of size 'k' from a population of size 'n', when individuals are selected with replacement and the selection order does not matter, is choose(n + k - 1, k). Does anyone have a suggestion about how to simulate (uniformly!) one of these possible samples? In a Monte Carlo framework I would like to do it repeatedly, so efficiency is of some relevance.
>
> Thank you in advance!
I forget the details of the derivation of that count, but the number
suggests it is found by selecting k things without replacement from
n+k-1. The sample() function in R can easily give you a sample of k
integers from 1:(n+k-1); "all" you need to do is map those numbers into
your original sample of k from n. For that you need to remember the
derivation of that formula!
Duncan Murdoch
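One possible way to sketch the mapping described above, offered as an illustration rather than as the derivation referred to in the message: sort the k integers drawn from 1:(n+k-1) and subtract 0, 1, ..., k-1 from them. This turns a strictly increasing sequence in 1:(n+k-1) into a nondecreasing sequence in 1:n, and the correspondence is one-to-one, so a uniform draw of the former gives a uniform draw of the latter. The function name rmultiset is hypothetical and does not come from any package.

```r
# Hypothetical helper (not from the thread or any package): draw one
# unordered, with-replacement sample of size k from pop, uniformly over
# all choose(n + k - 1, k) such samples.
rmultiset <- function(pop, k) {
  n <- length(pop)
  # k distinct positions out of n + k - 1, in increasing order
  s <- sort(sample.int(n + k - 1, k))
  # subtracting 0, 1, ..., k - 1 gives nondecreasing indices in 1..n
  pop[s - seq_len(k) + 1L]
}

# Usage: one sample of size 4 from letters[1:6], then its summary of counts
smp <- rmultiset(letters[1:6], 4)
table(factor(smp, levels = letters[1:6]))
```

Because the only work per draw is one call to sample.int() and one sort of k integers, repeating this inside replicate() should be cheap enough for Monte Carlo use.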