[R] Drawing a sample based on certain condition

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Tue Apr 15 13:11:47 CEST 2025


At 14:04 on 14/04/2025, Rui Barradas wrote:
> At 12:26 on 14/04/2025, Brian Smith wrote:
>> Hi,
>>
>> For my analytical work, I need to draw a sample of a given size
>> from a defined population, where population members are marked by
>> non-negative integers, such that the sum of the sample members is fixed.
>> For example,
>>
>> Population = 0:100
>> Sample_size = 10
>> Sample_Sum = 20
>>
>> Under this setup if my sample members are X1, X2, ..., X10 then I
>> should have X1+X2+...+X10 = 20
>>
>> Sample drawing scheme may be with/without replacement
>>
>> Is there any R function to achieve this? One possibility is to employ
>> a naive trial-and-error approach, but this doesn't seem practical, as it
>> would take a long time to get a final sample with the desired properties.
>>
>> Any pointer would be greatly appreciated.
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> Hello,
> 
> Use the multinomial distribution. Its drawback is that the population 
> is never used: the draws are simply non-negative integers, each between 
> 0 and Sample_Sum, whose sum is Sample_Sum.
> 
> 
> Population <- 0:100
> Sample_size <- 10
> Sample_Sum <- 20
> 
> probs <- rep(1/Sample_size, Sample_size)
> 
> # one sample
> rmultinom(1, Sample_Sum, prob = probs)
> # five samples
> rmultinom(5, Sample_Sum, prob = probs)
> 
> 
> Another way, found on StackOverflow.
> 
> 
> # user: https://stackoverflow.com/users/4996248/john-coleman
> # answer: https://stackoverflow.com/a/49016614/8245406
> rand.nums <- function(a,b,n,k){
>    #finds n random integers in range a:b which sum to k
>    while(TRUE){
>      x <- sample(1:(k - n*a), n - 1, replace = TRUE) #cutpoints
>      x <- sort(x)
>      x <- c(x, k - n*a) - c(0, x)
>      if(max(x) <= b - a) return(a + x)
>    }
> }
> 
> rand.nums(0, 100, Sample_size, Sample_Sum)
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
Hello,

If you have a vector of ages and want to draw n elements from it adding 
up to a fixed value, the following code draws without replacement.

Note that it only works if the ages vector is small. combn() builds every 
combination, choose(length(age), n) columns in all, so even moderately 
large vectors will run out of memory or time.
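
To give a rough idea of why: combn(x, m) returns a matrix with choose(length(x), m) columns, so the work grows very quickly with the length of the vector. A small sketch, where the vector lengths other than 20 are arbitrary values picked for illustration and the memory figure assumes 4 bytes per integer:

```r
# Number of columns combn(x, m) would return for m = 10 and a few
# vector lengths (20 matches the example; the others are arbitrary).
vec_len <- c(20, 30, 40, 50)
m <- 10
n_cols <- choose(vec_len, m)

# Approximate size in GiB of an integer matrix with m rows and
# n_cols columns, assuming 4 bytes per element.
size_gib <- n_cols * m * 4 / 2^30

data.frame(vec_len, n_cols, size_gib = signif(size_gib, 3))
```

For a vector of length 20 this is choose(20, 10) = 184756 columns, which is fine; the larger lengths are where it stops being feasible.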



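set.seed(2025)
age_vec_size <- 20L
# example ages, drawn with replacement from 1:100
age <- sample.int(100L, age_vec_size, TRUE)
target_age_sum <- 400L
n <- 10L

# every combination of n ages, one combination per column
cmb <- combn(age, n)
# flag the combinations whose sum equals the target
i <- colSums(cmb) == target_age_sum
cmb[, i] |> t() |> head()
cmb[, i] |> t() |> tail()

# how often each age value occurs in the matching combinations
cmb[, i] |> table() |> barplot()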
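
If only one sample is needed, or the vector is too long for combn(), plain rejection sampling may be enough. The sketch below is not part of the code above: draw_fixed_sum() and its max_tries argument are hypothetical names made up for illustration. It keeps drawing n positions without replacement until the sum matches, which gives each matching set of positions the same chance, but it can be slow, or give up, when matching subsets are rare.

```r
# Hypothetical helper (an illustration, not from the code above):
# draw one subset of size n, without replacement, whose sum equals
# target, by simple rejection sampling.
draw_fixed_sum <- function(x, n, target, max_tries = 1e5) {
  for (k in seq_len(max_tries)) {
    idx <- sample.int(length(x), n)   # n distinct positions in x
    if (sum(x[idx]) == target) return(x[idx])
  }
  NULL                                # nothing found within max_tries
}

set.seed(2025)
age <- sample.int(100L, 20L, TRUE)
s <- draw_fixed_sum(age, n = 10L, target = 400L)
s
if (!is.null(s)) sum(s)
```

A NULL result only means no match turned up in max_tries attempts, not that none exists.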



Hope this helps,

Rui Barradas




