[R] sample(c(0, 1)...) vs. rbinom
Albyn Jones
jones at reed.edu
Thu May 23 17:29:55 CEST 2013
After a bit of playing around, I discovered that
sample() does something similar in other situations:
> set.seed(105021)
> sample(1:5,1,prob=c(1,1,1,1,1))
[1] 3
> set.seed(105021)
> sample(1:5,1)
[1] 2
> set.seed(105021)
> sample(1:5,5,prob=c(1,1,1,1,1))
[1] 3 4 2 1 5
> set.seed(105021)
> sample(1:5,5)
[1] 2 5 1 4 3
albyn
On 2013-05-22 22:24, peter dalgaard wrote:
> On May 23, 2013, at 07:01 , Jeff Newmiller wrote:
>
>> You seem to be building an elaborate structure for testing the
>> reproducibility of the random number generator. I suspect that rbinom
>> is calling the random number generator a different number of times
>> when you pass prob=0.5 than otherwise.
>
> Nope. It's switching 0 and 1:
>
>> set.seed(1); sample(0:1,10,replace=TRUE,prob=c(1-pp,pp));
>> set.seed(1); rbinom(10,1,pp)
> [1] 1 1 0 0 1 0 0 0 0 1
> [1] 0 0 1 1 0 1 1 1 1 0
>
> which is curious, but of course has no implication for the
> distributional properties. Curiouser, if you drop the prob= in
> sample.
>
>> set.seed(1); sample(0:1,10,replace=TRUE); set.seed(1);
>> rbinom(10,1,pp)
> [1] 0 0 1 1 0 1 1 1 1 0
> [1] 0 0 1 1 0 1 1 1 1 0
>
> However, it was never a design goal that two different random
> functions (or even two code paths within the same function) should
> give exactly the same values, even if they simulate the same
> distribution, so this is nothing more than a curiosity.
>
>
>>>
>>> Appendix A: some R code that exhibits the problem
>>> =================================================
>>>
>>> ppp <- seq(0, 1, by = 0.01)
>>>
>>> result <- do.call(rbind, lapply(ppp, function(p) {
>>> set.seed(1)
>>> sampleRes <- sample(c(0, 1), size = 1, replace = TRUE,
>>> prob=c(1-p, p))
>>>
>>> set.seed(1)
>>> rbinomRes <- rbinom(1, size = 1, prob = p)
>>>
>>> data.frame(prob = p, equivalent = all(sampleRes == rbinomRes))
>>>
>>> }))
>>>
>>> result
>>>
>>>
>>> Appendix B: the output from the R code
>>> ======================================
>>>
>>> prob equivalent
>>> 1 0.00 TRUE
>>> 2 0.01 TRUE
>>> 3 0.02 TRUE
>>> 4 0.03 TRUE
>>> 5 0.04 TRUE
>>> 6 0.05 TRUE
>>> 7 0.06 TRUE
>>> 8 0.07 TRUE
>>> 9 0.08 TRUE
>>> 10 0.09 TRUE
>>> 11 0.10 TRUE
>>> 12 0.11 TRUE
>>> 13 0.12 TRUE
>>> 14 0.13 TRUE
>>> 15 0.14 TRUE
>>> 16 0.15 TRUE
>>> 17 0.16 TRUE
>>> 18 0.17 TRUE
>>> 19 0.18 TRUE
>>> 20 0.19 TRUE
>>> 21 0.20 TRUE
>>> 22 0.21 TRUE
>>> 23 0.22 TRUE
>>> 24 0.23 TRUE
>>> 25 0.24 TRUE
>>> 26 0.25 TRUE
>>> 27 0.26 TRUE
>>> 28 0.27 TRUE
>>> 29 0.28 TRUE
>>> 30 0.29 TRUE
>>> 31 0.30 TRUE
>>> 32 0.31 TRUE
>>> 33 0.32 TRUE
>>> 34 0.33 TRUE
>>> 35 0.34 TRUE
>>> 36 0.35 TRUE
>>> 37 0.36 TRUE
>>> 38 0.37 TRUE
>>> 39 0.38 TRUE
>>> 40 0.39 TRUE
>>> 41 0.40 TRUE
>>> 42 0.41 TRUE
>>> 43 0.42 TRUE
>>> 44 0.43 TRUE
>>> 45 0.44 TRUE
>>> 46 0.45 TRUE
>>> 47 0.46 TRUE
>>> 48 0.47 TRUE
>>> 49 0.48 TRUE
>>> 50 0.49 TRUE
>>> 51 0.50 FALSE
>>> 52 0.51 TRUE
>>> 53 0.52 TRUE
>>> 54 0.53 TRUE
>>> 55 0.54 TRUE
>>> 56 0.55 TRUE
>>> 57 0.56 TRUE
>>> 58 0.57 TRUE
>>> 59 0.58 TRUE
>>> 60 0.59 TRUE
>>> 61 0.60 TRUE
>>> 62 0.61 TRUE
>>> 63 0.62 TRUE
>>> 64 0.63 TRUE
>>> 65 0.64 TRUE
>>> 66 0.65 TRUE
>>> 67 0.66 TRUE
>>> 68 0.67 TRUE
>>> 69 0.68 TRUE
>>> 70 0.69 TRUE
>>> 71 0.70 TRUE
>>> 72 0.71 TRUE
>>> 73 0.72 TRUE
>>> 74 0.73 TRUE
>>> 75 0.74 TRUE
>>> 76 0.75 TRUE
>>> 77 0.76 TRUE
>>> 78 0.77 TRUE
>>> 79 0.78 TRUE
>>> 80 0.79 TRUE
>>> 81 0.80 TRUE
>>> 82 0.81 TRUE
>>> 83 0.82 TRUE
>>> 84 0.83 TRUE
>>> 85 0.84 TRUE
>>> 86 0.85 TRUE
>>> 87 0.86 TRUE
>>> 88 0.87 TRUE
>>> 89 0.88 TRUE
>>> 90 0.89 TRUE
>>> 91 0.90 TRUE
>>> 92 0.91 TRUE
>>> 93 0.92 TRUE
>>> 94 0.93 TRUE
>>> 95 0.94 TRUE
>>> 96 0.95 TRUE
>>> 97 0.96 TRUE
>>> 98 0.97 TRUE
>>> 99 0.98 TRUE
>>> 100 0.99 TRUE
>>> 101 1.00 TRUE
>>>
>>> Appendix C: Session information
>>> ===============================
>>>
>>>> sessionInfo()
>>> R version 3.0.0 (2013-04-03)
>>> Platform: x86_64-redhat-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=C LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods
>>> base
>>>
>>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list