[Rd] issue with sample in R 3.6.0.

Tierney, Luke |uke-t|erney @end|ng |rom u|ow@@edu
Fri Mar 1 21:39:29 CET 2019


Thanks; fixed now in R-devel. Not an issue on Linux/Mac OS but is on
64-bit Windows, where sizeof(long) is 4.

Best,

luke

On Fri, 1 Mar 2019, Joseph Wood wrote:

> Hello,
>
> I think there is an issue in the sampling rejection algorithm in R 3.6.0.
>
> The do_sample2 function in src/main/unique.c still has 4.5e15 as an
> upper limit, implying that numbers greater than INT_MAX are still to
> be supported by sample in base R.
>
> Please review the examples below:
>
> set.seed(123)
> max(sample(2^31, 1e5))
> [1] 2147430096
>
> set.seed(123)
> max(sample(2^31 + 1, 1e5))
> [1] 1
>
> set.seed(123)
> max(sample(2^32, 1e5))
> [1] 1
>
> set.seed(123)
> max(sample(2^35, 1e5))
> [1] 8
>
> set.seed(123)
> max(sample(2^38, 1e5))
> [1] 64
>
> set.seed(123)
> max(sample(2^38, 1e5))
> [1] 64
>
> set.seed(123)
> max(sample(2^42, 1e5))
> [1] 1024
>
> From the above, we see that if N is greater than 2^31, then N is
> bounded by (2^(ceiling(log2(N)) – 32)).
>
> Looking at the source code to src/main/RNG.c, we have the following:
>
> static double rbits(int bits)
> {
>    int_least64_t v = 0;
>    for (int n = 0; n <= bits; n += 16) {
>                int v1 = (int) floor(unif_rand() * 65536);
>                v = 65536 * v + v1;
>    }
>    // mask out the bits in the result that are not needed
>    return (double) (v & ((1L << bits) - 1));
> }
>
> The last line has (v & ((1L << bits) - 1)) where v is declared as
> int_least64_t. If you notice, we are operating on v with the long
> integer literal 1L. I’m pretty sure this is the source of the issue.
> By changing 1L to at least a 64 bit integer, it appears that we
> correct the problem:
>
> double rbits(int bits)
> {
>    int_least64_t v = 0;
>    for (int n = 0; n <= bits; n += 16) {
>                int v1 = (int) floor(unif_rand() * 65536);
>                v = 65536 * v + v1;
>    }
>
>    int_least64_t one64 = 1L;
>    // mask out the bits in the result that are not needed
>    return (double) (v & ((one64 << bits) - 1));
> }
>
> Regards,
> Joseph Wood
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


More information about the R-devel mailing list