[R] Question about behavior of sample.kind in set.seed (R 3.6)

Elizabeth Purdom epurdom @end|ng |rom @t@t@berke|ey@edu
Fri Apr 12 02:38:55 CEST 2019


Hello,

I am trying to update a package for the upcoming release of R, and my unit tests are affected by the change to sample(). I understand that to reproduce the old sampling, I need to set sample.kind="Rounding" in RNGkind or set.seed. But I am confused by the behavior of the sample.kind argument in set.seed, as it doesn’t seem to change my results.

In particular, I wanted to understand what a call to set.seed made inside a function does to the RNG settings in the global environment. So I set up a test as follows:

###Test set.seed
f<-function(n,sample.kind){   # sample.kind is "Rounding" or "Rejection"
	cat("RNG at beginning\n")
	print(RNGkind())
	# RNGkind(sample.kind=sample.kind)
	# cat("RNG at after set\n")
	# print(RNGkind())
	set.seed(23,sample.kind=sample.kind)
	cat("RNG at after set seed\n")
	print(RNGkind())
	sample(1:400000,size=n,replace=TRUE)
}

RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)

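However, the sample.kind argument had no visible effect: RNGkind() still reports "Rejection" after the call to set.seed (despite the warning), and the two samples are identical: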
> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
Warning message:
In set.seed(23, sample.kind = sample.kind) :
 non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> all(y==y2)
[1] TRUE
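
The same check can be reduced to a few lines outside any function. This is only a sketch, assuming R >= 3.6.0 (where set.seed accepts sample.kind); the variable names are illustrative, and what it prints may depend on the R build, so no expected output is given:

```r
# Start from the new default sampler.
RNGkind(sample.kind = "Rejection")

# Ask set.seed for the old sampler; the "non-uniform 'Rounding'" warning is suppressed.
suppressWarnings(set.seed(23, sample.kind = "Rounding"))
kind_after <- RNGkind()[3]          # third element is the sample.kind in effect
x <- sample(1:400000, size = 10, replace = TRUE)

# Same seed, new sampler.
set.seed(23, sample.kind = "Rejection")
x2 <- sample(1:400000, size = 10, replace = TRUE)

cat("sample.kind after set.seed(..., 'Rounding'):", kind_after, "\n")
cat("draws identical under both settings:", identical(x, x2), "\n")
```

If the argument takes effect, I would expect kind_after to be "Rounding" and the two draws to differ.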

If I run the same test with calls to RNGkind instead, the method does change. (In the process I answered my original question: RNGkind changes the method globally, not just inside the function, which is unfortunate for what I am trying to do.)

###Test RNGkind
f<-function(n,sample.kind){   # sample.kind is "Rounding" or "Rejection"
	cat("RNG at beginning\n")
	print(RNGkind())
	RNGkind(sample.kind=sample.kind)
	cat("RNG at after set\n")
	print(RNGkind())
	set.seed(23)
	cat("RNG at after set seed\n")
	print(RNGkind())
	sample(1:400000,size=n,replace=TRUE)
}

RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)

> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rounding"        
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rounding"        
Warning message:
In RNGkind(sample.kind = sample.kind) : non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rounding"        
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rounding"        
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"       
> all(y==y2)
[1] FALSE
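
Because RNGkind changes the setting globally, one possible way to keep the change local to a function is to record the current sample.kind and restore it with on.exit. The following is a minimal sketch, assuming R >= 3.6.0 (where RNGkind() returns three elements); the helper name sample_with_kind is hypothetical, and it restores only the sample.kind, not the seed state:

```r
# Hypothetical helper: draw a sample under a given sample.kind,
# then put the caller's sample.kind back when the function exits.
sample_with_kind <- function(n, sample.kind, seed = 23) {
  old_sample_kind <- RNGkind()[3]   # sample.kind currently in effect
  # Restore on exit, even if an error occurs; restoring "Rounding" would warn.
  on.exit(suppressWarnings(RNGkind(sample.kind = old_sample_kind)), add = TRUE)

  # Switch the sampler (warns for "Rounding", so the warning is suppressed).
  suppressWarnings(RNGkind(sample.kind = sample.kind))
  set.seed(seed)
  sample(1:400000, size = n, replace = TRUE)
}

RNGkind(sample.kind = "Rejection")
y  <- sample_with_kind(1000, "Rounding")
y2 <- sample_with_kind(1000, "Rejection")
print(RNGkind())   # the global sample.kind should be unchanged by the calls
```

Note that set.seed inside the helper still resets the global seed; only the sampling method is put back.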

So clearly I should use RNGkind to change it, but what is the sample.kind argument to set.seed actually doing?

Thanks,
Elizabeth Purdom

> sessionInfo()
R version 3.6.0 alpha (2019-04-09 r76363)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] BiocManager_1.30.4 compiler_3.6.0     tools_3.6.0   

