[Rd] [R] RNG Cycle and Duplication (PR#12540)
shli at stat.wvu.edu
Thu Aug 14 23:45:09 CEST 2008
I didn't describe the problem clearly. It's about the number of distinct
values, so please ignore the cycle issue.
My tests were:
sum(duplicated(runif(1e7))); #return 46552
sum(duplicated(runif(1e7))); #return 46415
#These collision frequencies suggested there were 2^30 distinct values by
sum(duplicated(runif(1e7))); #return 11682
sum(duplicated(runif(1e7))); #return 11542
sum(duplicated(runif(1e7))); #return 11656
#These indicated there were 2^32 distinct values, which agrees with the
sum(duplicated(runif(1e7))); #return 0
#So for this method, there should be more than 2^32 distinct values.
You may not get the exact numbers, but they should be close. So how can
the above results be explained?
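
As a rough check on the counts above, a birthday-problem approximation can be used: drawing n values uniformly from N equally likely values gives about n(n-1)/(2N) coinciding pairs, which is close to sum(duplicated(...)) when collisions are rare. The sketch below assumes all N values are equally likely; the helper names expected_duplicates and implied_log2_values are illustrative and not part of R.

```r
# Birthday-problem approximation (assumes N equally likely values and
# rare collisions). Both helper names are hypothetical.
expected_duplicates <- function(n, N) n * (n - 1) / (2 * N)

# Invert the approximation: log2 of the number of distinct values
# implied by observing d duplicates among n draws.
implied_log2_values <- function(n, d) log2(n * (n - 1) / (2 * d))

n <- 1e7
round(expected_duplicates(n, 2^30))   # about 46566
round(expected_duplicates(n, 2^32))   # about 11642

implied_log2_values(n, 46552)         # close to 30
implied_log2_values(n, 11682)         # close to 32
```

Under this approximation, counts near 46500 are consistent with roughly 2^30 distinct values and counts near 11600 with roughly 2^32.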
I need to generate a large sample without any ties, and it seems to me that
"Wichmann-Hill" is the only choice right now.
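
A minimal sketch of how ties could be counted for a given generator is shown here. The function count_ties and its seed argument are hypothetical, introduced only for illustration; RNGkind, set.seed, runif and duplicated are standard R. A count of zero in one run does not guarantee that ties can never occur.

```r
# Count tied values in n uniform draws under a chosen generator.
# count_ties is a hypothetical helper, not part of R.
count_ties <- function(kind, n, seed = 1) {
  old <- RNGkind()[1]        # remember the current uniform generator
  on.exit(RNGkind(old))      # switch back on exit (the seed state is not restored)
  RNGkind(kind)
  set.seed(seed)
  sum(duplicated(runif(n)))
}

count_ties("Wichmann-Hill", 1e6)
count_ties("Mersenne-Twister", 1e6)
```

If ties are found, one option is to redraw only the duplicated entries, though that changes the sampling procedure and is a design choice rather than something taken from this thread.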
The Department of Statistics
PO Box 6330
West Virginia University
Morgantown, WV 26506-6330
On Thu, 14 Aug 2008, Peter Dalgaard wrote:
> Shengqiao Li wrote:
>> Hello all,
>> I am generating large samples of random numbers. The RNG help page says:
>> "All the supplied uniform generators return 32-bit integer values that are
>> converted to doubles, so they take at most 2^32 distinct values and long
>> runs will return duplicated values." But I find that the cycles are not the
>> same as the 32-bit integer.
>> My test indicated that the cycles for Knuth's methods were 2^30 while
>> Wichmann-Hill's cycle was larger than 2^32! No numbers were duplicated in
>> 10M numbers generated by runif using Wichmann-Hill. The other three methods
>> had cycle length of 2^32.
>> So, anybody can explain this? And any improvement to the implementation can
>> be made to increase the cycle length like the Wichmann-Hill method?
> What test? These are not simple linear congruential generators. Just because
> you get the same value twice, it doesn't mean that the sequence is repeating.
> Perhaps you should read the entire help page rather than just the note.
> O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907