[R] Random # generator accuracy
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Jul 24 12:55:21 CEST 2009
On 23/07/2009 2:48 PM, Jim Bouldin wrote:
> Thanks Greg, that most definitely was it. So apparently the default is
> sampling without replacement. Fine, but this brings up a question I've had
> for a bit now, which is, how do you know what the default settings are for
> the arguments of any given function? The HTML help files don't seem to
> indicate in many (most) cases. Thanks.
I think you are looking in the wrong place. Most help pages (including
the one for sample!) show the header of the function in the Usage section:
sample(x, size, replace = FALSE, prob = NULL)
and there the default is explicit: "replace = FALSE". Sometimes the
default is repeated in the text, and sometimes it appears only in the
text, but there are very few cases where a default is defined but not
documented, and I think those qualify as documentation errors that
should be fixed.
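For anyone who would rather check a default from the console than from the
help page, here is a minimal sketch using base R's args() and formals(). The
variable names (x, w, expected, m_repl) are only illustrative, and the data
mirror the x and weights shown in the quoted message.

```r
# Print the header of sample(), defaults included
args(sample)

# Pull out individual defaults programmatically
formals(sample)$replace   # FALSE: sampling is without replacement by default
names(formals(sample))    # argument names, in order

# Overriding the default explicitly (illustrative data from the quoted post)
x <- 1:12
w <- rep(c(1, 2), each = 6)
expected <- sum(x * w) / sum(w)   # 7.5, the weighted mean WITH replacement

set.seed(1)                       # arbitrary seed, only for reproducibility
m_repl <- mean(replicate(1e5, sample(x, 3, replace = TRUE, prob = w)))
```

With replace = TRUE the simulated mean m_repl should land close to expected.
Under the default replace = FALSE the draws within one sample are not
independent, so that simple weighted-mean calculation does not apply.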
Duncan Murdoch
>
>> Try adding replace=TRUE to your call to sample, then you will get numbers
>> closer to what you are expecting.
>>
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.snow at imail.org
>> 801.408.8111
>>
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Bouldin
>>> Sent: Thursday, July 23, 2009 12:00 PM
>>> To: r-help at r-project.org
>>> Subject: [R] Random # generator accuracy
>>>
>>>
>>> Dan Nordlund wrote:
>>>
>>> "It would be necessary to see the code for your 'brief test' before anyone
>>> could meaningfully comment on your results. But your results for a single
>>> test could have been a valid "random" result."
>>>
>>> I've re-created what I did below. The problem appears to be with the
>>> weighting process: the unweighted sample came out much closer to the
>>> actual than the weighted sample (>1% error) did. Comments?
>>> Jim
>>>
>>>> x
>>> [1] 1 2 3 4 5 6 7 8 9 10 11 12
>>>> weights
>>> [1] 1 1 1 1 1 1 2 2 2 2 2 2
>>>
>>>> a = mean(replicate(1000000,(sample(x, 3, prob = weights))));a # (1 million samples from x, of size 3, weighted by "weights"; the mean should be 7.50)
>>> [1] 7.406977
>>>> 7.406977/7.5
>>> [1] 0.987597
>>>
>>>> b = mean(replicate(1000000,(sample(x, 3))));b # (1 million samples from x, of size 3, not weighted this time; the mean should be 6.50)
>>> [1] 6.501477
>>>> 6.501477/6.5
>>> [1] 1.000227
>>>
>>>
>>> Jim Bouldin, PhD
>>> Research Ecologist
>>> Department of Plant Sciences, UC Davis
>>> Davis CA, 95616
>>> 530-554-1740
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.