[R] sample() issue

Peter Langfelder peter.langfelder at gmail.com
Mon Dec 20 20:39:08 CET 2010


On Mon, Dec 20, 2010 at 11:04 AM, cory n <corynissen at gmail.com> wrote:
>> length(sample(25000, 25000*(1-.55)))
> [1] 11249
>
>> 25000*(1-.55)
> [1] 11250
>
>> length(sample(25000, 11250))
> [1] 11250
>
>> length(sample(25000, 25000*.45))
> [1] 11250
>
> So the question is, why do I get 11249 out of the first command and not
> 11250?  I can't figure this one out.

Let me make a wild guess:

> floor(25000*(1-.55))
[1] 11249
> 25000*(1-.55)
[1] 11250
> 25000*(1-.55) - 11250
[1] -1.818989e-12
> 25000*0.45 - 11250
[1] 0


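I'm guessing that the machine representation of 0.55 is off by
something like 1e-16, so 1-.55 is not exactly 0.45. Multiplying by
25000 magnifies that error enough that the product falls just below
11250, and the floor (or whatever rounding the internal code uses)
makes it 11249. The printed value still shows 11250 because R prints
only 7 significant digits by default. The moral of the story is: do
not expect exact results when you multiply inexact numbers by large
constants; the small inaccuracies get magnified.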
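
If you want to check this yourself, or work around it, something along
these lines should do. This is only a sketch: the variable names (n,
frac, size_raw, size_ok) are made up for illustration, and I am assuming
that sample() truncates a non-integer size the same way as.integer()
does, which is consistent with the 11249 above but which I have not
checked in the source.

```r
# Illustrative names only; none of these appear in the original post.
n <- 25000
frac <- 1 - 0.55            # not exactly 0.45 in double precision

size_raw <- n * frac

# Default printing uses 7 significant digits, which hides the error.
print(size_raw)

# Asking for more digits shows a value just below 11250.
print(size_raw, digits = 17)

# Coercion to integer truncates toward zero, so one element is lost.
as.integer(size_raw)

# Rounding before coercion gives the intended count.
size_ok <- as.integer(round(size_raw))
length(sample(n, size_ok))

# Exact comparison fails; a tolerance-based comparison succeeds.
size_raw == 11250
isTRUE(all.equal(size_raw, 11250))
```

Rounding explicitly states the intent (the nearest whole number of
elements) instead of leaving it to whatever coercion the function
applies internally.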

Peter


