[R] protentially serious R error

(Ted Harding) Ted.Harding at wlandres.net
Sat Dec 1 00:26:45 CET 2012


[See in-line below]

On 30-Nov-2012 21:05:23 liuxf wrote:
> Hi guy,
> I have recently encountered a problem while  I was just trying to generate
> some random numbers with the function "rnorm", the problem is shown below:
>########case 1############
>> rnorm(20*0.2)
> [1] -1.2765922 -0.5732654 -1.2246126 -0.4734006
>########case 2###########
> *> rnorm(20*(1-0.8))
> [1] -0.62036668  0.04211587 -0.91092165*

The key to this case can be seen in:

  20*0.2 - 4
  # [1] 0

  20*(1-0.8)-4
  # [1] -8.881784e-16

  0.2 - (1-0.8)
  # [1] 5.551115e-17

So you have fallen victim to the fact that, in general, floating-point
arithmetic is not exact (and this is not a "feature" only of R, but
of other packages that use standard floating-point CPU arithmetic.

>#########case 3############
>> a<-0.2
>> rnorm(20*a)
> [1]  0.1580288 -0.6545846  1.7672873  0.7167075
>#############case 4#########
> *> b<-1-0.8
>> rnorm(20*b)
> [1] 0.9101742 0.3841854 1.6821761*
> 
> I was expecting the 4 cases should do the same job--generate 4 random
> numbers. But in case 2 and 4 I only get 3. Has anyone else seen this problem
> before? Thanks. (I have tried with other functions i.e "rchisq","rexp" ...)
>#######################################################################
> One of my colleague also have a problem that we think it might be related
> with the problem I addressed above:
> 
>> test1 <- runif(10,0,1)
>> test1
>  [1] 0.3868379 0.1587814 0.8140483 0.7796691 0.5357628 0.2431110 0.1782747
> 0.3906829 0.5262615 0.7440143
>> test2 <- NULL
>> for(i in seq(0.01,1,length=100)){
> +   test2[i*100] <- sum(test1<i) 
> + }
>> test2
>   [1]  0  0  0  0  0  0 *NA*  0  0  0  0  0  0  0  0  1  1  2  2  2  2  2  2
> 2  3  3  3  3  3  3  3  3  3  3  3  3  3  3  4  5  5  5  5  5
>  [45]  5  5  5  5  5  5  5  5  6  7  7  7  7  7  7  7  7  7  7  7  7  7  7
> 7  7  7  7  7  7  7  8  8  8  9  9  9  9 10 10 10 10 10 10 10
>  [89] 10 10 10 10 10 10 10 10 10 10 10 10
> 
> Every time he re-runs the code there always always a "NA"(highlighted). Does
> any one know why?  Your help is greatly appreciated.   
> 
> Xiaofeng

This is effectovely the same issue:

  test1 <- runif(10,0,1)
  test1
  # [1] 0.3868379 0.1587814 0.8140483 0.7796691 0.5357628 0.2431110
  # 0.1782747 0.3906829 0.5262615 0.7440143
  test2 <- NULL
  for(i in seq(0.01,1,length=100)){
    test2[i*100] <- sum(test1<i) 
  }
  test2
  #   [1]  0  0  0  0  0  0 *NA*   0  0  0  0  0  0  0  0  1  1  2  2
  #  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3  3  3  3  3  4  5  5
  #  5  5  5  5  5  5  5  5  5  5  5  6  7  7  7  7  7  7  7  7  7  7
  #  7  7  7  7  7  7  7  7  7  7  7  8  8  8  9  9  9  9 10 10 10 10
  # 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10

  S <- seq(0.01,1,length=100))
  100*S[7] - 7
  # [1] -8.881784e-16

  which(100*S < (1:100))
  # [1] 7

Why not simply use expressions which evaluate to exact integers?
There is no obvious reason in the examples given to do otherwise.

But, if you must adopt your approach, then consider:

  round(100*S)

  100*S - (1:100)
  #  [1]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
  #  [5]  0.000000e+00  8.881784e-16 -8.881784e-16  0.000000e+00
  #  [9]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
  # [13]  0.000000e+00  1.776357e-15  1.776357e-15  0.000000e+00
  # [17]  0.000000e+00  3.552714e-15  0.000000e+00  0.000000e+00
  # [etc .... ]

  round(100*S) - (1:100)
  #  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  # [26] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  # [51] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  # [76] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Hoping this helps,
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 30-Nov-2012  Time: 23:26:41
This message was sent by XFMail




More information about the R-help mailing list